76 Commits

Author SHA1 Message Date
Danny Chan
a998586396 [minor] following 4152, refactor the clazz about plan selection strategy (#6060) 2022-07-08 09:56:10 +08:00
Danny Chan
c744848c59 [HUDI-4366] Synchronous cleaning for flink bounded source (#6051) 2022-07-08 09:55:07 +08:00
e74ad324c3 [HUDI-4152] Flink offline compaction support compacting multi compaction plan at once (#5677)
* [HUDI-4152] Flink offline compaction allow compact multi compaction plan at once

* [HUDI-4152] Fix exception for duplicated uid when multi compaction plan are compacted

* [HUDI-4152] Provider UT & IT for compact multi compaction plan

* [HUDI-4152] Put multi compaction plans into one compaction plan source

* [HUDI-4152] InstantCompactionPlanSelectStrategy allow multi instant by using comma

* [HUDI-4152] Add IT for InstantCompactionPlanSelectStrategy
2022-07-07 14:11:26 +08:00
Danny Chan
7eeaff9ee0 [HUDI-4357] Support flink 1.15.x (#6050) 2022-07-06 13:42:58 +08:00
Shiyan Xu
c0e1587966 [HUDI-3730] Improve meta sync class design and hierarchies (#5854)
* [HUDI-3730] Improve meta sync class design and hierarchies (#5754)
* Implements class design proposed in RFC-55

Co-authored-by: jian.feng <fengjian428@gmial.com>
Co-authored-by: jian.feng <jian.feng@shopee.com>
2022-07-03 14:47:25 +05:30
Danny Chan
47792a3186 [HUDI-4353] Column stats data skipping for flink (#6026) 2022-07-03 08:29:31 +08:00
JerryYue-M
bdf73b2650 [HUDI-3953]Flink Hudi module should support low-level source and sink api (#5445)
Co-authored-by: jerryyue <jerryyue@didiglobal.com>
2022-07-02 08:38:46 +08:00
BruceLin
efb9719018 [HUDI-4332] The current instant may be wrong under some extreme conditions in AppendWriteFunction. (#5988) 2022-06-28 20:42:26 +08:00
吴祥平
3a1fd22841 [HUDI-4311] Fix Flink lose data on some rollback scene (#5950) 2022-06-27 16:09:44 +08:00
cxzl25
72fa19bcc9 [HUDI-4316] Support for spillable diskmap configuration when constructing HoodieMergedLogRecordScanner (#5959) 2022-06-27 11:09:30 +08:00
luokey
59978ef4a9 [HUDI-4260] Change KEYGEN_CLASS_NAME without default value (#5877)
* Change KEYGEN_CLASS_NAME without default value

Co-authored-by: 854194341@qq.com <loukey_7821>
2022-06-24 15:05:03 +08:00
Zhaojing Yu
6456bd3a51 [HUDI-4273] Support inline schedule clustering for Flink stream (#5890)
* [HUDI-4273] Support inline schedule clustering for Flink stream

* delete deprecated clustering plan strategy and add clustering ITTest
2022-06-24 11:28:06 +08:00
Danny Chan
1dbd9d407a [minor] following 4270, add unit tests for the keys lost case (#5918) 2022-06-22 16:56:06 +08:00
Bo Cui
7c4aaa9715 [HUDI-4270] Bootstrap op data loading missing (#5888) 2022-06-21 11:47:39 +08:00
Alexander Trushev
f1103281d2 [HUDI-4258] Fix when HoodieTable removes data file before the end of Flink job (#5876)
* [HUDI-4258] Fix when HoodieTable removes data file before the end of Flink job
2022-06-20 17:07:49 +08:00
luokey
7c6bedff25 [HUDI-4259] Flink create avro schema not conformance to standards (#5878)
* flink create avro schema not conformance to standards

Co-authored-by: 854194341@qq.com <loukey_7821>
2022-06-20 15:41:23 +08:00
Shizhi Chen
7481eacf23 [HUDI-4277] support flink table source with computed column (#5897)
Co-authored-by: chenshizhi <chenshizhi@bilibili.com>
2022-06-20 15:19:32 +08:00
5herhom
efafb79eeb [MINOR] Add "spillable_map_path" in FlinkCompactionConfig. To avoid the disk space of "/tmp" full when compacting offline. (#5905) 2022-06-20 15:15:23 +08:00
huberylee
d4f0326b4b [HUDI-4275] Refactor rollback inflight instant for clustering/compaction to reuse some code (#5894) 2022-06-20 14:29:21 +08:00
superche
14d8735a1c Strip extra spaces when creating new configuration (#5849)
Co-authored-by: superche <superche@tencent.com>
2022-06-13 19:10:38 +08:00
sandyfog
c82e3462e3 [MINOR] fix AvroSchemaConverter duplicate branch in 'switch' (#5813) 2022-06-13 10:55:24 +08:00
Shiyan Xu
5aaac21d1d [HUDI-4224] Fix CI issues (#5842)
- Upgrade junit to 5.7.2
- Downgrade surefire and failsafe to 2.22.2
- Fix test failures that were previously not reported
- Improve azure pipeline configs

Co-authored-by: liujinhui1994 <965147871@qq.com>
Co-authored-by: Y Ethan Guo <ethan.guoyihua@gmail.com>
2022-06-12 11:44:18 -07:00
yanenze
ba47904fa2 [HUDI-4139]improvement for flink write operator name to identify tables easily (#5744)
Co-authored-by: yanenze <yanenze@keytop.com.cn>
2022-06-09 17:48:20 -04:00
sandyfog
8ff17b0470 [MINOR] FlinkStateBackendConverter add more exception message (#5809)
* [MINOR] FlinkStateBackendConverter add more  exception message
2022-06-09 15:13:27 +08:00
HunterXHunter
132c0aa8c7 [HUDI-4101] When BucketIndexPartitioner take partition path for dispersion may cause the fileID of the task to not be loaded correctly (#5763)
Co-authored-by: john.wick <john.wick@vipshop.com>
2022-06-06 21:53:55 +08:00
Danny Chan
22c45a7704 [HUDI-4188] Fix flaky ITTestDataSTreamWrite.testWriteCopyOnWrite (#5749) 2022-06-06 12:12:48 +08:00
Danny Chan
0d069b5e57 [HUDI-4174] Add hive conf dir option for flink sink (#5725) 2022-06-01 16:17:36 +08:00
Bo Cui
93fe5a497e [HUDI-4151] flink split_reader supports rocksdb (#5675)
* [HUDI-4151] flink split_reader supports rocksdb
2022-05-28 08:37:34 +08:00
Sagar Sumit
cf837b4900 [HUDI-3193] Decouple hudi-aws from hudi-client-common (#5666)
Move HoodieMetricsCloudWatchConfig to hudi-client-common
2022-05-25 19:38:56 +05:30
喻兆靖
c20db99a7b [HUDI-2207] Support independent flink hudi clustering function 2022-05-24 20:16:48 +08:00
YuangZhang
3ef137d156 [HUDI-4129] Initializes a new fs view for WriteProfile#reload (#5640)
Co-authored-by: zhangyuang <zhangyuang@corp.netease.com>
2022-05-23 09:57:34 +08:00
Danny Chan
c7576f7613 [HUDI-4130] Remove the upgrade/downgrade for flink #initTable (#5642) 2022-05-20 21:31:23 +08:00
aliceyyan
1da0b21edd [HUDI-4119] the first read result is incorrect when Flink upsert- Kafka connector is used in HUDi (#5626)
* HUDI-4119 the first read result is incorrect when Flink upsert- Kafka connector is used in HUDi

Co-authored-by: aliceyyan <aliceyyan@tencent.com>
2022-05-20 18:10:24 +08:00
Danny Chan
551aa959c5 Revert "[HUDI-3870] Add timeout rollback for flink online compaction (#5314)" (#5622)
This reverts commit 6f9b02decb.
2022-05-18 20:30:54 +08:00
luokey
a1017c66aa Clean the marker files for flink compaction (#5611)
Co-authored-by: 854194341@qq.com <loukey_7821>
2022-05-18 11:21:14 +08:00
Danny Chan
ebbe56e862 [minor] Some code refactoring for LogFileComparator and Instant instantiation (#5600) 2022-05-18 09:30:09 +08:00
Danny Chan
d52d13302d [HUDI-4101] BucketIndexPartitioner should take partition path for better dispersion (#5590) 2022-05-17 10:34:57 +08:00
Danny Chan
fdd96cc97e [HUDI-4104] DeltaWriteProfile includes the pending compaction file slice when deciding small buckets (#5594) 2022-05-17 10:34:15 +08:00
Danny Chan
43e08193ef [HUDI-4098] Metadata table heartbeat for instant has expired, last heartbeat 0 (#5583) 2022-05-16 17:40:08 +08:00
Bo Cui
a704e3740c [HUDI-3336][HUDI-FLINK]Support custom hadoop config for flink (#5574)
* [HUDI-3336][HUDI-FLINK]Support custom hadoop config for flink
2022-05-13 19:52:55 +08:00
Bo Cui
7fb436d3cf [HUDI-4078][HUDI-FLINK]BootstrapOperator contains the pending compact… (#5545)
* [HUDI-4078][HUDI-FLINK]BootstrapOperator contains the pending compaction files
2022-05-13 14:32:48 +08:00
Bo Cui
701f8c039d [HUDI-3336][HUDI-FLINK]Support custom hadoop config for flink (#5528)
* [HUDI-3336][HUDI-FLINK]Support custom hadoop config for flink
2022-05-13 09:50:11 +08:00
aliceyyan
6fd21d0f10 [HUDI-4044] When reading data from flink-hudi to external storage, the … (#5516)
Co-authored-by: aliceyyan <aliceyyan@tencent.com>
2022-05-10 10:25:13 +08:00
xicm
6b47ef6ed2 [HUDI-4053] Flaky ITTestHoodieDataSource.testStreamWriteBatchReadOpti… (#5526)
* [HUDI-4053] Flaky ITTestHoodieDataSource.testStreamWriteBatchReadOptimized

Co-authored-by: xicm <xicm@asiainfo.com>
2022-05-09 16:35:50 +08:00
ForwardXu
4c70840275 [MINOR] Fixing close for HoodieCatalog's test (#5531)
* [MINOR] Fixing close for HoodieCatalog's test
2022-05-09 15:17:24 +08:00
Wangyh
33ff4752ba [HUDI-3978] Fix use of partition path field as hive partition field in flink (#5434)
* Fix partition path fields as hive sync partition fields error
2022-04-29 20:58:54 -07:00
吴祥平
e421d536ea [HUDI-3758] Fix duplicate fileId error in MOR table type with flink bucket hash Index (#5185)
* fix duplicate fileId with bucket Index
* replace to load FileGroup from FileSystemView
2022-04-29 14:10:20 +08:00
Gary Li
b27e8b51d8 [MINOR] support different cleaning policy for flink (#5459) 2022-04-29 09:48:44 +08:00
LiChuang
4e928a6fe1 [HUDI-3943] Some description fixes for 0.10.1 docs (#5447) 2022-04-28 15:18:56 -07:00
Ibson
52953c8f5e [HUDI-3815] Fix docs description of metadata.compaction.delta_commits default value error (#5368)
Co-authored-by: pusheng.li01 <pusheng.li01@liulishuo.com>
2022-04-27 16:09:44 -07:00