Danny Chan
5c1b482a1b
[HUDI-3741] Fix flink bucket index bulk insert generates too many small files (#5164)
2022-03-30 08:18:36 +08:00
Danny Chan
3bf9c5ffe8
[HUDI-3728] Set the sort operator parallelism for flink bucket bulk insert (#5154)
2022-03-29 09:52:35 +08:00
Shawy Geng
2e2d08cb72
[HUDI-3539] Flink bucket index bucketID bootstrap optimization. (#5093)
Co-authored-by: gengxiaoyu <gengxiaoyu@bytedance.com>
2022-03-28 19:50:36 +08:00
Danny Chan
4d940bbf8a
[HUDI-3716] OOM occurred when use bulk_insert cow table with flink BUCKET index (#5135)
2022-03-27 09:13:58 +08:00
Zhaojing Yu
483ee843e6
[HUDI-3703] Reset taskID in restoreWriteMetadata (#5122)
2022-03-25 10:18:28 +08:00
Danny Chan
5e86cdd1e9
[HUDI-3701] Flink bulk_insert support bucket hash index (#5118)
2022-03-25 09:01:42 +08:00
Danny Chan
a1c42fcc07
[minor] Checks the data block type for archived timeline (#5106)
2022-03-24 14:10:43 +08:00
wxp4532
26e5d2e6fc
[HUDI-3559] Flink bucket index with COW table throws NoSuchElementException
The method FlinkWriteHelper#deduplicateRecords does not guarantee the order of the records, but there is an
implicit constraint: all the records in one bucket should have the same bucket type (the instant time here).
The BucketStreamWriteFunction fails to comply with this constraint.
Closes apache/hudi#5018
2022-03-21 17:34:54 +08:00
Danny Chan
799c78e688
[HUDI-3665] Support flink multiple versions (#5072)
2022-03-21 10:34:50 +08:00