50 Commits

Author SHA1 Message Date
Danny Chan
0d069b5e57 [HUDI-4174] Add hive conf dir option for flink sink (#5725) 2022-06-01 16:17:36 +08:00
Bo Cui
93fe5a497e [HUDI-4151] flink split_reader supports rocksdb (#5675)
2022-05-28 08:37:34 +08:00
Sagar Sumit
cf837b4900 [HUDI-3193] Decouple hudi-aws from hudi-client-common (#5666)
Move HoodieMetricsCloudWatchConfig to hudi-client-common
2022-05-25 19:38:56 +05:30
喻兆靖
c20db99a7b [HUDI-2207] Support independent flink hudi clustering function 2022-05-24 20:16:48 +08:00
YuangZhang
3ef137d156 [HUDI-4129] Initializes a new fs view for WriteProfile#reload (#5640)
Co-authored-by: zhangyuang <zhangyuang@corp.netease.com>
2022-05-23 09:57:34 +08:00
Danny Chan
c7576f7613 [HUDI-4130] Remove the upgrade/downgrade for flink #initTable (#5642) 2022-05-20 21:31:23 +08:00
aliceyyan
1da0b21edd [HUDI-4119] the first read result is incorrect when Flink upsert- Kafka connector is used in HUDi (#5626)
* HUDI-4119 the first read result is incorrect when Flink upsert- Kafka connector is used in HUDi

Co-authored-by: aliceyyan <aliceyyan@tencent.com>
2022-05-20 18:10:24 +08:00
Danny Chan
551aa959c5 Revert "[HUDI-3870] Add timeout rollback for flink online compaction (#5314)" (#5622)
This reverts commit 6f9b02decb.
2022-05-18 20:30:54 +08:00
luokey
a1017c66aa Clean the marker files for flink compaction (#5611)
Co-authored-by: 854194341@qq.com <loukey_7821>
2022-05-18 11:21:14 +08:00
Danny Chan
ebbe56e862 [minor] Some code refactoring for LogFileComparator and Instant instantiation (#5600) 2022-05-18 09:30:09 +08:00
Danny Chan
d52d13302d [HUDI-4101] BucketIndexPartitioner should take partition path for better dispersion (#5590) 2022-05-17 10:34:57 +08:00
Danny Chan
fdd96cc97e [HUDI-4104] DeltaWriteProfile includes the pending compaction file slice when deciding small buckets (#5594) 2022-05-17 10:34:15 +08:00
Danny Chan
43e08193ef [HUDI-4098] Metadata table heartbeat for instant has expired, last heartbeat 0 (#5583) 2022-05-16 17:40:08 +08:00
Bo Cui
a704e3740c [HUDI-3336][HUDI-FLINK]Support custom hadoop config for flink (#5574)
2022-05-13 19:52:55 +08:00
Bo Cui
7fb436d3cf [HUDI-4078][HUDI-FLINK]BootstrapOperator contains the pending compact… (#5545)
* [HUDI-4078][HUDI-FLINK]BootstrapOperator contains the pending compaction files
2022-05-13 14:32:48 +08:00
Bo Cui
701f8c039d [HUDI-3336][HUDI-FLINK]Support custom hadoop config for flink (#5528)
2022-05-13 09:50:11 +08:00
aliceyyan
6fd21d0f10 [HUDI-4044] When reading data from flink-hudi to external storage, the … (#5516)
Co-authored-by: aliceyyan <aliceyyan@tencent.com>
2022-05-10 10:25:13 +08:00
xicm
6b47ef6ed2 [HUDI-4053] Flaky ITTestHoodieDataSource.testStreamWriteBatchReadOpti… (#5526)
* [HUDI-4053] Flaky ITTestHoodieDataSource.testStreamWriteBatchReadOptimized

Co-authored-by: xicm <xicm@asiainfo.com>
2022-05-09 16:35:50 +08:00
ForwardXu
4c70840275 [MINOR] Fixing close for HoodieCatalog's test (#5531)
2022-05-09 15:17:24 +08:00
Wangyh
33ff4752ba [HUDI-3978] Fix use of partition path field as hive partition field in flink (#5434)
* Fix partition path fields as hive sync partition fields error
2022-04-29 20:58:54 -07:00
吴祥平
e421d536ea [HUDI-3758] Fix duplicate fileId error in MOR table type with flink bucket hash Index (#5185)
* fix duplicate fileId with bucket Index
* replace to load FileGroup from FileSystemView
2022-04-29 14:10:20 +08:00
Gary Li
b27e8b51d8 [MINOR] support different cleaning policy for flink (#5459) 2022-04-29 09:48:44 +08:00
LiChuang
4e928a6fe1 [HUDI-3943] Some description fixes for 0.10.1 docs (#5447) 2022-04-28 15:18:56 -07:00
Ibson
52953c8f5e [HUDI-3815] Fix docs description of metadata.compaction.delta_commits default value error (#5368)
Co-authored-by: pusheng.li01 <pusheng.li01@liulishuo.com>
2022-04-27 16:09:44 -07:00
Danny Chan
e1ccf2e00b [HUDI-3977] Flink hudi table with date type partition path throws HoodieNotSupportedException (#5432) 2022-04-27 13:19:55 +08:00
ForwardXu
9054b85961 Revert "[HUDI-3951]support generan parameter 'sink.parallelism' for flink-hudi (#5405)" (#5421)
This reverts commit bda3db078e.
2022-04-25 12:58:27 +08:00
Ruguo Yu
d994c58cc0 [HUDI-3946] Validate option path in flink hudi sink (#5397) 2022-04-25 10:13:47 +08:00
hehuiyuan
bda3db078e support generan parameter 'sink.parallelism' for flink-hudi (#5405)
Co-authored-by: hehuiyuan1 <hehuiyuan@jd.com>
2022-04-24 19:09:39 +08:00
吴祥平
408663c42b [HUDI-3912] Fix lose data when rollback in flink async compact (#5357)
* stop add event when has failed compact event

Co-authored-by: wxp <wxp4532@outlook.com>
2022-04-20 19:23:39 +08:00
Danny Chan
7a9e411e9d [HUDI-3917] Flink write task hangs if last checkpoint has no data input (#5360) 2022-04-20 12:48:24 +08:00
董可伦
b8e465fdfc [MINOR] Fix typos in log4j-surefire.properties (#5212) 2022-04-15 13:33:37 -07:00
Danny Chan
6f9b02decb [HUDI-3870] Add timeout rollback for flink online compaction (#5314) 2022-04-13 20:05:48 +08:00
Danny Chan
0281725c6b [MINOR] Inline the partition path logic into the builder (#5310) 2022-04-13 16:54:39 +05:30
Danny Chan
43de2b4702 [HUDI-3868] Disable the sort input for flink streaming append mode (#5309) 2022-04-13 14:21:08 +08:00
Sagar Sumit
df87095ef0 [HUDI-3454] Fix partition name in all code paths for LogRecordScanner (#5252)
* Depend on FSUtils#getRelativePartitionPath(basePath, logFilePath.getParent) to get the partition.

* If the list of log file paths in the split is empty, then fallback to usual behaviour.
2022-04-08 09:59:36 +05:30
xiarixiaoyao
531381faff [HUDI-3096] fixed the bug that the cow table(contains decimalType) write by flink cannot be read by spark. (#4421) 2022-04-07 17:21:25 +08:00
Danny Chan
e33149be9a [HUDI-3808] Flink bulk_insert timestamp(3) can not be read by Spark (#5236) 2022-04-07 15:17:39 +08:00
Raymond Xu
e96f08f355 Moving to 0.12.0-SNAPSHOT on master branch. 2022-04-06 15:24:10 +08:00
todd5167
eef3f9c74a [HUDI-3771] flink supports sync table information to aws glue (#5202) 2022-04-02 21:16:10 +08:00
Bo Cui
17d11f4839 [MINOR] Repeated execution of update status (#5089) 2022-03-30 17:30:06 -04:00
Danny Chan
b9fbada2f2 [minor] Follow 3178, fix the flink metadata table compaction (#5175) 2022-03-30 20:45:29 +08:00
Danny Chan
5c1b482a1b [HUDI-3741] Fix flink bucket index bulk insert generates too many small files (#5164) 2022-03-30 08:18:36 +08:00
Danny Chan
3bf9c5ffe8 [HUDI-3728] Set the sort operator parallelism for flink bucket bulk insert (#5154) 2022-03-29 09:52:35 +08:00
Shawy Geng
2e2d08cb72 [HUDI-3539] Flink bucket index bucketID bootstrap optimization. (#5093)
Co-authored-by: gengxiaoyu <gengxiaoyu@bytedance.com>
2022-03-28 19:50:36 +08:00
Danny Chan
4d940bbf8a [HUDI-3716] OOM occurred when use bulk_insert cow table with flink BUCKET index (#5135) 2022-03-27 09:13:58 +08:00
Zhaojing Yu
483ee843e6 [HUDI-3703] Reset taskID in restoreWriteMetadata (#5122) 2022-03-25 10:18:28 +08:00
Danny Chan
5e86cdd1e9 [HUDI-3701] Flink bulk_insert support bucket hash index (#5118) 2022-03-25 09:01:42 +08:00
Danny Chan
a1c42fcc07 [minor] Checks the data block type for archived timeline (#5106) 2022-03-24 14:10:43 +08:00
wxp4532
26e5d2e6fc [HUDI-3559] Flink bucket index with COW table throws NoSuchElementException
The method FlinkWriteHelper#deduplicateRecords does not guarantee the record sequence, but there is an
implicit constraint: all the records in one bucket should have the same bucket type (instant time here).
The BucketStreamWriteFunction fails to comply with this constraint.

close apache/hudi#5018
2022-03-21 17:34:54 +08:00
Danny Chan
799c78e688 [HUDI-3665] Support flink multiple versions (#5072) 2022-03-21 10:34:50 +08:00