Danny Chan
22c45a7704
[HUDI-4188] Fix flaky ITTestDataStreamWrite.testWriteCopyOnWrite ( #5749 )
2022-06-06 12:12:48 +08:00
Danny Chan
0d069b5e57
[HUDI-4174] Add hive conf dir option for flink sink ( #5725 )
2022-06-01 16:17:36 +08:00
Bo Cui
93fe5a497e
[HUDI-4151] flink split_reader supports rocksdb ( #5675 )
2022-05-28 08:37:34 +08:00
Sagar Sumit
cf837b4900
[HUDI-3193] Decouple hudi-aws from hudi-client-common ( #5666 )
Move HoodieMetricsCloudWatchConfig to hudi-client-common
2022-05-25 19:38:56 +05:30
喻兆靖
c20db99a7b
[HUDI-2207] Support independent flink hudi clustering function
2022-05-24 20:16:48 +08:00
YuangZhang
3ef137d156
[HUDI-4129] Initializes a new fs view for WriteProfile#reload ( #5640 )
Co-authored-by: zhangyuang <zhangyuang@corp.netease.com>
2022-05-23 09:57:34 +08:00
Danny Chan
c7576f7613
[HUDI-4130] Remove the upgrade/downgrade for flink #initTable ( #5642 )
2022-05-20 21:31:23 +08:00
aliceyyan
1da0b21edd
[HUDI-4119] The first read result is incorrect when the Flink upsert-kafka connector is used in Hudi ( #5626 )
Co-authored-by: aliceyyan <aliceyyan@tencent.com>
2022-05-20 18:10:24 +08:00
Danny Chan
551aa959c5
Revert "[HUDI-3870] Add timeout rollback for flink online compaction ( #5314 )" ( #5622 )
This reverts commit 6f9b02decb.
2022-05-18 20:30:54 +08:00
luokey
a1017c66aa
Clean the marker files for flink compaction ( #5611 )
Co-authored-by: 854194341@qq.com <loukey_7821>
2022-05-18 11:21:14 +08:00
Danny Chan
ebbe56e862
[minor] Some code refactoring for LogFileComparator and Instant instantiation ( #5600 )
2022-05-18 09:30:09 +08:00
Danny Chan
d52d13302d
[HUDI-4101] BucketIndexPartitioner should take partition path for better dispersion ( #5590 )
2022-05-17 10:34:57 +08:00
Danny Chan
fdd96cc97e
[HUDI-4104] DeltaWriteProfile includes the pending compaction file slice when deciding small buckets ( #5594 )
2022-05-17 10:34:15 +08:00
Danny Chan
43e08193ef
[HUDI-4098] Metadata table heartbeat for instant has expired, last heartbeat 0 ( #5583 )
2022-05-16 17:40:08 +08:00
Bo Cui
a704e3740c
[HUDI-3336][HUDI-FLINK] Support custom hadoop config for flink ( #5574 )
2022-05-13 19:52:55 +08:00
Bo Cui
7fb436d3cf
[HUDI-4078][HUDI-FLINK] BootstrapOperator contains the pending compaction files ( #5545 )
2022-05-13 14:32:48 +08:00
Bo Cui
701f8c039d
[HUDI-3336][HUDI-FLINK] Support custom hadoop config for flink ( #5528 )
2022-05-13 09:50:11 +08:00
aliceyyan
6fd21d0f10
[HUDI-4044] When reading data from flink-hudi to external storage, the … ( #5516 )
Co-authored-by: aliceyyan <aliceyyan@tencent.com>
2022-05-10 10:25:13 +08:00
xicm
6b47ef6ed2
[HUDI-4053] Flaky ITTestHoodieDataSource.testStreamWriteBatchReadOptimized ( #5526 )
Co-authored-by: xicm <xicm@asiainfo.com>
2022-05-09 16:35:50 +08:00
ForwardXu
4c70840275
[MINOR] Fixing close for HoodieCatalog's test ( #5531 )
2022-05-09 15:17:24 +08:00
Wangyh
33ff4752ba
[HUDI-3978] Fix use of partition path field as hive partition field in flink ( #5434 )
* Fix partition path fields as hive sync partition fields error
2022-04-29 20:58:54 -07:00
吴祥平
e421d536ea
[HUDI-3758] Fix duplicate fileId error in MOR table type with flink bucket hash Index ( #5185 )
* Fix duplicate fileId with the bucket index
* Load the FileGroup from FileSystemView instead
2022-04-29 14:10:20 +08:00
Gary Li
b27e8b51d8
[MINOR] support different cleaning policy for flink ( #5459 )
2022-04-29 09:48:44 +08:00
LiChuang
4e928a6fe1
[HUDI-3943] Some description fixes for 0.10.1 docs ( #5447 )
2022-04-28 15:18:56 -07:00
Ibson
52953c8f5e
[HUDI-3815] Fix docs description of metadata.compaction.delta_commits default value error ( #5368 )
Co-authored-by: pusheng.li01 <pusheng.li01@liulishuo.com>
2022-04-27 16:09:44 -07:00
Danny Chan
e1ccf2e00b
[HUDI-3977] Flink hudi table with date type partition path throws HoodieNotSupportedException ( #5432 )
2022-04-27 13:19:55 +08:00
ForwardXu
9054b85961
Revert "[HUDI-3951] Support general parameter 'sink.parallelism' for flink-hudi ( #5405 )" ( #5421 )
This reverts commit bda3db078e.
2022-04-25 12:58:27 +08:00
Ruguo Yu
d994c58cc0
[HUDI-3946] Validate option path in flink hudi sink ( #5397 )
2022-04-25 10:13:47 +08:00
hehuiyuan
bda3db078e
Support general parameter 'sink.parallelism' for flink-hudi ( #5405 )
Co-authored-by: hehuiyuan1 <hehuiyuan@jd.com>
2022-04-24 19:09:39 +08:00
吴祥平
408663c42b
[HUDI-3912] Fix data loss on rollback in flink async compaction ( #5357 )
* Stop adding events when there is a failed compaction event
Co-authored-by: wxp <wxp4532@outlook.com>
2022-04-20 19:23:39 +08:00
Danny Chan
7a9e411e9d
[HUDI-3917] Flink write task hangs if last checkpoint has no data input ( #5360 )
2022-04-20 12:48:24 +08:00
董可伦
b8e465fdfc
[MINOR] Fix typos in log4j-surefire.properties ( #5212 )
2022-04-15 13:33:37 -07:00
Danny Chan
6f9b02decb
[HUDI-3870] Add timeout rollback for flink online compaction ( #5314 )
2022-04-13 20:05:48 +08:00
Danny Chan
0281725c6b
[MINOR] Inline the partition path logic into the builder ( #5310 )
2022-04-13 16:54:39 +05:30
Danny Chan
43de2b4702
[HUDI-3868] Disable the sort input for flink streaming append mode ( #5309 )
2022-04-13 14:21:08 +08:00
Sagar Sumit
df87095ef0
[HUDI-3454] Fix partition name in all code paths for LogRecordScanner ( #5252 )
* Depend on FSUtils#getRelativePartitionPath(basePath, logFilePath.getParent) to get the partition.
* If the list of log file paths in the split is empty, then fall back to the usual behaviour.
2022-04-08 09:59:36 +05:30
xiarixiaoyao
531381faff
[HUDI-3096] Fix the bug where a COW table (containing DecimalType) written by Flink cannot be read by Spark ( #4421 )
2022-04-07 17:21:25 +08:00
Danny Chan
e33149be9a
[HUDI-3808] Flink bulk_insert timestamp(3) can not be read by Spark ( #5236 )
2022-04-07 15:17:39 +08:00
Raymond Xu
e96f08f355
Moving to 0.12.0-SNAPSHOT on master branch.
2022-04-06 15:24:10 +08:00
todd5167
eef3f9c74a
[HUDI-3771] flink supports sync table information to aws glue ( #5202 )
2022-04-02 21:16:10 +08:00
Bo Cui
17d11f4839
[MINOR] Repeated execution of update status ( #5089 )
2022-03-30 17:30:06 -04:00
Danny Chan
b9fbada2f2
[minor] Follow 3178, fix the flink metadata table compaction ( #5175 )
2022-03-30 20:45:29 +08:00
Danny Chan
5c1b482a1b
[HUDI-3741] Fix flink bucket index bulk insert generates too many small files ( #5164 )
2022-03-30 08:18:36 +08:00
Danny Chan
3bf9c5ffe8
[HUDI-3728] Set the sort operator parallelism for flink bucket bulk insert ( #5154 )
2022-03-29 09:52:35 +08:00
Shawy Geng
2e2d08cb72
[HUDI-3539] Flink bucket index bucketID bootstrap optimization. ( #5093 )
Co-authored-by: gengxiaoyu <gengxiaoyu@bytedance.com>
2022-03-28 19:50:36 +08:00
Danny Chan
4d940bbf8a
[HUDI-3716] OOM occurs when using bulk_insert on a COW table with the flink BUCKET index ( #5135 )
2022-03-27 09:13:58 +08:00
Zhaojing Yu
483ee843e6
[HUDI-3703] Reset taskID in restoreWriteMetadata ( #5122 )
2022-03-25 10:18:28 +08:00
Danny Chan
5e86cdd1e9
[HUDI-3701] Flink bulk_insert support bucket hash index ( #5118 )
2022-03-25 09:01:42 +08:00
Danny Chan
a1c42fcc07
[minor] Checks the data block type for archived timeline ( #5106 )
2022-03-24 14:10:43 +08:00
wxp4532
26e5d2e6fc
[HUDI-3559] Flink bucket index with COW table throws NoSuchElementException
Actually, the method FlinkWriteHelper#deduplicateRecords does not guarantee the record sequence, but there is an
implicit constraint: all the records in one bucket should have the same bucket type (instant time here).
The BucketStreamWriteFunction breaks this constraint.
close apache/hudi#5018
2022-03-21 17:34:54 +08:00