Danny Chan
465d553df8
[HUDI-3600] Tweak the default cleaning strategy to be more streaming friendly for flink ( #5010 )
2022-03-14 14:22:07 +08:00
Danny Chan
ec24407191
[HUDI-3581] Reorganize some clazz for hudi flink ( #4983 )
2022-03-10 15:55:15 +08:00
Danny Chan
fe53bd2dea
[HUDI-2677] Add DFS based message queue for flink writer[part3] ( #4961 )
2022-03-08 15:43:21 +08:00
Bo
b6bdb46f7f
[MINOR][HUDI-3460] Fix HoodieDataSourceITCase
...
close #4959
2022-03-08 12:18:43 +08:00
todd5167
34bc752853
[HUDI-3573] flink CleanFunction execute clean on initialization ( #4936 )
...
For flink insert overwrite operation, do the cleaning each time before the write.
2022-03-08 11:53:54 +08:00
Sivabalan Narayanan
6a46130037
[HUDI-2761] Fixing timeline server for repeated refreshes ( #4812 )
...
* Fixing timeline server for repeated refreshes
2022-03-05 10:04:16 +08:00
Bo Cui
0986d5a01d
[HUDI-3460] Add reader merge memory option for flink ( #4911 )
...
* flink TM memory Optimization
2022-03-04 19:29:29 +08:00
Danny Chan
1d57bd17c2
[minor] Cosmetic changes following HUDI-3315 ( #4934 )
2022-03-02 17:44:52 +08:00
Gary Li
10d866f083
[HUDI-3315] RFC-35 Part-1 Support bucket index in Flink writer ( #4679 )
...
* Support bucket index in Flink writer
* Use record key as default index key
2022-03-02 15:14:44 +08:00
yuzhaojing
3b2da9f138
[HUDI-2631] In CompactFunction, set up the write schema each time with the latest schema ( #4000 )
...
Co-authored-by: yuzhaojing <yuzhaojing@bytedance.com>
2022-03-02 11:18:17 +08:00
stayrascal
3cfb52c413
[MINOR] fix get builtin function issue from Hudi catalog ( #4917 )
2022-03-02 11:16:19 +08:00
yuzhaojing
44b8ab6048
[HUDI-3418] Save timeout option for remote RemoteFileSystemView ( #4809 )
...
Co-authored-by: yuzhaojing <yuzhaojing@bytedance.com>
2022-02-28 15:16:40 -05:00
Bo Cui
193215201c
[MINOR] Change MINI_BATCH_SIZE to 2048 ( #4862 )
...
ParquetColumnarRowSplitReader#batchSize is 2048, so changing MINI_BATCH_SIZE to 2048 will reduce the memory cache.
2022-02-28 10:45:28 +08:00
Raymond Xu
c77b2591d0
[HUDI-2439] Remove SparkBoundedInMemoryExecutor ( #4860 )
2022-02-26 08:02:12 -05:00
Danny Chan
a4ee7463ae
[HUDI-3474] Add more document to Pipelines for the usage of this tool to build a write pipeline ( #4906 )
2022-02-25 19:08:51 +08:00
yanenze
943b99775b
[HUDI-3488] The flink small file list should exclude file slices with pending compaction ( #4893 )
...
# this happens when the async-compaction has been configured
Co-authored-by: yanenze <yanenze@keytop.com.cn>
2022-02-24 14:45:03 +08:00
Danny Chan
4affdd0c8f
[HUDI-3461] The archived timeline for flink streaming reader should not be reused ( #4861 )
...
* Before the patch, the Flink streaming reader cached the meta client, and with it the archived timeline;
fetching instant details from the reused timeline then threw an exception
* Add a method in HoodieTableMetaClient to return a fresh archived timeline each time
2022-02-22 15:54:29 +08:00
Bo Cui
83279971a1
[HUDI-3446] Supports batch reader in BootstrapOperator#loadRecords ( #4837 )
...
* [HUDI-3446] Supports batch Reader in BootstrapOperator#loadRecords
2022-02-19 21:21:48 +08:00
stayrascal
f15125c0cd
[HUDI-3389] fix ColumnarArrayData ClassCastException issue ( #4842 )
...
* [HUDI-3389] fix ColumnarArrayData ClassCastException issue
* [HUDI-3389] remove MapColumnVector.java, RowColumnVector.java, and add test case for array<int> field
2022-02-19 10:56:41 +08:00
RexAn
5009138d04
[HUDI-3438] Avoid getSmallFiles if hoodie.parquet.small.file.limit is 0 ( #4823 )
...
Co-authored-by: Hui An <hui.an@shopee.com>
2022-02-18 08:57:04 -05:00
zhangxiang17
433c2573ef
[HUDI-3442] Duplicate code calls for 'FlinkOptions.flatOptions' ( #4832 )
2022-02-17 11:04:09 +08:00
Alexey Kudinkin
aaddaf524a
[HUDI-3280] Cleaning up Hive-related hierarchies after refactoring ( #4743 )
2022-02-16 15:36:37 -08:00
Raymond Xu
538ec44fa8
[HUDI-2931] Add config to disable table services ( #4777 )
2022-02-15 09:49:53 -05:00
Yann Byron
cb6ca7f0d1
[HUDI-3204] fix problem that spark on TimestampKeyGenerator has no re… ( #4714 )
2022-02-14 23:38:38 -05:00
YueZhang
76e2faa28d
[HUDI-3370] The files recorded in the commit may not match the actual ones for MOR Compaction ( #4753 )
...
* use HoodieCommitMetadata to replace writeStatuses computation
Co-authored-by: yuezhang <yuezhang@freewheel.tv>
2022-02-14 11:12:52 +08:00
Danny Chan
b3b44236fe
[HUDI-3389] Bump flink version to 1.14.3 ( #4776 )
2022-02-10 11:32:01 +08:00
ForwardXu
773b317983
[HUDI-2941] Show _hoodie_operation in spark sql results ( #4649 )
2022-02-07 06:28:13 -08:00
Y Ethan Guo
b8601a9f58
[HUDI-2656] Generalize HoodieIndex for flexible record data type ( #3893 )
...
Co-authored-by: Raymond Xu <2701446+xushiyan@users.noreply.github.com>
2022-02-03 20:24:04 -08:00
todd5167
2969fb3835
[HUDI-3233] Make metadata commit synchronous for flink batch
...
close apache/hudi#4561
2022-01-12 20:22:53 +08:00
Town
4b0111974f
[HUDI-3184] hudi-flink support timestamp-micros ( #4548 )
...
* support both avro and parquet code path
* string rowdata conversion is also supported
2022-01-12 10:53:51 +08:00
Sagar Sumit
827549949c
[HUDI-2909] Handle logical type in TimestampBasedKeyGenerator ( #4203 )
...
* [HUDI-2909] Handle logical type in TimestampBasedKeyGenerator
The timestamp-based key generator was returning different values for the row-writer and non-row-writer paths. This patch fixes it and is guarded by a config flag (`hoodie.datasource.write.keygenerator.consistent.logical.timestamp.enabled`)
2022-01-08 10:22:44 -05:00
fengli
205e48f53f
[HUDI-3132] Minor fixes for HoodieCatalog
...
close apache/hudi#4486
2022-01-06 11:17:23 +08:00
yuzhaojing
e88b5fd450
[HUDI-3120] Cache compactionPlan in buffer ( #4463 )
...
Co-authored-by: yuzhaojing <yuzhaojing@bytedance.com>
2021-12-31 13:12:32 +08:00
yuzhaojing
0f0088fe4b
[HUDI-3124] Bootstrap when timeline have completed instant ( #4467 )
...
Co-authored-by: yuzhaojing <yuzhaojing@bytedance.com>
2021-12-30 11:54:34 +08:00
Ron
674c149234
[HUDI-3083] Support component data types for flink bulk_insert ( #4470 )
...
* [HUDI-3083] Support component data types for flink bulk_insert
* add nested row type test
2021-12-30 11:15:54 +08:00
Sivabalan Narayanan
5c0e4ce005
Revert "[HUDI-3043] Revert async cleaner leak commit to unblock CI failure ( #4343 )" ( #4465 )
...
This reverts commit 7e7ad1558c.
2021-12-30 10:45:09 +08:00
yuzhaojing
15eb7e81fc
[HUDI-2547] Schedule Flink compaction in service ( #4254 )
...
Co-authored-by: yuzhaojing <yuzhaojing@bytedance.com>
2021-12-22 15:08:47 +08:00
Danny Chan
d0087d4040
[HUDI-3037] Add back remote view storage config for flink ( #4338 )
2021-12-17 13:57:53 +08:00
Sivabalan Narayanan
7e7ad1558c
[HUDI-3043] Revert async cleaner leak commit to unblock CI failure ( #4343 )
...
* Revert "[HUDI-2959] Fix the thread leak of cleaning service (#4252)"
Reverting to unblock the CI failure for now. Will revisit this with the right fix.
2021-12-16 21:51:28 -05:00
Fugle666
29bc5fd912
[HUDI-2996] Flink streaming reader 'skip_compaction' option does not work ( #4304 )
...
close apache/hudi#4304
2021-12-14 12:21:09 +08:00
Danny Chan
8dd0444ef9
[HUDI-2984] Implement #close for AbstractTableFileSystemView ( #4285 )
2021-12-11 16:19:10 +08:00
Danny Chan
2dcb3f0062
[HUDI-2985] Shade jackson for hudi flink bundle jar ( #4284 )
2021-12-11 14:40:57 +08:00
Danny Chan
9bdcee00c0
[HUDI-2959] Fix the thread leak of cleaning service ( #4252 )
2021-12-11 12:08:47 +08:00
yuzhaojing
3ad9b121f1
[HUDI-2912] Fix CompactionPlanOperator typo ( #4187 )
...
Co-authored-by: yuzhaojing <yuzhaojing@bytedance.com>
2021-12-10 09:32:53 -08:00
Danny Chan
bd08470421
[HUDI-2957] Shade kryo jar for flink bundle jar ( #4251 )
2021-12-09 10:16:42 +08:00
Danny Chan
e8473b9a2b
[HUDI-2951] Disable remote view storage config for flink ( #4237 )
2021-12-07 18:04:15 +08:00
Ron
a8fb69656f
[HUDI-2877] Support flink catalog to help user use flink table conveniently ( #4153 )
...
* [HUDI-2877] Support flink catalog to help user use flink table conveniently
* Fix comment
* fix comment2
2021-12-05 10:14:29 +08:00
Danny Chan
0699521f83
[HUDI-2924] Refresh the fs view on successful checkpoints for write profile ( #4199 )
2021-12-03 16:12:59 +08:00
Danny Chan
f74b3d12aa
[minor] Refactor write profile to always generate fs view ( #4198 )
2021-12-03 11:38:29 +08:00
Danny Chan
934fe54cc5
[HUDI-2914] Fix remote timeline server config for flink ( #4191 )
2021-12-03 08:59:10 +08:00