Danny Chan
fe53bd2dea
[HUDI-2677] Add DFS based message queue for flink writer[part3] ( #4961 )
2022-03-08 15:43:21 +08:00
Bo
b6bdb46f7f
[MINOR][HUDI-3460] Fix HoodieDataSourceITCase
...
close #4959
2022-03-08 12:18:43 +08:00
todd5167
34bc752853
[HUDI-3573] flink CleanFunction execute clean on initialization ( #4936 )
...
For flink insert overwrite operation, do the cleaning each time before the write.
2022-03-08 11:53:54 +08:00
Alexey Kudinkin
a66fd40692
[HUDI-3365] Make sure Metadata Table records are updated appropriately on HDFS ( #4739 )
...
- This change makes sure MT records are updated appropriately on HDFS: previously, after Log File append operations, MT records were updated with just the size of the deltas being appended to the original files, which was found to cause issues with Rollbacks, which instead updated MT with records bearing the full file size.
- To hedge against similar issues going forward, this PR removes this discrepancy and streamlines the flow so that the MT always ingests records bearing full file sizes.
2022-03-07 15:38:27 -05:00
Sivabalan Narayanan
6a46130037
[HUDI-2761] Fixing timeline server for repeated refreshes ( #4812 )
...
* Fixing timeline server for repeated refreshes
2022-03-05 10:04:16 +08:00
Bo Cui
0986d5a01d
[HUDI-3460] Add reader merge memory option for flink ( #4911 )
...
* flink TM memory Optimization
2022-03-04 19:29:29 +08:00
Danny Chan
1d57bd17c2
[minor] Cosmetic changes following HUDI-3315 ( #4934 )
2022-03-02 17:44:52 +08:00
Gary Li
10d866f083
[HUDI-3315] RFC-35 Part-1 Support bucket index in Flink writer ( #4679 )
...
* Support bucket index in Flink writer
* Use record key as default index key
2022-03-02 15:14:44 +08:00
yuzhaojing
3b2da9f138
[HUDI-2631] In CompactFunction, set up the write schema each time with the latest schema ( #4000 )
...
Co-authored-by: yuzhaojing <yuzhaojing@bytedance.com>
2022-03-02 11:18:17 +08:00
stayrascal
3cfb52c413
[MINOR] fix get builtin function issue from Hudi catalog ( #4917 )
2022-03-02 11:16:19 +08:00
yuzhaojing
44b8ab6048
[HUDI-3418] Save timeout option for remote RemoteFileSystemView ( #4809 )
...
Co-authored-by: yuzhaojing <yuzhaojing@bytedance.com>
2022-02-28 15:16:40 -05:00
Bo Cui
193215201c
[MINOR] Change MINI_BATCH_SIZE to 2048 ( #4862 )
...
ParquetColumnarRowSplitReader#batchSize is 2048, so changing MINI_BATCH_SIZE to 2048 will reduce the memory cache.
2022-02-28 10:45:28 +08:00
Raymond Xu
c77b2591d0
[HUDI-2439] Remove SparkBoundedInMemoryExecutor ( #4860 )
2022-02-26 08:02:12 -05:00
Danny Chan
a4ee7463ae
[HUDI-3474] Add more documentation to Pipelines on using this tool to build a write pipeline ( #4906 )
2022-02-25 19:08:51 +08:00
yanenze
943b99775b
[HUDI-3488] The flink small file list should exclude file slices with pending compaction ( #4893 )
...
# this happens when the async-compaction has been configured
Co-authored-by: yanenze <yanenze@keytop.com.cn>
2022-02-24 14:45:03 +08:00
Danny Chan
4affdd0c8f
[HUDI-3461] The archived timeline for flink streaming reader should not be reused ( #4861 )
...
* Before the patch, the flink streaming reader cached the meta client and thus the archived timeline;
fetching the instant details from the reused timeline threw an exception
* Add a method in HoodieTableMetaClient to return a fresh new archived timeline each time
2022-02-22 15:54:29 +08:00
Bo Cui
83279971a1
[HUDI-3446] Supports batch reader in BootstrapOperator#loadRecords ( #4837 )
...
* [HUDI-3446] Supports batch Reader in BootstrapOperator#loadRecords
2022-02-19 21:21:48 +08:00
stayrascal
f15125c0cd
[HUDI-3389] fix ColumnarArrayData ClassCastException issue ( #4842 )
...
* [HUDI-3389] fix ColumnarArrayData ClassCastException issue
* [HUDI-3389] remove MapColumnVector.java, RowColumnVector.java, and add test case for array<int> field
2022-02-19 10:56:41 +08:00
RexAn
5009138d04
[HUDI-3438] Avoid getSmallFiles if hoodie.parquet.small.file.limit is 0 ( #4823 )
...
Co-authored-by: Hui An <hui.an@shopee.com>
2022-02-18 08:57:04 -05:00
zhangxiang17
433c2573ef
[HUDI-3442] Duplicate code calls for 'FlinkOptions.flatOptions' ( #4832 )
2022-02-17 11:04:09 +08:00
Alexey Kudinkin
aaddaf524a
[HUDI-3280] Cleaning up Hive-related hierarchies after refactoring ( #4743 )
2022-02-16 15:36:37 -08:00
Raymond Xu
538ec44fa8
[HUDI-2931] Add config to disable table services ( #4777 )
2022-02-15 09:49:53 -05:00
Yann Byron
cb6ca7f0d1
[HUDI-3204] fix problem that spark on TimestampKeyGenerator has no re… ( #4714 )
2022-02-14 23:38:38 -05:00
YueZhang
76e2faa28d
[HUDI-3370] The files recorded in the commit may not match the actual ones for MOR Compaction ( #4753 )
...
* use HoodieCommitMetadata to replace writeStatuses computation
Co-authored-by: yuezhang <yuezhang@freewheel.tv>
2022-02-14 11:12:52 +08:00
Danny Chan
b3b44236fe
[HUDI-3389] Bump flink version to 1.14.3 ( #4776 )
2022-02-10 11:32:01 +08:00
ForwardXu
773b317983
[HUDI-2941] Show _hoodie_operation in spark sql results ( #4649 )
2022-02-07 06:28:13 -08:00
Y Ethan Guo
b8601a9f58
[HUDI-2656] Generalize HoodieIndex for flexible record data type ( #3893 )
...
Co-authored-by: Raymond Xu <2701446+xushiyan@users.noreply.github.com>
2022-02-03 20:24:04 -08:00
todd5167
2969fb3835
[HUDI-3233] Make metadata commit synchronous for flink batch
...
close apache/hudi#4561
2022-01-12 20:22:53 +08:00
Town
4b0111974f
[HUDI-3184] hudi-flink support timestamp-micros ( #4548 )
...
* support both avro and parquet code path
* string rowdata conversion is also supported
2022-01-12 10:53:51 +08:00
Sagar Sumit
827549949c
[HUDI-2909] Handle logical type in TimestampBasedKeyGenerator ( #4203 )
...
* [HUDI-2909] Handle logical type in TimestampBasedKeyGenerator
The timestamp-based key generator was returning different values for the row writer and non-row-writer paths. This patch fixes it, guarded by a config flag (`hoodie.datasource.write.keygenerator.consistent.logical.timestamp.enabled`).
2022-01-08 10:22:44 -05:00
fengli
205e48f53f
[HUDI-3132] Minor fixes for HoodieCatalog
...
close apache/hudi#4486
2022-01-06 11:17:23 +08:00
yuzhaojing
e88b5fd450
[HUDI-3120] Cache compactionPlan in buffer ( #4463 )
...
Co-authored-by: yuzhaojing <yuzhaojing@bytedance.com>
2021-12-31 13:12:32 +08:00
yuzhaojing
0f0088fe4b
[HUDI-3124] Bootstrap when timeline has completed instant ( #4467 )
...
Co-authored-by: yuzhaojing <yuzhaojing@bytedance.com>
2021-12-30 11:54:34 +08:00
Ron
674c149234
[HUDI-3083] Support component data types for flink bulk_insert ( #4470 )
...
* [HUDI-3083] Support component data types for flink bulk_insert
* add nested row type test
2021-12-30 11:15:54 +08:00
Sivabalan Narayanan
5c0e4ce005
Revert "[HUDI-3043] Revert async cleaner leak commit to unblock CI failure ( #4343 )" ( #4465 )
...
This reverts commit 7e7ad1558c.
2021-12-30 10:45:09 +08:00
yuzhaojing
15eb7e81fc
[HUDI-2547] Schedule Flink compaction in service ( #4254 )
...
Co-authored-by: yuzhaojing <yuzhaojing@bytedance.com>
2021-12-22 15:08:47 +08:00
Danny Chan
d0087d4040
[HUDI-3037] Add back remote view storage config for flink ( #4338 )
2021-12-17 13:57:53 +08:00
Sivabalan Narayanan
7e7ad1558c
[HUDI-3043] Revert async cleaner leak commit to unblock CI failure ( #4343 )
...
* Revert "[HUDI-2959] Fix the thread leak of cleaning service (#4252)"
Reverting to unblock CI failure for now. Will revisit this with the right fix.
2021-12-16 21:51:28 -05:00
Fugle666
29bc5fd912
[HUDI-2996] Flink streaming reader 'skip_compaction' option does not work ( #4304 )
...
close apache/hudi#4304
2021-12-14 12:21:09 +08:00
Danny Chan
8dd0444ef9
[HUDI-2984] Implement #close for AbstractTableFileSystemView ( #4285 )
2021-12-11 16:19:10 +08:00
Danny Chan
2dcb3f0062
[HUDI-2985] Shade jackson for hudi flink bundle jar ( #4284 )
2021-12-11 14:40:57 +08:00
Danny Chan
9bdcee00c0
[HUDI-2959] Fix the thread leak of cleaning service ( #4252 )
2021-12-11 12:08:47 +08:00
yuzhaojing
3ad9b121f1
[HUDI-2912] Fix CompactionPlanOperator typo ( #4187 )
...
Co-authored-by: yuzhaojing <yuzhaojing@bytedance.com>
2021-12-10 09:32:53 -08:00
Danny Chan
bd08470421
[HUDI-2957] Shade kryo jar for flink bundle jar ( #4251 )
2021-12-09 10:16:42 +08:00
Danny Chan
e8473b9a2b
[HUDI-2951] Disable remote view storage config for flink ( #4237 )
2021-12-07 18:04:15 +08:00
Ron
a8fb69656f
[HUDI-2877] Support flink catalog to help user use flink table conveniently ( #4153 )
...
* [HUDI-2877] Support flink catalog to help user use flink table conveniently
* Fix comment
* fix comment2
2021-12-05 10:14:29 +08:00
Danny Chan
0699521f83
[HUDI-2924] Refresh the fs view on successful checkpoints for write profile ( #4199 )
2021-12-03 16:12:59 +08:00
Danny Chan
f74b3d12aa
[minor] Refactor write profile to always generate fs view ( #4198 )
2021-12-03 11:38:29 +08:00
Danny Chan
934fe54cc5
[HUDI-2914] Fix remote timeline server config for flink ( #4191 )
2021-12-03 08:59:10 +08:00
yuzhao.cyz
a1d0ff4209
Moving to 0.11.0-SNAPSHOT on master branch.
2021-11-27 17:22:10 +08:00