Sivabalan Narayanan
62605be413
[HUDI-3480][HUDI-3481] Enhancements to integ test suite ( #4884 )
2022-02-23 15:56:35 -05:00
leesf
2a93b8efb2
[HUDI-3489] Unify config to avoid duplicate code ( #4883 )
2022-02-23 08:14:30 -05:00
Y Ethan Guo
4e8accc179
[HUDI-3486] Fix wrong field order for constructing HoodieMetadataColumnStats ( #4875 )
2022-02-23 10:27:02 +05:30
yuzhaojing
dabae80423
[HUDI-3420] Remove duplicate types in HoodieClusteringGroup.avsc ( #4808 )
...
Co-authored-by: yuzhaojing <yuzhaojing@bytedance.com >
2022-02-23 10:49:47 +08:00
从大数据到人工智能
01cbddef78
Add hive-standalone-metastore dependency to hudi-flink-bundle module ( #4870 )
2022-02-23 09:16:21 +08:00
Sivabalan Narayanan
9678c3fbcf
[MINOR] Fixing checkpoint management in S3IncrSource ( #4871 )
2022-02-22 09:15:16 -05:00
Danny Chan
b87e95d621
[HUDI-3476] Remove the shade pattern for parquet for flink bundle jar ( #4869 )
2022-02-22 19:21:57 +08:00
Danny Chan
4affdd0c8f
[HUDI-3461] The archived timeline for flink streaming reader should not be reused ( #4861 )
...
* Before the patch, the Flink streaming reader cached the meta client and thus the archived timeline;
fetching instant details from the reused timeline threw an exception
* Add a method in HoodieTableMetaClient to return a fresh archived timeline each time
2022-02-22 15:54:29 +08:00
wangxianghu
4d1f74ebea
[HUDI-3464] Fix wrong exception thrown from HiveSchemaProvider ( #4865 )
2022-02-22 10:20:20 +04:00
Sivabalan Narayanan
14dbbdf4c7
[HUDI-2189] Adding delete partitions support to DeltaStreamer ( #4787 )
2022-02-22 00:01:30 -05:00
Y Ethan Guo
7e1ea06eb9
[MINOR] Fix typos and improve docs in HoodieMetadataConfig ( #4867 )
2022-02-21 19:36:20 -08:00
Prashant Wason
0dee8edc97
[HUDI-2925] Fix duplicate cleaning of same files when unfinished clean operations are present using a config. ( #4212 )
...
Co-authored-by: sivabalan <n.siva.b@gmail.com >
2022-02-21 21:53:03 -05:00
Yann Byron
0c950181aa
[HUDI-3423] upgrade spark to 3.2.1 ( #4815 )
2022-02-21 16:52:21 -08:00
RexAn
801fdab55c
[HUDI-3042] Abstract Spark update strategy to make the code cleaner and remove duplicates ( #4845 )
...
Co-authored-by: Hui An <hui.an@shopee.com >
2022-02-21 06:53:09 -08:00
Pratyaksh Sharma
bf16bc122a
[HUDI-349]: Added new cleaning policy based on number of hours ( #3646 )
2022-02-21 09:04:42 -05:00
Sivabalan Narayanan
d36fe24c9e
[HUDI-3455] Fixing checkpoint management in hoodie incr source ( #4850 )
2022-02-21 08:19:57 -05:00
Sivabalan Narayanan
17cb5cb433
[HUDI-3432] Fixing restore with metadata enabled ( #4849 )
...
* Fixing restore with metadata enabled
* Fixing test failures
2022-02-21 18:25:30 +05:30
leesf
76b6ad6491
[HUDI-2732][RFC-38] Spark Datasource V2 Integration ( #3964 )
2022-02-21 20:14:07 +08:00
YueZhang
359fbfde79
[HUDI-2648] Retry FileSystem action instead of failing directly. ( #3887 )
...
Co-authored-by: yuezhang <yuezhang@freewheel.tv >
2022-02-20 15:31:31 -05:00
Raymond Xu
0938f55a2b
[HUDI-3458] Fix BulkInsertPartitioner generic type ( #4854 )
2022-02-20 13:51:58 -05:00
Sivabalan Narayanan
66ac1446dd
[MINOR] Moving spark scheduling configs out of DataSourceOptions ( #4843 )
2022-02-20 13:49:18 -05:00
Bo Cui
83279971a1
[HUDI-3446] Supports batch reader in BootstrapOperator#loadRecords ( #4837 )
...
* [HUDI-3446] Supports batch Reader in BootstrapOperator#loadRecords
2022-02-19 21:21:48 +08:00
stayrascal
f15125c0cd
[HUDI-3389] fix ColumnarArrayData ClassCastException issue ( #4842 )
...
* [HUDI-3389] fix ColumnarArrayData ClassCastException issue
* [HUDI-3389] remove MapColumnVector.java, RowColumnVector.java, and add test case for array<int> field
2022-02-19 10:56:41 +08:00
RexAn
5009138d04
[HUDI-3438] Avoid getSmallFiles if hoodie.parquet.small.file.limit is 0 ( #4823 )
...
Co-authored-by: Hui An <hui.an@shopee.com >
2022-02-18 08:57:04 -05:00
Y Ethan Guo
fba5822ee3
[HUDI-3430] Fix Deltastreamer to properly shut down the services upon failure ( #4824 )
2022-02-18 08:44:56 -05:00
luokey
de8161ae96
HoodieSortedMergeHandle#close write data disorder ( #4841 )
...
Co-authored-by: 854194341@qq.com <loukey_7821>
2022-02-18 13:31:38 +04:00
Sagar Sumit
ed106f671e
[HUDI-2809] Introduce a checksum mechanism for validating hoodie.properties ( #4712 )
...
Fix dependency conflict
Fix repairs command
Implement putIfAbsent for DDB lock provider
Add upgrade step and validate while fetching configs
Validate checksum for latest table version only while fetching config
Move generateChecksum to BinaryUtil
Rebase and resolve conflict
Fix table version check
2022-02-18 10:17:06 +05:30
Danny Chan
2844a77b43
[HUDI-3439] Remove the hive shade pattern for flink bundle jar ( #4833 )
2022-02-17 22:42:39 +08:00
zhangxiang17
433c2573ef
[HUDI-3442] Duplicate code calls for 'FlinkOptions.flatOptions' ( #4832 )
2022-02-17 11:04:09 +08:00
Sagar Sumit
ba0afe1426
[HUDI-3426] Sync datasource clustering config ( #4828 )
2022-02-16 19:02:49 -05:00
Alexey Kudinkin
aaddaf524a
[HUDI-3280] Cleaning up Hive-related hierarchies after refactoring ( #4743 )
2022-02-16 15:36:37 -08:00
YueZhang
3363c66468
[HUDI-3394] Check isWriteLockedByCurrentThread before unlock for InProcessLockProvider ( #4819 )
...
Co-authored-by: yuezhang <yuezhang@freewheel.tv >
Co-authored-by: Y Ethan Guo <ethan.guoyihua@gmail.com >
2022-02-15 22:41:25 -08:00
Y Ethan Guo
9a05940a74
[HUDI-3366] Remove hardcoded logic of disabling metadata table in tests ( #4792 )
2022-02-15 16:41:47 -05:00
Raymond Xu
538ec44fa8
[HUDI-2931] Add config to disable table services ( #4777 )
2022-02-15 09:49:53 -05:00
Yann Byron
fe02c64fea
fix build & ci ( #4822 )
2022-02-15 03:40:40 -08:00
Yann Byron
cb6ca7f0d1
[HUDI-3204] fix problem that spark on TimestampKeyGenerator has no re… ( #4714 )
2022-02-14 23:38:38 -05:00
Raymond Xu
27bd7b538e
[HUDI-1576] Make archiving an async service ( #4795 )
2022-02-14 21:15:06 -05:00
Yann Byron
3b401d839c
[HUDI-3200] deprecate hoodie.file.index.enable and unify to use BaseFileOnlyViewRelation to handle ( #4798 )
2022-02-14 17:38:01 -08:00
YueZhang
0a97a9893a
[HUDI-3398] Fix TableSchemaResolver for all file formats and metadata table ( #4782 )
...
Co-authored-by: yuezhang <yuezhang@freewheel.tv >
2022-02-14 16:02:47 -08:00
Yuqi Gu
e639d99387
[HUDI-1657] Fix the build on aarch64, Fedora 33 ( #4617 )
2022-02-14 15:10:18 -08:00
Raymond Xu
bcfd8efe66
[MINOR] Prevent async service from starting twice ( #4801 )
2022-02-14 11:06:31 -08:00
leesf
0db1e978c6
[HUDI-3254] Introduce HoodieCatalog to manage tables for Spark Datasource V2 ( #4611 )
2022-02-14 06:26:58 -08:00
yuzhaojing
5ca4480a38
[HUDI-3417] Switch AbstractTableFileSystemView#filterBaseFileAfterPendingCompaction log level to debug ( #4805 )
...
Co-authored-by: yuzhaojing <yuzhaojing@bytedance.com >
2022-02-14 16:18:34 +08:00
董可伦
94806d5cf7
[HUDI-3272] If mode==ignore && tableExists, do not execute write logic and sync hive ( #4632 )
2022-02-14 09:22:00 +05:30
RexAn
93ee09fee8
[HUDI-3412] TypedProperties does not need to create a new set when checking whether a key exists ( #4791 )
...
Co-authored-by: Hui An <hui.an@shopee.com >
2022-02-14 11:33:29 +08:00
YueZhang
76e2faa28d
[HUDI-3370] The files recorded in the commit may not match the actual ones for MOR Compaction ( #4753 )
...
* use HoodieCommitMetadata to replace writeStatuses computation
Co-authored-by: yuezhang <yuezhang@freewheel.tv >
2022-02-14 11:12:52 +08:00
冯健
55777fec05
[HUDI-2413] fix Sql source's checkpoint issue ( #3648 )
...
* [HUDI-2413] fix Sql source's checkpoint
* Fixing sql source checkpoint handling
* Fixing docs
Co-authored-by: jian.feng <fengjian428@gmial.com >
Co-authored-by: sivabalan <n.siva.b@gmail.com >
2022-02-14 08:07:48 +05:30
Y Ethan Guo
6aba00e84f
[MINOR] Fix typos in Spark client related classes ( #4781 )
2022-02-13 06:41:58 -08:00
wangxianghu
ce9762d588
[MINOR] unused import ( #4799 )
2022-02-12 13:11:37 +04:00
zhangxiang17
9518f78610
[HUDI-3413] Fix Jackson parse error on empty messages from JsonKafkaSource when using HoodieDeltaStreamer ( #4794 )
2022-02-12 11:37:29 +04:00