Sivabalan Narayanan
7329d229d5
Adding tests to validate different key generators ( #4473 )
2022-01-04 10:48:04 +05:30
leesf
29ab6fb9ad
[HUDI-3140] Fix bulk_insert failure on Spark 3.2.0 ( #4498 )
2022-01-04 09:59:59 +08:00
harshal
2b2ae34cb9
[HUDI-2558] Fixing Clustering w/ sort columns with null values fails ( #4404 )
2022-01-03 12:19:43 +05:30
Raymond Xu
0273f2e65d
[MINOR] Update README.md ( #4492 )
Update Spark 3 build instructions
2022-01-02 20:34:37 -08:00
YueZhang
1e2d2c437d
[HUDI-3138] Fix broken UT test for TestHiveSyncTool.testDropPartitions ( #4493 )
Co-authored-by: yuezhang <yuezhang@freewheel.tv>
2022-01-02 22:43:30 -05:00
Yann Byron
fe9406dd33
[HUDI-3131] fix ctas error in spark3.1.1 ( #4476 )
2022-01-02 03:06:55 -08:00
Yann Byron
1622b52c9c
[HUDI-3136] Fix merge/insert/show partitions error on Spark3.2 ( #4490 )
2022-01-02 02:42:10 -08:00
leesf
188d0338c4
[HUDI-3134] Fix insert error after adding columns on Spark 3.2.0 ( #4488 )
2022-01-01 17:38:14 -08:00
Aimiyoo
bfa169d808
[HUDI-3040] Fix HoodieSparkBootstrapExample error info for usage ( #4341 )
2021-12-31 23:38:38 -08:00
YueZhang
ef9923fc55
[HUDI-3107] Fix HiveSyncTool drop partitions using JDBC or hivesql or hms ( #4453 )
* constructDropPartitions when drop partitions using jdbc
* done
* done
* code style
* code review
Co-authored-by: yuezhang <yuezhang@freewheel.tv>
2021-12-31 15:56:33 +08:00
Yuwei XIAO
2444f40a4b
[HUDI-3095] abstract partition filter logic to enable code reuse ( #4454 )
* [HUDI-3095] abstract partition filter logic to enable code reuse
* [HUDI-3095] address reviews
2021-12-31 11:07:52 +05:30
yuzhaojing
e88b5fd450
[HUDI-3120] Cache compactionPlan in buffer ( #4463 )
Co-authored-by: yuzhaojing <yuzhaojing@bytedance.com>
2021-12-31 13:12:32 +08:00
Shawy Geng
a4e622ac61
[HUDI-1951] Add bucket hash index, compatible with the hive bucket ( #3173 )
* [HUDI-2154] Add index key field to HoodieKey
* [HUDI-2157] Add the bucket index and its read/write implementation of Spark engine.
* revert HUDI-2154 add index key field to HoodieKey
* fix all comments and introduce a new tricky way to get index key at runtime
support double insert for bucket index
* revert spark read optimizer based on bucket index
* add the storage layout
* index tag, hash function and add ut
* fix ut
* address partial comments
* Code review feedback
* add layout config and docs
* fix ut
* rename hoodie.layout and rebase master
Co-authored-by: Vinoth Chandar <vinoth@apache.org>
2021-12-30 12:38:26 -08:00
yuzhaojing
0f0088fe4b
[HUDI-3124] Bootstrap when timeline have completed instant ( #4467 )
Co-authored-by: yuzhaojing <yuzhaojing@bytedance.com>
2021-12-30 11:54:34 +08:00
董可伦
436becf3ea
[HUDI-2675] Fix the exception 'Not an Avro data file' when archive and clean ( #4016 )
2021-12-29 22:53:17 -05:00
Ron
674c149234
[HUDI-3083] Support component data types for flink bulk_insert ( #4470 )
* [HUDI-3083] Support component data types for flink bulk_insert
* add nested row type test
2021-12-30 11:15:54 +08:00
Sivabalan Narayanan
5c0e4ce005
Revert "[HUDI-3043] Revert async cleaner leak commit to unblock CI failure ( #4343 )" ( #4465 )
This reverts commit 7e7ad1558c.
2021-12-30 10:45:09 +08:00
ForwardXu
504747ecf4
[HUDI-3108] Fix Purge Drop MOR Table Cause error ( #4455 )
2021-12-29 20:23:23 +08:00
xuzifu666
a29b27c7ca
[MINOR] HoodieInstantTimeGenerator improve method used ( #4462 )
2021-12-29 18:43:16 +08:00
Udit Mehrotra
9412281cb1
[HUDI-2983] Remove Log4j2 transitive dependencies ( #4281 )
2021-12-28 07:15:05 -08:00
Sivabalan Narayanan
3d7a8695cd
Fixing dynamoDbLockConfig required prop check ( #4422 )
2021-12-28 15:56:30 +05:30
Yann Byron
05942e018c
[HUDI-2811] Support Spark 3.2 ( #4270 )
2021-12-28 00:12:44 -08:00
ForwardXu
32505d5adb
[HUDI-3106] Fix HiveSyncTool not sync schema ( #4452 )
2021-12-27 22:11:14 -08:00
Yann Byron
1f7afba5e4
[HUDI-3093] fix spark-sql query table that write with TimestampBasedKeyGenerator ( #4416 )
2021-12-27 21:39:52 -08:00
harshal
6409fc733d
[HUDI-2374] Fixing AvroDFSSource does not use the overridden schema to deserialize Avro binaries ( #4353 )
2021-12-27 23:01:21 -05:00
ForwardXu
282aa68552
[HUDI-3099] Purge drop partition for spark sql ( #4436 )
2021-12-28 09:38:26 +08:00
Danny Chan
c81df99e50
[HUDI-3102] Do not store rollback plan in inflight instant ( #4445 )
2021-12-25 18:10:43 +08:00
Danny Chan
7b07aac286
[HUDI-3101] Excluding compaction instants from pending rollback info ( #4443 )
2021-12-25 14:10:45 +08:00
xuzifu666
4721073b43
[MINOR] Remove unused method in HoodieActiveTimeline ( #4435 )
2021-12-24 22:29:34 +08:00
xuzifu666
032b883bd1
[HUDI-3014] Add table option to set utc timezone ( #4306 )
2021-12-23 16:27:45 +08:00
Aimiyoo
57f43de1ea
[MINOR] Fix DedupeSparkJob typo ( #4418 )
2021-12-22 11:51:26 -08:00
ForwardXu
5d93edc539
[HUDI-3060] drop table for spark sql ( #4364 )
2021-12-22 19:17:43 +08:00
Sivabalan Narayanan
1a5f8693aa
[HUDI-3011] Adding ability to read entire data with HoodieIncrSource with empty checkpoint ( #4334 )
* Adding ability to read entire data with HoodieIncrSource with empty checkpoint
* Addressing comments
2021-12-22 15:43:06 +05:30
xiarixiaoyao
b5890cd17d
Merge pull request #4308 from harsh1231/HUDI-3008
[HUDI-3008] Fixing HoodieFileIndex partition column parsing for nested fields
2021-12-22 16:46:57 +08:00
yuzhaojing
15eb7e81fc
[HUDI-2547] Schedule Flink compaction in service ( #4254 )
Co-authored-by: yuzhaojing <yuzhaojing@bytedance.com>
2021-12-22 15:08:47 +08:00
Danny Chan
f1286c2c76
[HUDI-3032] Do not clean the log files right after compaction for metadata table ( #4336 )
2021-12-22 11:10:27 +08:00
Aimiyoo
92f54ce3d8
[HUDI-3027] Update hudi-examples README.md ( #4330 )
2021-12-21 13:36:03 -08:00
harshal patil
7d046f914a
[HUDI-3008] Fixing HoodieFileIndex partition column parsing for nested fields
2021-12-21 11:54:52 +05:30
Raymond Xu
32a44bbe06
[HUDI-2970] Add test for archiving replace commit ( #4345 )
2021-12-21 00:01:59 -05:00
zhangyue19921010
f3f6112b75
[HUDI-3070] Add rerunFailingTestsCount for flaky tests ( #4398 )
Co-authored-by: yuezhang <yuezhang@freewheel.tv>
2021-12-20 19:59:50 -08:00
Sivabalan Narayanan
982ae3d1eb
[MINOR] Increasing CI timeout to 90 mins ( #4407 )
2021-12-20 20:27:22 -05:00
xuzifu666
f166ddad12
[MINOR] Remove unused method in HoodieActiveTimeline ( #4401 )
2021-12-20 22:19:37 +08:00
xuzifu666
3ca92108b2
remove unused import ( #4349 )
2021-12-20 16:32:41 +08:00
Manoj Govindassamy
4a48f99a59
[HUDI-3064][HUDI-3054] FileSystemBasedLockProviderTestClass tryLock fix and TestHoodieClientMultiWriter test fixes ( #4384 )
- Made FileSystemBasedLockProviderTestClass thread safe and fixed the tryLock retry logic.
- Made TestHoodieClientMultiWriter.testHoodieClientBasicMultiWriter deterministic in verifying the HoodieWriteConflictException.
2021-12-19 13:31:02 -05:00
Sivabalan Narayanan
03f71ef1a2
[HUDI-2970] Adding tests for archival of replace commit actions ( #4268 )
2021-12-18 23:59:39 -08:00
Danny Chan
478f9f3695
[minor] fix NetworkUtils#getHostname ( #4355 )
2021-12-19 10:09:48 +08:00
Raymond Xu
bb99836841
[HUDI-3052] Fix flaky testJsonKafkaSourceResetStrategy ( #4381 )
2021-12-18 20:58:51 -05:00
Raymond Xu
f57e28fe39
[MINOR] Azure CI IT tasks clean up ( #4337 )
2021-12-18 17:00:56 -08:00
Sivabalan Narayanan
77abb5ccb9
[HUDI-3054] Fixing default lock configs for FileSystemBasedLock and fixing a flaky test ( #4374 )
2021-12-18 16:15:48 -05:00
Sivabalan Narayanan
dc40397fa9
[HUDI-3064] Fixing a bug in TransactionManager and FileSystemTestLock ( #4372 )
2021-12-18 11:52:11 -05:00