Nicolas Paris
37b15ff458
[HUDI-3147] Add endpoint_url to dynamodb lock provider ( #4500 )
...
Co-authored-by: Nicolas Paris <nicolas.paris@adevinta.com >
2022-01-04 16:42:28 -05:00
Manoj Govindassamy
bf4e3d63e7
[HUDI-3141] Metadata merged log record reader - avoiding NullPointerException when fetching records by keys ( #4505 )
...
- HoodieMetadataMergedLogRecordReader#getRecordsByKeys() and its parent class methods
are not thread safe. When multiple queries come in to get log records
by keys, they all operate on the same log record reader instance provided by
HoodieBackedTableMetadata#openReadersIfNeeded() and they trip over each other
as they clear/put/get the same class member records.
- The fix is to serialize mutation of the class member records by making
HoodieMetadataMergedLogRecordReader#getRecordsByKeys() a synchronized method,
so that concurrent log record readers no longer hit an NPE.
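A minimal sketch of the pattern described above, for illustration only. SketchLogRecordReader, its source map, and its String records are hypothetical stand-ins, not Hudi's actual classes or signatures. It shows why a shared mutable member that is cleared and refilled on every call needs the method to be synchronized.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a log record reader; not the real Hudi class.
class SketchLogRecordReader {
    // Stand-in for the underlying log data (assumption for this sketch).
    private final Map<String, String> source;
    // Shared class member that every lookup clears and refills.
    private final Map<String, String> records = new HashMap<>();

    SketchLogRecordReader(Map<String, String> source) {
        this.source = new HashMap<>(source);
    }

    // synchronized: only one caller at a time may run the clear/put/get
    // sequence, so a concurrent clear() cannot wipe another caller's records.
    public synchronized List<String> getRecordsByKeys(List<String> keys) {
        records.clear();
        for (String key : keys) {
            String value = source.get(key);
            if (value != null) {
                records.put(key, value);
            }
        }
        List<String> result = new ArrayList<>();
        for (String key : keys) {
            String value = records.get(key);
            if (value != null) {
                result.add(value);
            }
        }
        return result;
    }
}
```

Without the synchronized keyword, two threads sharing one reader instance could interleave clear() and get() on the same map, so one caller may read entries that another caller has just removed.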
2022-01-04 16:41:33 -05:00
Sagar Sumit
aaf5727495
[HUDI-2774] Handle duplicate instants when fetching pending clustering plans ( #4118 )
2022-01-04 16:32:05 -05:00
Sivabalan Narayanan
7329d229d5
Adding tests to validate different key generators ( #4473 )
2022-01-04 10:48:04 +05:30
leesf
29ab6fb9ad
[HUDI-3140] Fix bulk_insert failure on Spark 3.2.0 ( #4498 )
2022-01-04 09:59:59 +08:00
harshal
2b2ae34cb9
[HUDI-2558] Fixing Clustering w/ sort columns with null values fails ( #4404 )
2022-01-03 12:19:43 +05:30
Raymond Xu
0273f2e65d
[MINOR] Update README.md ( #4492 )
...
Update Spark 3 build instructions
2022-01-02 20:34:37 -08:00
YueZhang
1e2d2c437d
[HUDI-3138] Fix broken UT test for TestHiveSyncTool.testDropPartitions ( #4493 )
...
Co-authored-by: yuezhang <yuezhang@freewheel.tv >
2022-01-02 22:43:30 -05:00
Yann Byron
fe9406dd33
[HUDI-3131] fix ctas error in spark3.1.1 ( #4476 )
2022-01-02 03:06:55 -08:00
Yann Byron
1622b52c9c
[HUDI-3136] Fix merge/insert/show partitions error on Spark3.2 ( #4490 )
2022-01-02 02:42:10 -08:00
leesf
188d0338c4
[HUDI-3134] Fix insert error after adding columns on Spark 3.2.0 ( #4488 )
2022-01-01 17:38:14 -08:00
Aimiyoo
bfa169d808
[HUDI-3040] Fix HoodieSparkBootstrapExample error info for usage ( #4341 )
2021-12-31 23:38:38 -08:00
YueZhang
ef9923fc55
[HUDI-3107] Fix HiveSyncTool drop partitions using JDBC or hivesql or hms ( #4453 )
...
* constructDropPartitions when drop partitions using jdbc
* done
* done
* code style
* code review
Co-authored-by: yuezhang <yuezhang@freewheel.tv >
2021-12-31 15:56:33 +08:00
Yuwei XIAO
2444f40a4b
[HUDI-3095] abstract partition filter logic to enable code reuse ( #4454 )
...
* [HUDI-3095] abstract partition filter logic to enable code reuse
* [HUDI-3095] address reviews
2021-12-31 11:07:52 +05:30
yuzhaojing
e88b5fd450
[HUDI-3120] Cache compactionPlan in buffer ( #4463 )
...
Co-authored-by: yuzhaojing <yuzhaojing@bytedance.com >
2021-12-31 13:12:32 +08:00
Shawy Geng
a4e622ac61
[HUDI-1951] Add bucket hash index, compatible with the hive bucket ( #3173 )
...
* [HUDI-2154] Add index key field to HoodieKey
* [HUDI-2157] Add the bucket index and its read/write implementation of Spark engine.
* revert HUDI-2154 add index key field to HoodieKey
* fix all comments and introduce a new tricky way to get index key at runtime
support double insert for bucket index
* revert spark read optimizer based on bucket index
* add the storage layout
* index tag, hash function and add ut
* fix ut
* address partial comments
* Code review feedback
* add layout config and docs
* fix ut
* rename hoodie.layout and rebase master
Co-authored-by: Vinoth Chandar <vinoth@apache.org >
2021-12-30 12:38:26 -08:00
yuzhaojing
0f0088fe4b
[HUDI-3124] Bootstrap when timeline have completed instant ( #4467 )
...
Co-authored-by: yuzhaojing <yuzhaojing@bytedance.com >
2021-12-30 11:54:34 +08:00
董可伦
436becf3ea
[HUDI-2675] Fix the exception 'Not an Avro data file' when archive and clean ( #4016 )
2021-12-29 22:53:17 -05:00
Ron
674c149234
[HUDI-3083] Support component data types for flink bulk_insert ( #4470 )
...
* [HUDI-3083] Support component data types for flink bulk_insert
* add nested row type test
2021-12-30 11:15:54 +08:00
Sivabalan Narayanan
5c0e4ce005
Revert "[HUDI-3043] Revert async cleaner leak commit to unblock CI failure ( #4343 )" ( #4465 )
...
This reverts commit 7e7ad1558c.
2021-12-30 10:45:09 +08:00
ForwardXu
504747ecf4
[HUDI-3108] Fix Purge Drop MOR Table Cause error ( #4455 )
2021-12-29 20:23:23 +08:00
xuzifu666
a29b27c7ca
[MINOR] HoodieInstantTimeGenerator improve method used ( #4462 )
2021-12-29 18:43:16 +08:00
Udit Mehrotra
9412281cb1
[HUDI-2983] Remove Log4j2 transitive dependencies ( #4281 )
2021-12-28 07:15:05 -08:00
Sivabalan Narayanan
3d7a8695cd
Fixing dynamoDbLockConfig required prop check ( #4422 )
2021-12-28 15:56:30 +05:30
Yann Byron
05942e018c
[HUDI-2811] Support Spark 3.2 ( #4270 )
2021-12-28 00:12:44 -08:00
ForwardXu
32505d5adb
[HUDI-3106] Fix HiveSyncTool not sync schema ( #4452 )
2021-12-27 22:11:14 -08:00
Yann Byron
1f7afba5e4
[HUDI-3093] fix spark-sql query table that write with TimestampBasedKeyGenerator ( #4416 )
2021-12-27 21:39:52 -08:00
harshal
6409fc733d
[HUDI-2374] Fixing AvroDFSSource does not use the overridden schema to deserialize Avro binaries ( #4353 )
2021-12-27 23:01:21 -05:00
ForwardXu
282aa68552
[HUDI-3099] Purge drop partition for spark sql ( #4436 )
2021-12-28 09:38:26 +08:00
Danny Chan
c81df99e50
[HUDI-3102] Do not store rollback plan in inflight instant ( #4445 )
2021-12-25 18:10:43 +08:00
Danny Chan
7b07aac286
[HUDI-3101] Excluding compaction instants from pending rollback info ( #4443 )
2021-12-25 14:10:45 +08:00
xuzifu666
4721073b43
[MINOR] Remove unused method in HoodieActiveTimeline ( #4435 )
2021-12-24 22:29:34 +08:00
xuzifu666
032b883bd1
[HUDI-3014] Add table option to set utc timezone ( #4306 )
2021-12-23 16:27:45 +08:00
Aimiyoo
57f43de1ea
[MINOR] Fix DedupeSparkJob typo ( #4418 )
2021-12-22 11:51:26 -08:00
ForwardXu
5d93edc539
[HUDI-3060] drop table for spark sql ( #4364 )
2021-12-22 19:17:43 +08:00
Sivabalan Narayanan
1a5f8693aa
[HUDI-3011] Adding ability to read entire data with HoodieIncrSource with empty checkpoint ( #4334 )
...
* Adding ability to read entire data with HoodieIncrSource with empty checkpoint
* Addressing comments
2021-12-22 15:43:06 +05:30
xiarixiaoyao
b5890cd17d
Merge pull request #4308 from harsh1231/HUDI-3008
...
[HUDI-3008] Fixing HoodieFileIndex partition column parsing for nested fields
2021-12-22 16:46:57 +08:00
yuzhaojing
15eb7e81fc
[HUDI-2547] Schedule Flink compaction in service ( #4254 )
...
Co-authored-by: yuzhaojing <yuzhaojing@bytedance.com >
2021-12-22 15:08:47 +08:00
Danny Chan
f1286c2c76
[HUDI-3032] Do not clean the log files right after compaction for metadata table ( #4336 )
2021-12-22 11:10:27 +08:00
Aimiyoo
92f54ce3d8
[HUDI-3027] Update hudi-examples README.md ( #4330 )
2021-12-21 13:36:03 -08:00
harshal patil
7d046f914a
[HUDI-3008] Fixing HoodieFileIndex partition column parsing for nested fields
2021-12-21 11:54:52 +05:30
Raymond Xu
32a44bbe06
[HUDI-2970] Add test for archiving replace commit ( #4345 )
2021-12-21 00:01:59 -05:00
zhangyue19921010
f3f6112b75
[HUDI-3070] Add rerunFailingTestsCount for flaky tests ( #4398 )
...
Co-authored-by: yuezhang <yuezhang@freewheel.tv >
2021-12-20 19:59:50 -08:00
Sivabalan Narayanan
982ae3d1eb
[MINOR] Increasing CI timeout to 90 mins ( #4407 )
2021-12-20 20:27:22 -05:00
xuzifu666
f166ddad12
[MINOR] Remove unused method in HoodieActiveTimeline ( #4401 )
2021-12-20 22:19:37 +08:00
xuzifu666
3ca92108b2
remove unused import ( #4349 )
2021-12-20 16:32:41 +08:00
Manoj Govindassamy
4a48f99a59
[HUDI-3064][HUDI-3054] FileSystemBasedLockProviderTestClass tryLock fix and TestHoodieClientMultiWriter test fixes ( #4384 )
...
- Made FileSystemBasedLockProviderTestClass thread safe and fixed the
tryLock retry logic.
- Made TestHoodieClientMultiWriter.testHoodieClientBasicMultiWriter
deterministic in verifying the HoodieWriteConflictException.
2021-12-19 13:31:02 -05:00
Sivabalan Narayanan
03f71ef1a2
[HUDI-2970] Adding tests for archival of replace commit actions ( #4268 )
2021-12-18 23:59:39 -08:00
Danny Chan
478f9f3695
[minor] fix NetworkUtils#getHostname ( #4355 )
2021-12-19 10:09:48 +08:00
Raymond Xu
bb99836841
[HUDI-3052] Fix flaky testJsonKafkaSourceResetStrategy ( #4381 )
2021-12-18 20:58:51 -05:00