
2897 Commits

Author SHA1 Message Date
YueZhang
85962ee55d [HUDI-3963][RFC-53] Use Lock-Free Message Queue Disruptor Improving Hoodie Writing Efficiency (#5567)
Co-authored-by: yuezhang <yuezhang@freewheel.tv>
2022-05-26 23:03:09 +08:00
komao
8d2f009048 [HUDI-4124] Add valid check in Spark Datasource configs (#5637)
Co-authored-by: wangzixuan.wzxuan <wangzixuan.wzxuan@bytedance.com>
2022-05-26 05:21:28 -07:00
Sagar Sumit
31e13db1f0 [HUDI-4023] Decouple hudi-spark from hudi-utilities-slim-bundle (#5641) 2022-05-26 11:28:49 +05:30
RexAn
98c5c6c654 [HUDI-4040] Bulk insert Support CustomColumnsSortPartitioner with Row (#5502)
* Along the lines of RDDCustomColumnsSortPartitioner but for Row
2022-05-26 10:39:04 +05:30
Danny Chan
4e42ed5eae [HUDI-4145] Archives the metadata file in HoodieInstant.State sequence (part2) (#5676) 2022-05-26 11:21:39 +08:00
Sagar Sumit
cf837b4900 [HUDI-3193] Decouple hudi-aws from hudi-client-common (#5666)
Move HoodieMetricsCloudWatchConfig to hudi-client-common
2022-05-25 19:38:56 +05:30
冯健
a6bc9e8e81 [HUDI-4146] Claim RFC-55 for Improve Hive/Meta sync class design and hierachies (#5682) 2022-05-25 05:31:39 -07:00
luoyajun
f30b3aef3e [MINOR] Fix a potential NPE and some finer points of hudi cli (#5656) 2022-05-24 11:13:18 -07:00
Zhaojing Yu
18635b533e Merge pull request #3599 from yuzhaojing/HUDI-2207
[HUDI-2207] Support independent flink hudi clustering function
2022-05-25 00:47:28 +08:00
Sivabalan Narayanan
10363c1412 [HUDI-4132] Fixing determining target table schema for delta sync with empty batch (#5648) 2022-05-24 08:17:15 -04:00
喻兆靖
c20db99a7b [HUDI-2207] Support independent flink hudi clustering function 2022-05-24 20:16:48 +08:00
liujinhui
0caa55ecb4 [HUDI-4135] remove netty and netty-all (#5663) 2022-05-24 03:56:28 -07:00
Danny Chan
eb219010d2 [HUDI-4145] Archives the metadata file in HoodieInstant.State sequence (#5669) 2022-05-24 17:33:30 +08:00
Sivabalan Narayanan
c05ebf2417 [HUDI-2473] Fixing compaction write operation in commit metadata (#5203) 2022-05-24 13:03:21 +05:30
Danny Chan
676d5cefe0 [HUDI-4138] Fix the concurrency modification of hoodie table config for flink (#5660)
* Remove the metadata cleaning strategy for flink, that means the multi-modal index may be affected
* Improve the HoodieTable#clearMetadataTablePartitionsConfig to only update table config when necessary
* Remove the modification of read code path in HoodieTableConfig
2022-05-24 13:07:55 +08:00
Sivabalan Narayanan
af1128acf9 [HUDI-4084] Add support to test async table services with integ test suite framework (#5557)
* Add support to test async table services with integ test suite framework

* Make await time for validation configurable
2022-05-24 08:35:56 +05:30
Heap
47b764ec33 [HUDI-4134] Fix Method naming consistency issues in FSUtils (#5655) 2022-05-23 15:28:48 -07:00
felixYyu
716e995a38 [MINOR] Removing redundant semicolons and line breaks (#5662) 2022-05-23 15:26:36 -07:00
Y Ethan Guo
752f956f03 [HUDI-3933] Add UT cases to cover different key gen (#5638) 2022-05-23 06:48:09 -07:00
Sagar Sumit
42c7129e25 [HUDI-4142] Claim RFC-54 for new table APIs (#5665) 2022-05-23 18:10:07 +05:30
YuangZhang
3ef137d156 [HUDI-4129] Initializes a new fs view for WriteProfile#reload (#5640)
Co-authored-by: zhangyuang <zhangyuang@corp.netease.com>
2022-05-23 09:57:34 +08:00
Raymond Xu
271d1a79c0 [HUDI-4051] Allow nested field as primary key and preCombineField in spark sql (#5517)
* [HUDI-4051] Allow nested field as preCombineField in spark sql

* relax validation for primary key
2022-05-22 00:47:51 -07:00
uday08bce
32a5d268f5 [HUDI-3890] fix rat plugin issue with sql files (#5644) 2022-05-21 12:22:55 -04:00
Jin Xing
922f765ead [HUDI-4100] CTAS failed to clean up when given an illegal MANAGED table definition (#5588) 2022-05-21 22:41:18 +08:00
YueZhang
8ec625d4d5 [HUDI-3858] Shade javax.servlet for Spark bundle jar (#5295)
Co-authored-by: yuezhang <yuezhang@freewheel.tv>
2022-05-21 21:16:14 +08:00
Raymond Xu
b5adba3e55 [MINOR] remove unused gson test dependency (#5652) 2022-05-21 05:34:08 -07:00
wangxianghu
2af98303d3 [HUDI-4122] Fix NPE caused by adding kafka nodes (#5632) 2022-05-21 11:12:53 +08:00
Sivabalan Narayanan
7d02b1fd3c [MINOR] Minor fixes to exception log and removing unwanted metrics flush in integ test (#5646) 2022-05-21 07:27:35 +08:00
huberylee
85b146d3d5 [HUDI-3985] Refactor DLASyncTool to support read hoodie table as spark datasource table (#5532) 2022-05-20 22:25:32 +08:00
Danny Chan
c7576f7613 [HUDI-4130] Remove the upgrade/downgrade for flink #initTable (#5642) 2022-05-20 21:31:23 +08:00
aliceyyan
1da0b21edd [HUDI-4119] the first read result is incorrect when Flink upsert- Kafka connector is used in HUDi (#5626)
* HUDI-4119 the first read result is incorrect when Flink upsert- Kafka connector is used in HUDi

Co-authored-by: aliceyyan <aliceyyan@tencent.com>
2022-05-20 18:10:24 +08:00
Danny Chan
6f37863ba8 [HUDI-4114] Remove the unnecessary fs view sync for BaseWriteClient#initTable (#5617)
No need to #sync actively because the table instance is instantiated freshly,
its view manager has empty view instances, the fs view would be synced lazily when
it is requested.
2022-05-19 10:59:05 +08:00
huberylee
6573469e73 [HUDI-4116] Unify clustering/compaction related procedures' output type (#5620)
* Unify clustering/compaction related procedures' output type

* Address review comments
2022-05-19 09:48:03 +08:00
Danny Chan
551aa959c5 Revert "[HUDI-3870] Add timeout rollback for flink online compaction (#5314)" (#5622)
This reverts commit 6f9b02decb.
2022-05-18 20:30:54 +08:00
cxzl25
199f64255e [HUDI-4111] Bump ANTLR runtime version in Spark 3.x (#5606) 2022-05-18 19:18:52 +08:00
Zhaojing Yu
008616c4f6 [HUDI-3942] [RFC-50] Improve Timeline Server (#5392) 2022-05-18 18:43:48 +08:00
luokey
a1017c66aa Clean the marker files for flink compaction (#5611)
Co-authored-by: 854194341@qq.com <loukey_7821>
2022-05-18 11:21:14 +08:00
Danny Chan
f1f8a1abb7 [HUDI-4109] Copy the old record directly when it is chosen for merging (#5603) 2022-05-18 10:17:00 +08:00
Danny Chan
ebbe56e862 [minor] Some code refactoring for LogFileComparator and Instant instantiation (#5600) 2022-05-18 09:30:09 +08:00
Sivabalan Narayanan
f8b9399615 [MINOR] Fixing spark long running yaml for non-partitioned (#5607) 2022-05-17 09:58:18 -04:00
BruceLin
99555c897a [HUDI-4110] Clean the marker files for flink compaction (#5604) 2022-05-17 21:09:27 +08:00
Jin Xing
d422f69a0d [HUDI-4087] Support dropping RO and RT table in DropHoodieTableCommand (#5564)
* [HUDI-4087] Support dropping RO and RT table in DropHoodieTableCommand

* Set hoodie.query.as.ro.table in serde properties
2022-05-17 14:12:50 +08:00
Danny Chan
d52d13302d [HUDI-4101] BucketIndexPartitioner should take partition path for better dispersion (#5590) 2022-05-17 10:34:57 +08:00
Danny Chan
fdd96cc97e [HUDI-4104] DeltaWriteProfile includes the pending compaction file slice when deciding small buckets (#5594) 2022-05-17 10:34:15 +08:00
Shawy Geng
ad773b3d96 [HUDI-3654] Preparations for hudi metastore. (#5572)
* [HUDI-3654] Preparations for hudi metastore.

Co-authored-by: gengxiaoyu <gengxiaoyu@bytedance.com>
2022-05-17 09:47:10 +08:00
董可伦
a7a42e4490 [HUDI-4103] [HUDI-4001] Filter the properties should not be used when create table for Spark SQL 2022-05-16 23:26:23 +08:00
Danny Chan
43e08193ef [HUDI-4098] Metadata table heartbeat for instant has expired, last heartbeat 0 (#5583) 2022-05-16 17:40:08 +08:00
Yuwei XIAO
61030d8e7a [HUDI-3123] consistent hashing index: basic write path (upsert/insert) (#4480)
1. basic write path(insert/upsert) implementation
2. adapt simple bucket index
2022-05-16 11:07:01 +08:00
陈浩
1fded18dff fix hive sync no partition table error (#5585) 2022-05-16 09:51:24 +08:00
董可伦
75f847691f [HUDI-4001] Filter the properties should not be used when create table for Spark SQL (#5495) 2022-05-16 09:50:29 +08:00