Danny Chan
22c45a7704
[HUDI-4188] Fix flaky ITTestDataStreamWrite.testWriteCopyOnWrite ( #5749 )
2022-06-06 12:12:48 +08:00
marchpure
73b0be3c96
[HUDI-4192] HoodieHFileReader throws NullPointerException when scanning top cells after bottom cells ( #5755 )
...
SeekTo top cells to avoid NullPointerException
2022-06-06 12:07:26 +08:00
Y Ethan Guo
5d18b80343
[HUDI-4190] Include hbase-protocol for shading in the bundles ( #5750 )
2022-06-05 17:42:16 -07:00
Saisai Shao
bd26d633d7
[HUDI-4168] Add Call Procedure for marker deletion ( #5738 )
...
* Add Call Procedure for marker deletion
2022-06-05 11:05:38 +08:00
Nicolas Paris
80783c27f5
[HUDI-4187] Fix partition order in aws glue sync ( #5731 )
2022-06-04 02:16:52 -07:00
leesf
3759a38b99
[HUDI-4183] Fix using HoodieCatalog to create non-hudi tables ( #5743 )
2022-06-03 17:16:48 +08:00
KnightChess
51602a34f7
[HUDI-4179] Cluster with sort columns invalid ( #5739 )
2022-06-02 20:28:21 +08:00
Danny Chan
7f8630cc57
[HUDI-4167] Remove the timeline refresh with initializing hoodie table ( #5716 )
...
The timeline refresh on table initialization invokes the fs view #sync, which currently does two things:
1. reloads the timeline of the fs view, so that the next fs view request is based on this timeline metadata
2. if this is a local fs view, clears all the local state; if this is a remote fs view, sends a request to sync the remote fs view
However, during construction the meta client is freshly instantiated, so the timeline is already the latest;
the table is also freshly constructed, so the fs view has no local state. The #sync is therefore entirely unnecessary.
In this patch, the lifecycles of the metadata and the data set fs view are kept in sync: when the fs view is refreshed, the underlying metadata
is refreshed synchronously as well. The freshness of the metadata follows the same rules as the data fs view:
1. if the fs view is local, visibility is based on the latest commit of the client's table metadata client
2. if the fs view is remote, the timeline server #syncs the fs view and metadata together based on the lagging server-local timeline
From the client's perspective, there is no need to care about the refresh action anymore, whether or not the metadata table is enabled.
That makes the client logic clearer and less error-prone.
Removing the timeline refresh has another benefit: it avoids unnecessary #refresh of the remote fs view. If all the clients send requests to #sync the
remote fs view, the server would encounter conflicts and the clients would receive response errors.
2022-06-02 09:48:48 +08:00
Qi Ji
7276d0eaa6
[HUDI-3670] free temp views in sql transformers ( #5080 )
2022-06-01 07:35:40 -07:00
Sagar Sumit
dfcd6d9a86
[HUDI-4011] Add hudi-aws-bundle ( #5674 )
...
Co-authored-by: Raymond Xu <2701446+xushiyan@users.noreply.github.com >
2022-06-01 05:30:29 -07:00
Danny Chan
0d069b5e57
[HUDI-4174] Add hive conf dir option for flink sink ( #5725 )
2022-06-01 16:17:36 +08:00
Kumud Kumar Srivatsava Tirupati
795a99ba73
[HUDI-4107] Added --sync-tool-classes config option in HoodieMultiTableDeltaStreamer ( #5597 )
...
* added --sync-tool-classes config option in multitable delta streamer
* added a test case to assert that syncClientToolClassNames is picked up by the deltastreamer execution context
2022-05-31 20:27:50 +05:30
Jin Xing
918c4f4e0b
[HUDI-4149] Drop-Table fails when underlying table directory is broken ( #5672 )
2022-05-30 19:09:26 +08:00
Danny Chan
329da34ee0
[HUDI-4163] Catch general exception instead of IOException while fetching rollback plan during rollback ( #5703 )
...
If the avro file is corrupted, an InvalidAvroMagicException is thrown.
2022-05-30 13:08:02 +08:00
苏承祥
7e86884604
[HUDI-4086] Use CustomizedThreadFactory in async compaction and clustering ( #5563 )
...
Co-authored-by: 苏承祥 <sucx@tuya.com >
2022-05-28 22:35:47 -07:00
Raymond Xu
0a72458291
[HUDI-3551] Fix testStorageSchemes for oci storage ( #5711 )
2022-05-28 12:13:37 -07:00
Carter Shanklin
62d792368b
[HUDI-3551] Add the Oracle Cloud Infrastructure (oci) Object Storage URI scheme ( #4952 )
2022-05-28 08:26:14 -07:00
uday08bce
48062a5708
[HUDI-4166] Added SimpleClient plugin for integ test ( #5710 )
2022-05-28 08:20:52 -07:00
ForwardXu
8fa8f26031
[MINOR] Fix Hive and meta sync config for sql statement ( #5316 )
2022-05-28 07:56:39 -07:00
wangxianghu
58014c147a
[HUDI-4160] Make database regex of MaxwellJsonKafkaSourcePostProcessor optional ( #5697 )
2022-05-28 11:13:24 +04:00
Bo Cui
93fe5a497e
[HUDI-4151] flink split_reader supports rocksdb ( #5675 )
...
* [HUDI-4151] flink split_reader supports rocksdb
2022-05-28 08:37:34 +08:00
RexAn
554caa3421
[MINOR] Fix the issue when handling conf hoodie.datasource.write.operation=bulk_insert in sql mode ( #5679 )
...
Co-authored-by: Rex An <bonean131@gmail.com >
2022-05-27 04:45:09 -07:00
Alexey Kudinkin
1767ff5e7c
[HUDI-4161] Make sure partition values are taken from partition path ( #5699 )
2022-05-27 02:36:30 -07:00
watermelon12138
57dbe57bed
[HUDI-4162] Fixed some constant mapping issues. ( #5700 )
...
Co-authored-by: y00617041 <yangxuan42@huawei.com >
2022-05-27 14:08:54 +08:00
YueZhang
85962ee55d
[HUDI-3963][RFC-53] Use Lock-Free Message Queue Disruptor to Improve Hoodie Writing Efficiency ( #5567 )
...
Co-authored-by: yuezhang <yuezhang@freewheel.tv >
2022-05-26 23:03:09 +08:00
komao
8d2f009048
[HUDI-4124] Add valid check in Spark Datasource configs ( #5637 )
...
Co-authored-by: wangzixuan.wzxuan <wangzixuan.wzxuan@bytedance.com >
2022-05-26 05:21:28 -07:00
Sagar Sumit
31e13db1f0
[HUDI-4023] Decouple hudi-spark from hudi-utilities-slim-bundle ( #5641 )
2022-05-26 11:28:49 +05:30
RexAn
98c5c6c654
[HUDI-4040] Bulk insert Support CustomColumnsSortPartitioner with Row ( #5502 )
...
* Along the lines of RDDCustomColumnsSortPartitioner but for Row
2022-05-26 10:39:04 +05:30
Danny Chan
4e42ed5eae
[HUDI-4145] Archives the metadata file in HoodieInstant.State sequence (part2) ( #5676 )
2022-05-26 11:21:39 +08:00
Sagar Sumit
cf837b4900
[HUDI-3193] Decouple hudi-aws from hudi-client-common ( #5666 )
...
Move HoodieMetricsCloudWatchConfig to hudi-client-common
2022-05-25 19:38:56 +05:30
冯健
a6bc9e8e81
[HUDI-4146] Claim RFC-55 for Improve Hive/Meta sync class design and hierarchies ( #5682 )
2022-05-25 05:31:39 -07:00
luoyajun
f30b3aef3e
[MINOR] Fix a potential NPE and some finer points of hudi cli ( #5656 )
2022-05-24 11:13:18 -07:00
Zhaojing Yu
18635b533e
Merge pull request #3599 from yuzhaojing/HUDI-2207
...
[HUDI-2207] Support independent flink hudi clustering function
2022-05-25 00:47:28 +08:00
Sivabalan Narayanan
10363c1412
[HUDI-4132] Fixing determining target table schema for delta sync with empty batch ( #5648 )
2022-05-24 08:17:15 -04:00
喻兆靖
c20db99a7b
[HUDI-2207] Support independent flink hudi clustering function
2022-05-24 20:16:48 +08:00
liujinhui
0caa55ecb4
[HUDI-4135] remove netty and netty-all ( #5663 )
2022-05-24 03:56:28 -07:00
Danny Chan
eb219010d2
[HUDI-4145] Archives the metadata file in HoodieInstant.State sequence ( #5669 )
2022-05-24 17:33:30 +08:00
Sivabalan Narayanan
c05ebf2417
[HUDI-2473] Fixing compaction write operation in commit metadata ( #5203 )
2022-05-24 13:03:21 +05:30
Danny Chan
676d5cefe0
[HUDI-4138] Fix the concurrency modification of hoodie table config for flink ( #5660 )
...
* Remove the metadata cleaning strategy for flink; this means the multi-modal index may be affected
* Improve the HoodieTable#clearMetadataTablePartitionsConfig to only update table config when necessary
* Remove the modification of read code path in HoodieTableConfig
2022-05-24 13:07:55 +08:00
Sivabalan Narayanan
af1128acf9
[HUDI-4084] Add support to test async table services with integ test suite framework ( #5557 )
...
* Add support to test async table services with integ test suite framework
* Make await time for validation configurable
2022-05-24 08:35:56 +05:30
Heap
47b764ec33
[HUDI-4134] Fix Method naming consistency issues in FSUtils ( #5655 )
2022-05-23 15:28:48 -07:00
felixYyu
716e995a38
[MINOR] Removing redundant semicolons and line breaks ( #5662 )
2022-05-23 15:26:36 -07:00
Y Ethan Guo
752f956f03
[HUDI-3933] Add UT cases to cover different key gen ( #5638 )
2022-05-23 06:48:09 -07:00
Sagar Sumit
42c7129e25
[HUDI-4142] Claim RFC-54 for new table APIs ( #5665 )
2022-05-23 18:10:07 +05:30
YuangZhang
3ef137d156
[HUDI-4129] Initializes a new fs view for WriteProfile#reload ( #5640 )
...
Co-authored-by: zhangyuang <zhangyuang@corp.netease.com >
2022-05-23 09:57:34 +08:00
Raymond Xu
271d1a79c0
[HUDI-4051] Allow nested field as primary key and preCombineField in spark sql ( #5517 )
...
* [HUDI-4051] Allow nested field as preCombineField in spark sql
* relax validation for primary key
2022-05-22 00:47:51 -07:00
uday08bce
32a5d268f5
[HUDI-3890] fix rat plugin issue with sql files ( #5644 )
2022-05-21 12:22:55 -04:00
Jin Xing
922f765ead
[HUDI-4100] CTAS failed to clean up when given an illegal MANAGED table definition ( #5588 )
2022-05-21 22:41:18 +08:00
YueZhang
8ec625d4d5
[HUDI-3858] Shade javax.servlet for Spark bundle jar ( #5295 )
...
Co-authored-by: yuezhang <yuezhang@freewheel.tv >
2022-05-21 21:16:14 +08:00
Raymond Xu
b5adba3e55
[MINOR] remove unused gson test dependency ( #5652 )
2022-05-21 05:34:08 -07:00