Commit Graph

2977 Commits

Author SHA1 Message Date
jiz
af9f09047d [HUDI-3509] Add call procedure for HoodieLogFileCommand (#5949)
Co-authored-by: zhanshaoxiong <jiimmyzhan@tencent.com>
2022-06-24 10:16:54 +08:00
Sagar Sumit
eeb78f23e6 [HUDI-4290] Fix fetchLatestBaseFiles to filter replaced filegroups (#5941)
* [HUDI-4290] Fix fetchLatestBaseFiles to filter replaced filegroups

* Separate out incremental sync fsview test with clustering
2022-06-23 19:40:08 +05:30
Forus
38ff18a199 [HUDI-4299] Fix problem about hudi-example-java run failed on idea. (#5936) 2022-06-23 21:46:22 +08:00
jiz
1bb017d396 [HUDI-3508] Add call procedure for FileSystemViewCommand (#5929)
* [HUDI-3508] Add call procedure for FileSystemView

* minor

Co-authored-by: jiimmyzhan <jiimmyzhan@tencent.com>
2022-06-22 17:50:20 +08:00
Danny Chan
1dbd9d407a [minor] following 4270, add unit tests for the keys lost case (#5918) 2022-06-22 16:56:06 +08:00
LinMingQiang
c9590790f8 [HUDI-4279] Strengthen the remote fs view lagging check when latest commit refresh is enabled (#5917)
Signed-off-by: LinMingQiang <1356469429@qq.com>
2022-06-22 10:32:21 +08:00
Zhaojing Yu
c7e430bb46 Revert master (#5925)
* Revert "udate"

This reverts commit 092e35c1e3.

* Revert "[HUDI-3475] Initialize hudi table management module."

This reverts commit 4640a3bbb8.
2022-06-21 16:58:50 +08:00
喻兆靖
092e35c1e3 udate 2022-06-21 15:22:04 +08:00
喻兆靖
4640a3bbb8 [HUDI-3475] Initialize hudi table management module. 2022-06-21 15:21:30 +08:00
Bo Cui
7c4aaa9715 [HUDI-4270] Bootstrap op data loading missing (#5888) 2022-06-21 11:47:39 +08:00
Shawn Chang
5c204f1416 [HUDI-4177] Fix hudi-cli rollback with rollbackUsingMarkers method call (#5734)
* Fix hudi-cli rollback with rollbackUsingMarkers method call
* Add test for hudi-cli rollbackUsingMarkers

Co-authored-by: Shawn Chang <yxchang@amazon.com>
2022-06-21 10:54:12 +08:00
Forus
ba4d5bd847 [HUDI-4251] Fix the problem that the command 'commits sync' description does not match. (#5881) 2022-06-20 16:03:58 -07:00
RexAn
17ac5a4573 [HUDI-4173] Fix wrong results when the user reads a hudi table with no base files by glob paths (#5723) 2022-06-20 23:02:34 +05:30
Y Ethan Guo
7601e9e4c7 [MINOR] Update DOAP with 0.11.1 Release (#5908) 2022-06-20 09:27:35 -07:00
Alexander Trushev
f1103281d2 [HUDI-4258] Fix when HoodieTable removes data file before the end of Flink job (#5876)
* [HUDI-4258] Fix when HoodieTable removes data file before the end of Flink job
2022-06-20 17:07:49 +08:00
luokey
7c6bedff25 [HUDI-4259] Flink create avro schema not conformance to standards (#5878)
* flink create avro schema not conformance to standards

Co-authored-by: 854194341@qq.com <loukey_7821>
2022-06-20 15:41:23 +08:00
felixYyu
d7facb8cb8 fix remove redundant Variable (#5806) 2022-06-20 15:21:49 +08:00
Shizhi Chen
7481eacf23 [HUDI-4277] support flink table source with computed column (#5897)
Co-authored-by: chenshizhi <chenshizhi@bilibili.com>
2022-06-20 15:19:32 +08:00
5herhom
efafb79eeb [MINOR] Add "spillable_map_path" in FlinkCompactionConfig to avoid filling up the disk space of "/tmp" when compacting offline. (#5905) 2022-06-20 15:15:23 +08:00
huberylee
d4f0326b4b [HUDI-4275] Refactor rollback inflight instant for clustering/compaction to reuse some code (#5894) 2022-06-20 14:29:21 +08:00
ForwardXu
c5c4cfec91 [HUDI-3507] Support export command based on Call Procedure Command (#5901) 2022-06-19 18:48:22 +08:00
huberylee
fec49dc12b [HUDI-4165] Support Create/Drop/Show/Refresh Index Syntax for Spark SQL (#5761)
* Support Create/Drop/Show/Refresh Index Syntax for Spark SQL
2022-06-17 18:33:58 +08:00
董可伦
7689e62cd9 [HUDI-4265] Deprecate useless targetTableName parameter in HoodieMultiTableDeltaStreamer (#5883) 2022-06-17 16:57:14 +08:00
KnightChess
0ff34b6974 [HUDI-4214] improve repeat init write schema in ExpressionPayload (#5820)
* [HUDI-4214] improve repeat init write schema in ExpressionPayload
2022-06-16 17:58:37 +08:00
KnightChess
2bf0a1906d [HUDI-4217] improve repeat init object in ExpressionPayload (#5825) 2022-06-15 20:21:28 +08:00
董可伦
c291b05699 [HUDI-4218] Expose the real exception information when an exception occurs in the tableExists method (#5827) 2022-06-15 18:10:35 +08:00
superche
7b946cf351 [HUDI-3499] Add Call Procedure for show rollbacks (#5848)
* Add Call Procedure for show rollbacks

* fix

* add ut for show_rollback_detail and exception handle

Co-authored-by: superche <superche@tencent.com>
2022-06-15 16:50:15 +08:00
Danny Chan
0811bb38fb [HUDI-4255] Make the flink merge and replace handle intermediate file visible (#5866) 2022-06-15 14:23:23 +08:00
Danny Chan
25bbff64cf [minor] Following HUDI-4207, remove the new wrapper #init method (#5865) 2022-06-15 08:48:13 +08:00
felixYyu
f16b1e8982 [MINOR] Fix typo of DisruptorExecutor in RFC 53 (#5860) 2022-06-13 23:30:17 -07:00
HunterXHunter
264b15df87 [HUDI-4207] HoodieFlinkWriteClient.getOrCreateWriteHandle throws an e… (#5788)
Adding more logs to assist in debugging with HoodieFlinkWriteClient.getOrCreateWriteHandle throwing exception
2022-06-13 10:36:06 -04:00
Qi Ji
4774c4248f [HUDI-4006] failOnDataLoss on delta-streamer kafka sources (#5718)
add new config key hoodie.deltastreamer.source.kafka.enable.failOnDataLoss
when failOnDataLoss=false (current behaviour, the default), log a warning instead of seeking to earliest silently
when failOnDataLoss is set, fail explicitly
2022-06-13 10:31:57 -04:00
luoyajun
0d859fe58b [HUDI-3863] Add UT for drop partition column in deltastreamer testsuite (#5727) 2022-06-13 10:29:32 -04:00
xi chaomin
e89f5627e4 [HUDI-3682] testReaderFilterRowKeys fails in TestHoodieOrcReaderWriter (#5790)
TestReaderFilterRowKeys needs to get the key from RECORD_KEY_METADATA_FIELD, but the writer in the current UT does not populate the meta field and the schema does not contain meta fields.

This fix writes data with a schema that contains meta fields and calls writeAvroWithMetadata for writing.

Co-authored-by: xicm <xicm@asiainfo.com>
2022-06-13 10:22:12 -04:00
superche
14d8735a1c Strip extra spaces when creating new configuration (#5849)
Co-authored-by: superche <superche@tencent.com>
2022-06-13 19:10:38 +08:00
sandyfog
c82e3462e3 [MINOR] fix AvroSchemaConverter duplicate branch in 'switch' (#5813) 2022-06-13 10:55:24 +08:00
Shiyan Xu
5aaac21d1d [HUDI-4224] Fix CI issues (#5842)
- Upgrade junit to 5.7.2
- Downgrade surefire and failsafe to 2.22.2
- Fix test failures that were previously not reported
- Improve azure pipeline configs

Co-authored-by: liujinhui1994 <965147871@qq.com>
Co-authored-by: Y Ethan Guo <ethan.guoyihua@gmail.com>
2022-06-12 11:44:18 -07:00
Y Ethan Guo
fd8f7c5f6c [HUDI-4205] Fix NullPointerException in HFile reader creation (#5841)
Replace SerializableConfiguration with SerializableWritable for broadcasting the hadoop configuration before initializing HFile readers
2022-06-11 14:46:43 -07:00
Y Ethan Guo
97ccf5dd18 [HUDI-4223] Fix NullPointerException from getLogRecordScanner when reading metadata table (#5840)
When explicitly specifying the metadata table path for reading in Spark, "hoodie.metadata.enable" is overridden to true for proper read behavior.
2022-06-11 13:19:24 -07:00
Sivabalan Narayanan
08fe281091 [HUDI-4221] Fixing getAllPartitionPaths perf hit w/ FileSystemBackedMetadata (#5829) 2022-06-11 13:17:42 -07:00
xi chaomin
2b3a85528a [HUDI-3889] Do not validate table config if save mode is set to Overwrite (#5619)
Co-authored-by: xicm <xicm@asiainfo.com>
2022-06-09 19:23:51 -04:00
yanenze
ba47904fa2 [HUDI-4139] improvement for flink write operator name to identify tables easily (#5744)
Co-authored-by: yanenze <yanenze@keytop.com.cn>
2022-06-09 17:48:20 -04:00
Danny Chan
c608dbd6c2 [HUDI-4213] Infer keygen clazz for Spark SQL (#5815) 2022-06-09 20:37:58 +08:00
sandyfog
8ff17b0470 [MINOR] FlinkStateBackendConverter add more exception message (#5809)
* [MINOR] FlinkStateBackendConverter add more exception message
2022-06-09 15:13:27 +08:00
liuzhuang2017
f5ab921300 [MINOR][DOCS] Update the README.md file in hudi-examples (#5803) 2022-06-08 17:45:00 -07:00
Alexey Kudinkin
35afdb4316 [HUDI-4178] Addressing performance regressions in Spark DataSourceV2 Integration (#5737)
There are multiple issues with our current DataSource V2 integration: because we advertise Hudi tables as V2, Spark expects them to implement certain APIs that are not implemented at the moment; instead, we use a custom resolution rule (in HoodieSpark3Analysis) to manually fall back to V1 APIs. This commit fixes the issue by reverting the DSv2 APIs and making Spark use V1, except for schema evaluation logic.
2022-06-07 16:30:46 -07:00
Raymond Xu
1349b596a1 [HUDI-4198] Fix hive config for AWSGlueClientFactory (#5768)
* HiveConf needs to load fs conf to allow instantiation via AWSGlueClientFactory

* Resolve metastore uri config before loading fs conf

* Skip hiveql due to CI issue

Co-authored-by: Sagar Sumit <sagarsumit09@gmail.com>
2022-06-07 20:21:31 +05:30
Sivabalan Narayanan
f85cd9b16d [HUDI-4200] Fixing sorting of keys fetched from metadata table (#5773)
- Keys fetched from the metadata table, especially from the base file reader, are not sorted, which may result in an NPE (key prefix search) or unnecessary seeks to the start of the HFile (full key lookups). This patch fixes that. This is not an issue with log blocks, since sorting is taken care of within HoodieHfileDataBlock.
- Commit where the sorting was mistakenly reverted: [HUDI-3760] Adding capability to fetch Metadata Records by prefix #5208
2022-06-07 08:19:52 -04:00
YueZhang
4f5cad8029 [MINOR][RFC-53] Fix typos (#5764)
Co-authored-by: yuezhang <yuezhang@freewheel.tv>
2022-06-07 08:28:28 +08:00
Raymond Xu
e5710a8e7c [MINOR] Mark AWSGlueCatalogSyncClient experimental (#5775) 2022-06-07 08:25:59 +08:00