1669 Commits

Author SHA1 Message Date
Danny Chan
d424fe6072 [HUDI-2121] Add operator uid for flink stateful operators (#3212) 2021-07-02 19:44:32 +08:00
pengzhiwei
ac65189458 [HUDI-2114] Spark Query MOR Table Written By Flink Return Incorrect Timestamp Value (#3208) 2021-07-02 17:39:57 +08:00
Danny Chan
7462fdefc3 [HUDI-2112] Support reading pure logs file group for flink batch reader after compaction (#3202) 2021-07-02 16:29:22 +08:00
pengzhiwei
6403547431 [HUDI-2051] Enable Hive Sync When Spark Enable Hive Meta For Spark Sql (#3126) 2021-07-02 01:08:36 -07:00
pengzhiwei
6eca06d074 [HUDI-2105] Compaction Failed For MergeInto MOR Table (#3190) 2021-07-01 23:40:14 +08:00
wangxianghu
b376cefc3e [MINOR] Add Documentation to KEYGENERATOR_TYPE_PROP (#3196) 2021-07-01 18:48:59 +08:00
pengzhiwei
b34d53fa9c [HUDI-2088] Missing Partition Fields And PreCombineField In Hoodie Properties For Table Written By Flink (#3171) 2021-07-01 17:25:18 +08:00
vinoth chandar
d07def1290 [MINOR] Fix broken build due to FlinkOptions (#3198) 2021-06-30 20:34:58 -07:00
vinoth chandar
7895a3586e [MINOR] Update .asf.yaml to codify notification settings, turn on jira comments, gh discussions (#3164)
- Turn on comment for jira, so we can track PR activity better
- Create a notification settings that match https://gitbox.apache.org/schemes.cgi?hudi
- Try and turn on "discussions" on Github, to experiment
2021-06-30 14:56:56 -07:00
wenningd
d412fb2fe6 [HUDI-89] Add configOption & refactor all configs based on that (#2833)
Co-authored-by: Wenning Ding <wenningd@amazon.com>
2021-06-30 14:26:30 -07:00
yuzhaojing
07e93de8b4 [HUDI-2052] Support load logFile in BootstrapFunction (#3134)
Co-authored-by: 喻兆靖 <yuzhaojing@bilibili.com>
2021-06-30 20:37:00 +08:00
Vinay Patil
94f0f40fec [HUDI-1944] Support Hudi to read from committed offset (#3175)
* [HUDI-1944] Support Hudi to read from committed offset

* [HUDI-1944] Adding group option to KafkaResetOffsetStrategies

* [HUDI-1944] Update Exception msg
2021-06-30 16:41:28 +08:00
yuzhaojing
1cbf43b6e7 [HUDI-2103] Add rebalance before index bootstrap (#3185)
Co-authored-by: 喻兆靖 <yuzhaojing@bilibili.com>
2021-06-30 16:40:55 +08:00
Sivabalan Narayanan
5564c7ec01 [HUDI-2006] Adding more yaml templates to test suite (#3073) 2021-06-29 23:05:46 -04:00
wangxianghu
202887b8ca [HUDI-2092] Fix NPE caused by FlinkStreamerConfig#writePartitionUrlEncode null value (#3176) 2021-06-30 09:21:06 +08:00
swuferhong
f665db071f [HUDI-2085] Support specify compaction paralleism and compaction target io for flink batch compaction (#3169) 2021-06-29 22:53:01 +08:00
swuferhong
5a7d1b3d6c [HUDI-2097] Fix Flink unable to read commit metadata error (#3180) 2021-06-29 22:43:47 +08:00
Danny Chan
b8a8f572d6 [HUDI-2094] Supports hive style partitioning for flink writer (#3178) 2021-06-29 15:34:26 +08:00
Raymond Xu
0749cc826a [HUDI-2081] Move schema util tests out from TestHiveSyncTool (#3166) 2021-06-29 11:23:46 +08:00
yuzhaojing
37b7c65d8a [HUDI-2084] Resend the uncommitted write metadata when start up (#3168)
Co-authored-by: 喻兆靖 <yuzhaojing@bilibili.com>
2021-06-29 08:53:52 +08:00
Vinay Patil
039aeb6dce [HUDI-1910] Commit Offset to Kafka after successful Hudi commit (#3092) 2021-06-28 21:52:05 +08:00
Vinay Patil
34fc8a8880 [HUDI-2067] Sync FlinkOptions config to FlinkStreamerConfig (#3151) 2021-06-28 19:26:08 +08:00
wangxianghu
9e61dad597 [MINOR] Drop duplicate keygenerator class configuration setting (#3167) 2021-06-28 17:11:32 +08:00
Danny Chan
d24341d10c [HUDI-2074] Use while loop instead of recursive call in MergeOnReadInputFormat#MergeIterator to avoid StackOverflow (#3159) 2021-06-28 16:03:10 +08:00
zhangyue19921010
e99a6b031b [HUDI-2073] Fix the bug of hoodieClusteringJob never quit (#3157)
Co-authored-by: yuezhang <yuezhang@freewheel.tv>
2021-06-26 22:03:41 -07:00
wangxianghu
f73bedd374 [MINOR] Remove unused methods (#3152) 2021-06-26 13:19:26 +08:00
Vinay Patil
ed1a5daa9a [HUDI-2060] Added tests for KafkaOffsetGen (#3136) 2021-06-25 12:37:47 -04:00
n3nash
23dbc09a0d [MINOR] Removing un-used files and references (#3150) 2021-06-24 22:17:40 -07:00
s-sanjay
0fb8556b0d Add ability to provide multi-region (global) data consistency across HMS in different regions (#2542)
[global-hive-sync-tool] Add a global hive sync tool to sync hudi table across clusters. Add a way to rollback the replicated time stamp if we fail to sync or if we partly sync

Co-authored-by: Jagmeet Bali <jsbali@uber.com>
2021-06-24 20:26:26 -07:00
Danny Chan
e64fe55054 [HUDI-2068] Skip the assign state for SmallFileAssign when the state can not assign initially (#3148) 2021-06-25 08:57:56 +08:00
yuzhaojing
218f2a6df8 [HUDI-2062] Catch FileNotFoundException in WriteProfiles #getCommitMetadata Safely (#3138)
Co-authored-by: 喻兆靖 <yuzhaojing@bilibili.com>
2021-06-25 08:54:59 +08:00
Sebastian Bernauer
b32855545b [HUDI-2069] Fix KafkaAvroSchemaDeserializer to not rely on reflection (#3111)
[HUDI-2069] KafkaAvroSchemaDeserializer should get sourceSchema passed instead using Reflection
2021-06-24 09:08:21 -04:00
pengzhiwei
84dd3ca18b [HUDI-2053] Insert Static Partition With DateType Return Incorrect Partition Value (#3133) 2021-06-24 19:09:37 +08:00
pengzhiwei
7e50f9a5a6 [HUDI-2061] Incorrect Schema Inference For Schema Evolved Table (#3137) 2021-06-23 22:48:01 -07:00
leesf
e039e0ff6d [HUDI-2064] Fix TestHoodieBackedMetadata#testOnlyValidPartitionsAdded (#3141) 2021-06-24 07:37:55 +08:00
yuzhaojing
380518e232 [HUDI-2038] Support rollback inflight compaction instances for CompactionPlanOperator (#3105)
Co-authored-by: 喻兆靖 <yuzhaojing@bilibili.com>
2021-06-23 20:58:52 +08:00
Vaibhav Sinha
43b9c1fa1c [HUDI-1826] Add ORC support in HoodieSnapshotExporter (#3130) 2021-06-23 17:04:25 +08:00
Danny Chan
2687eab8f0 [HUDI-2054] Remove the duplicate name for flink write pipeline (#3135) 2021-06-23 14:49:38 +08:00
swuferhong
3fb59dda83 [HUDI-1988] FinalizeWrite() been executed twice in AbstractHoodieWriteClient$commitstats (#3050) 2021-06-22 22:57:09 -07:00
Prashant Wason
11e64b2db0 [HUDI-1717] Metadata Reader should merge all the un-synced but complete instants from the dataset timeline. (#3082) 2021-06-22 23:52:18 +08:00
Prashant Wason
062d5baf84 [HUDI-2013] Removed option to fallback to file listing when Metadata Table is enabled. (#3079) 2021-06-22 23:41:52 +08:00
pengzhiwei
69c0d9e2d0 [HUDI-1883] Support Truncate Table For Hoodie (#3098) 2021-06-22 22:33:20 +08:00
yuzhaojing
5db37c255b [HUDI-2047] Ignore FileNotFoundException in WriteProfiles #getWritePathsOfInstant (#3125)
Co-authored-by: 喻兆靖 <yuzhaojing@bilibili.com>
2021-06-22 14:18:46 +08:00
Rong Ma
7bd517a82f [HUDI-2031] JVM occasionally crashes during compaction when spark speculative execution is enabled (#3093)
* unit tests added
2021-06-21 18:09:51 -07:00
swuferhong
cb5cd35991 [HUDI-2043] HoodieDefaultTimeline$filterPendingCompactionTImeline() method have wrong filter condition (#3109) 2021-06-21 17:53:54 -07:00
pengzhiwei
4fd8a88b7e [HUDI-1776] Support AlterCommand For Hoodie (#3086) 2021-06-21 22:58:43 +08:00
swuferhong
f8d9242372 [HUDI-2050] Support rollback inflight compaction instances for batch flink compactor (#3124) 2021-06-21 20:32:48 +08:00
Danny Chan
adf167991a [HUDI-2049] StreamWriteFunction should wait for the next inflight instant time before flushing (#3123) 2021-06-21 20:15:27 +08:00
Sagar Sumit
429e9fb5fe [HUDI-1248] Increase timeout for deltaStreamerTestRunner in TestHoodieDeltaStreamer (#3110) 2021-06-20 21:42:12 -07:00
Raymond Xu
e41f13fe7b [MINOR] Put Azure cache tasks first (#3118) 2021-06-20 14:36:39 -07:00