1759 Commits

Author SHA1 Message Date
Danny Chan
ab2e0d0ba2 [HUDI-2219] Fix NPE of HoodieConfig (#3342) 2021-07-27 15:18:05 +08:00
Danny Chan
9d2a65a6a6 [HUDI-2209] Bulk insert for flink writer (#3334) 2021-07-27 10:58:23 +08:00
xiang2102
024cf01f02 [MINOR] Correct the words accroding in the comments to according (#3343)
Correct the words 'accroding' in the comments to 'according'
2021-07-27 08:48:58 +08:00
Sivabalan Narayanan
61148c1c43 [HUDI-2176, 2178, 2179] Adding virtual key support to COW table (#3306) 2021-07-26 17:21:04 -04:00
xiarixiaoyao
5353243449 [HUDI-2214]residual temporary files after clustering are not cleaned up (#3335) 2021-07-26 10:26:20 -07:00
Gary Li
a5638b995b [MINOR] Close log scanner after compaction completed (#3294) 2021-07-26 17:39:13 +08:00
董可伦
a91296f14a [HUDI-2216] Correct the words fiels in the comments to fields (#3339) 2021-07-25 12:15:57 +08:00
rmahindra123
a14b19fdd5 [HUDI-1241] Automate the generation of configs webpage as configs are added to Hudi repo (#3302) 2021-07-23 21:33:34 -07:00
Xuedong Luan
b2f7fcb8c8 [MINOR] Replace deprecated method isDir with isDirectory (#3319) 2021-07-24 10:02:24 +08:00
jsbali
66207ed91a [HUDI-1848] Adding support for HMS for running DDL queries in hive-sy… (#2879)
* [HUDI-1848] Adding support for HMS for running DDL queries in hive-sync-tool

* [HUDI-1848] Fixing test cases

* [HUDI-1848] CR changes

* [HUDI-1848] Fix checkstyle violations

* [HUDI-1848] Fixed a bug when metastore api fails for complex schemas with multiple levels.

* [HUDI-1848] Adding the complex schema and resolving merge conflicts

* [HUDI-1848] Adding some more javadocs

* [HUDI-1848] Added javadocs for DDLExecutor impls

* [HUDI-1848] Fixed style issue
2021-07-23 09:03:15 -07:00
Xuedong Luan
71e14cf866 [HUDI-2213] Remove unnecessary parameter for HoodieMetrics constructor and fix NPE in UT (#3333) 2021-07-23 19:57:35 +08:00
pengzhiwei
2c910ee3af [HUDI-2212] Missing PrimaryKey In Hoodie Properties For CTAS Table (#3332) 2021-07-23 15:21:57 +08:00
Xuedong Luan
6d592c5896 [HUDI-2211] Fix NullPointerException in TestHoodieConsoleMetrics (#3331) 2021-07-23 11:22:54 +08:00
pengzhiwei
5a2f3d439e [HUDI-2139] MergeInto MOR Table May Result InCorrect Result (#3230) 2021-07-23 10:19:43 +08:00
Danny Chan
c89bf1de20 [HUDI-2205] Rollback inflight compaction for flink writer (#3320) 2021-07-22 22:56:51 +08:00
swuferhong
fe5d2e7f53 [HUDI-2206] Fix checkpoint blocked because getLastPendingInstant() action after than restoreWriteMetadata() action (#3326) 2021-07-22 16:35:07 +08:00
pengzhiwei
151f22e43a [HUDI-2195] Sync Hive Failed When Execute CTAS In Spark2 And Spark3 (#3299) 2021-07-22 15:33:38 +08:00
Danny Chan
2370a9facb [HUDI-2204] Add marker files for flink writer (#3316) 2021-07-22 13:34:15 +08:00
Vinay Patil
5a94b6bf54 [HUDI-2192] Clean up Multiple versions of scala libraries detected Warning (#3292) 2021-07-21 00:33:27 -07:00
satishkotha
4f1350f7c1 [MINOR] Disable codecov (#3314) 2021-07-20 22:07:22 -07:00
Sivabalan Narayanan
d58a8348dc [HUDI-2007] Fixing hudi_test_suite for spark nodes and adding spark bulk_insert node (#3074) 2021-07-21 00:11:01 -04:00
Danny Chan
858e84b5b2 [HUDI-2198] Clean and reset the bootstrap events for coordinator when task failover (#3304) 2021-07-21 10:13:05 +08:00
yuzhaojing
634163a990 [HUDI-2145] Create new bucket when NewFileAssignState filled (#3258)
Co-authored-by: 喻兆靖 <yuzhaojing@bilibili.com>
2021-07-20 17:46:45 +08:00
Samrat
a086d255c8 [HUDI-1860] Add INSERT_OVERWRITE and INSERT_OVERWRITE_TABLE support to DeltaStreamer (#3184) 2021-07-19 21:49:43 -04:00
Sivabalan Narayanan
d5026e9a24 [HUDI-2161] Adding support to disable meta columns with bulk insert operation (#3247) 2021-07-19 20:43:48 -04:00
喻兆靖
2099bf41db [HUDI-2193] Remove state in BootstrapFunction 2021-07-19 18:14:06 +08:00
pengzhiwei
572a214412 [HUDI-1884] MergeInto Support Partial Update For COW (#3154) 2021-07-17 12:59:18 +08:00
liujinhui
af837d2f18 [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp (#2438) 2021-07-17 00:31:06 -04:00
yuzhao.cyz
50c2b76d72 Revert "[HUDI-2087] Support Append only in Flink stream (#3252)"
This reverts commit 783c9cb3
2021-07-16 21:36:27 +08:00
yuzhao.cyz
c8aaf00819 [HUDI-2185] Remove the default parallelism of index bootstrap and bucket assigner 2021-07-16 15:44:15 +08:00
liujinhui
3b264e80d9 [HUDI-1633] Make callback return HoodieWriteStat (#2445)
* CALLBACK add partitionPath

* callback can send hoodieWriteStat

* add ApiMaturityLevel
2021-07-16 12:37:07 +08:00
Jintao Guan
38cd74b563 [MINOR] Allow users to choose ORC as base file format in Spark SQL (#3279) 2021-07-16 12:24:41 +08:00
vinoyang
a62a6cff32 [MINOR] Refactor hive sync tool to reduce duplicate code (#3276)
2021-07-15 23:54:38 +08:00
moranyuwen
23a4a96eb4 [HUDI-2153] Fix BucketAssignFunction Context NullPointerException 2021-07-15 19:54:49 +08:00
rmahindra123
d024439764 [HUDI-2029] Implement compression for DiskBasedMap in Spillable Map (#3128) 2021-07-14 22:57:38 -04:00
vinoth chandar
75040ee9e5 [HUDI-2149] Ensure and Audit docs for every configuration class in the codebase (#3272)
- Added docs when missing
- Rewrote, reworded as needed
- Made couple more classes extend HoodieConfig
2021-07-14 10:56:08 -07:00
zhangyue19921010
c1810f210e [MINOR] Correct the logs of enable/not-enable async cleaner service. (#3271)
Co-authored-by: yuezhang <yuezhang@freewheel.tv>
2021-07-15 00:08:29 +08:00
Jintao Guan
2debb9b3ed [HUDI-1828] Update unit tests to support ORC as the base file format (#3237) 2021-07-15 00:05:42 +08:00
pengzhiwei
93967404a7 [HUDI-2180] Fix Compile Error For Spark3 (#3274) 2021-07-14 09:02:28 -07:00
vinoyang
52524b659d [HUDI-2165] Support Transformer for HoodieFlinkStreamer (#3270)
2021-07-14 23:01:52 +08:00
Danny Chan
632bfd1a65 Merge pull request #3268 from yuzhaojing/HUDI-2171
[HUDI-2171] Add parallelism conf for bootstrap operator
2021-07-14 17:01:30 +08:00
Danny Chan
ac75bda929 [HUDI-1969] Support reading logs for MOR Hive rt table (#3033) 2021-07-13 23:43:30 -07:00
pengzhiwei
f0a2f378ea Merge pull request #3120 from pengzhiwei2018/dev_metasync
[HUDI-2045] Support Read Hoodie As DataSource Table For Flink And DeltaStreamer
2021-07-13 22:37:20 +08:00
Vinay Patil
7395a56dfb [HUDI-2168] Fix for AccessControlException for anonymous user (#3264) 2021-07-13 08:56:51 -04:00
喻兆靖
aff1a1ed29 [HUDI-2171] Add parallelism conf for bootstrap operator 2021-07-13 17:55:12 +08:00
Sagar Sumit
b0089b894a [MINOR] Fix EXTERNAL_RECORD_AND_SCHEMA_TRANSFORMATION config (#3250) 2021-07-13 00:24:40 -04:00
zhangyue19921010
c8a2033c27 [HUDI-2144]Bug-Fix:Offline clustering(HoodieClusteringJob) will cause insert action losing data (#3240)
* fixed

* add testUpsertPartitionerWithSmallFileHandlingAndClusteringPlan ut

* fix CheckStyle

Co-authored-by: yuezhang <yuezhang@freewheel.tv>
2021-07-12 18:14:17 -07:00
pengzhiwei
ca440ccf88 [HUDI-2107] Support Read Log Only MOR Table For Spark (#3193) 2021-07-12 17:31:23 +08:00
pengzhiwei
ffa934182a [HUDI-2045] Support Read Hoodie As DataSource Table For Flink And DeltaStreamer 2021-07-12 13:03:14 +08:00
Sagar Sumit
5804ad8e32 [HUDI-1483] Support async clustering for deltastreamer and Spark streaming (#3142)
- Integrate async clustering service with HoodieDeltaStreamer and HoodieStreamingSink
- Added methods in HoodieAsyncService to reuse code
2021-07-11 14:43:38 -04:00