zhangyue19921010
b4441abcf7
[HUDI-2194] Skip the latest N partitions when choosing partitions to create ClusteringPlan ( #3300 )
* skip the latest partitions based on hoodie.clustering.plan.strategy.daybased.skipfromlatest.partitions (the default of 0 skips nothing)
* change config version
* add ut
Co-authored-by: yuezhang <yuezhang@freewheel.tv >
2021-08-09 10:10:15 -07:00
pengzhiwei
41a9986a76
[HUDI-2208] Support Bulk Insert For Spark Sql ( #3328 )
2021-08-09 00:18:31 -04:00
yuzhaojing
11ea74958d
[HUDI-2247] Filter file where length less than parquet MAGIC length ( #3363 )
Co-authored-by: 喻兆靖 <yuzhaojing@bilibili.com >
2021-08-09 09:15:42 +08:00
pengzhiwei
32a50d8ddb
[HUDI-2243] Support Time Travel Query For Hoodie Table ( #3360 )
2021-08-07 19:07:22 -04:00
pengzhiwei
55d2e786db
[HUDI-1842] Spark Sql Support For pre-existing Hoodie Table ( #3393 )
2021-08-07 07:49:26 -04:00
Sagar Sumit
70b6bd485f
[HUDI-1468] Support custom clustering strategies and preserve commit metadata as part of clustering ( #3419 )
Co-authored-by: Satish Kotha <satishkotha@uber.com >
2021-08-06 22:53:08 -04:00
pengzhiwei
9ce548edb1
[MINOR] fix compile error in compaction command ( #3421 )
2021-08-06 16:18:19 +08:00
pengzhiwei
3f8ca1a355
[HUDI-2182] Support Compaction Command For Spark Sql ( #3277 )
2021-08-06 15:12:10 +08:00
Danny Chan
20feb1a897
[HUDI-2278] Use INT64 timestamp with precision 3 for flink parquet writer ( #3414 )
2021-08-06 11:06:21 +08:00
Danny Chan
b7586a5632
[HUDI-2274] Allows INSERT duplicates for Flink MOR table ( #3403 )
2021-08-06 10:30:52 +08:00
pengzhiwei
0dcd6a8fca
[HUDI-2233] Use HMS To Sync Hive Meta For Spark Sql ( #3387 )
2021-08-05 09:57:22 -04:00
Sivabalan Narayanan
1df5ded433
[HUDI-2273] Migrating some long running tests to functional test profile ( #3398 )
2021-08-04 19:08:50 -04:00
pengzhiwei
5574e092fb
[HUDI-2232] [SQL] MERGE INTO fails with table having nested struct ( #3379 )
2021-08-04 18:20:29 +08:00
yuzhaojing
b8b9d6db83
[HUDI-2087] Support Append only in Flink stream ( #3390 )
Co-authored-by: 喻兆靖 <yuzhaojing@bilibili.com >
2021-08-04 17:53:20 +08:00
Danny Chan
02331fc223
[HUDI-2258] Metadata table for flink ( #3381 )
2021-08-04 10:54:55 +08:00
rmahindra123
b4c14eaa29
[HUDI-2090] Ensure Disk Maps create a subfolder with appropriate prefixes and cleans them up on close ( #3329 )
* Add UUID to the folder name for External Spillable File System
* Fix to ensure that Disk maps folders do not interfere across users
* Fix test
* Fix test
* Rebase with latest master and address comments
* Add Shutdown Hooks for the Disk Map
Co-authored-by: Rajesh Mahindra <rmahindra@Rajeshs-MacBook-Pro.local >
2021-08-03 17:51:25 -07:00
wenningd
91bb0d1318
[HUDI-2255] Refactor Datasource options ( #3373 )
Co-authored-by: Wenning Ding <wenningd@amazon.com >
2021-08-03 17:50:30 -07:00
Udit Mehrotra
1ff2d3459a
[HUDI-1371] [HUDI-1893] Support metadata based listing for Spark DataSource and Spark SQL ( #2893 )
2021-08-03 14:47:40 -07:00
rmahindra123
245e1fd17d
[HUDI-2272] Pass base file format to sync clients ( #3397 )
Co-authored-by: Rajesh Mahindra <rmahindra@Rajeshs-MacBook-Pro.local >
2021-08-03 14:46:02 -07:00
satishkotha
826a04d142
[HUDI-2072] Add pre-commit validator framework ( #3153 )
* [HUDI-2072] Add pre-commit validator framework
* trigger Travis rebuild
2021-08-03 12:07:45 -07:00
Danny Chan
bec23bda50
[HUDI-2269] Release the disk map resource for flink streaming reader ( #3384 )
2021-08-03 13:55:35 +08:00
Sagar Sumit
aa857beee0
[HUDI-2225] Add a compaction job in hudi-examples ( #3347 )
2021-08-03 11:31:56 +08:00
vinoth chandar
b21ae68e67
[MINOR] Improving runtime of TestStructuredStreaming by 2 mins ( #3382 )
2021-08-02 13:42:46 -07:00
Sivabalan Narayanan
fe508376fa
[HUDI-2177][HUDI-2200] Adding virtual keys support for MOR table ( #3315 )
2021-08-02 09:45:09 -04:00
zhangyue19921010
dde57b293c
[HUDI-2164] Let users build cluster plan and execute this plan at once using HoodieClusteringJob for async clustering ( #3259 )
* add --mode schedule/execute/scheduleandexecute
* fix checkstyle
* add UT testHoodieAsyncClusteringJobWithScheduleAndExecute
* log changed
* try to make ut success
* try to fix ut
* modify ut
* review changed
* code review
* code review
* code review
* code review
Co-authored-by: yuezhang <yuezhang@freewheel.tv >
2021-08-02 08:07:59 +08:00
Gary Li
6353fc865f
[HUDI-2218] Fix missing HoodieWriteStat in HoodieCreateHandle ( #3341 )
2021-07-30 02:36:57 -07:00
swuferhong
f7f5d4cc6d
[HUDI-2184] Support setting hive sync partition extractor class based on flink configuration ( #3284 )
2021-07-30 17:24:00 +08:00
Danny Chan
c4e45a0010
[HUDI-2254] Builtin sort operator for flink bulk insert ( #3372 )
2021-07-30 16:58:11 +08:00
swuferhong
8b19ec9ca0
[HUDI-2252] Default consumes from the latest instant for flink streaming reader ( #3368 )
2021-07-30 14:25:05 +08:00
Sivabalan Narayanan
7bdae69053
[HUDI-2253] Refactoring few tests to reduce running time. DeltaStreamer and MultiDeltaStreamer tests. Bulk insert row writer tests ( #3371 )
Co-authored-by: Sivabalan Narayanan <nsb@Sivabalans-MBP.attlocal.net >
2021-07-29 22:22:26 -07:00
pengzhiwei
c2370402ea
[HUDI-2251] Fix Exception Cause By Table Name Case Sensitivity For Append Mode Write ( #3367 )
2021-07-29 17:36:56 -04:00
Shawy Geng
44e41dc9bb
[HUDI-2117] Unpersist the input rdd after the commit is completed to … ( #3207 )
Co-authored-by: Vinoth Chandar <vinoth@apache.org >
2021-07-29 08:16:58 -07:00
pengzhiwei
f109c6cb0d
[MINOR] fix check style error ( #3365 )
2021-07-29 17:29:10 +08:00
pengzhiwei
bbadac7de1
[HUDI-1425] Performance loss with the additional hoodieRecords.isEmpty() in HoodieSparkSqlWriter#write ( #2296 )
2021-07-28 21:30:18 -07:00
Danny Chan
efbbb67420
[HUDI-2241] Explicit parallelism for flink bulk insert ( #3357 )
2021-07-29 09:57:37 +08:00
swuferhong
7739518879
[HUDI-2228] Add option 'hive_sync.mode' for flink writer ( #3352 )
2021-07-28 19:45:50 +08:00
swuferhong
eedfadeb46
[HUDI-2244] Fix database alreadyExists exception while hive sync ( #3361 )
2021-07-28 19:40:16 +08:00
Danny Chan
91c2213412
[HUDI-2245] BucketAssigner generates the fileId evenly to avoid data skew ( #3362 )
2021-07-28 19:26:37 +08:00
davehagman
8105cf588e
[HUDI-2230] Make codahale times transient to avoid serializable exceptions ( #3345 )
2021-07-28 14:45:09 +08:00
rmahindra123
8fef50e237
[HUDI-2044] Integrate consumers with rocksDB and compression within External Spillable Map ( #3318 )
2021-07-28 01:31:03 -04:00
mincwang
00cd35f90a
[HUDI-2215] Add rateLimiter when Flink writes to hudi. ( #3338 )
Co-authored-by: wangminchao <wangminchao@asinking.com >
2021-07-28 08:23:23 +08:00
Danny Chan
60758b36ea
[HUDI-2227] Only sync hive meta on successful commit for flink batch writer ( #3351 )
2021-07-27 20:10:08 +08:00
pengzhiwei
59ff8423f9
[HUDI-2223] Fix Alter Partitioned Table Failed ( #3350 )
2021-07-27 20:01:04 +08:00
Gary Li
925873bb3c
[HUDI-2217] Fix no value present in incremental query on MOR ( #3340 )
2021-07-27 17:30:01 +08:00
Danny Chan
ab2e0d0ba2
[HUDI-2219] Fix NPE of HoodieConfig ( #3342 )
2021-07-27 15:18:05 +08:00
Danny Chan
9d2a65a6a6
[HUDI-2209] Bulk insert for flink writer ( #3334 )
2021-07-27 10:58:23 +08:00
xiang2102
024cf01f02
[MINOR] Correct the words accroding in the comments to according ( #3343 )
Correct the words 'accroding' in the comments to 'according'
2021-07-27 08:48:58 +08:00
Sivabalan Narayanan
61148c1c43
[HUDI-2176, 2178, 2179] Adding virtual key support to COW table ( #3306 )
2021-07-26 17:21:04 -04:00
xiarixiaoyao
5353243449
[HUDI-2214] Residual temporary files after clustering are not cleaned up ( #3335 )
2021-07-26 10:26:20 -07:00
Gary Li
a5638b995b
[MINOR] Close log scanner after compaction completed ( #3294 )
2021-07-26 17:39:13 +08:00