Commit Graph

205 Commits

Author SHA1 Message Date
Danny Chan
627f20f9c5 [HUDI-2430] Make decimal compatible with hudi for flink writer (#3658) 2021-09-15 12:04:46 +08:00
rmahindra123
9735f4b8ef [HUDI-2428] Fix protocol and other issues after stress testing Hudi Kafka Connect (#3656)
* Fixes based on tests and some improvements
* Fix the issues after running stress tests
* Fixing checkstyle issues and updating README

Co-authored-by: Rajesh Mahindra <rmahindra@Rajeshs-MacBook-Pro.local>
Co-authored-by: Vinoth Chandar <vinoth@apache.org>
2021-09-14 07:14:58 -07:00
Danny Chan
89651c9408 [HUDI-2421] Catch the throwable when scheduling the cleaning task for flink writer (#3650) 2021-09-13 20:43:44 +08:00
Danny Chan
280f66e0f8 [MINOR] Fix the default parallelism of write task (#3649) 2021-09-13 11:41:49 +08:00
Danny Chan
9d5c3e5cb9 [HUDI-2415] Add more info log for flink streaming reader (#3642) 2021-09-12 10:00:17 +08:00
Danny Chan
b30c5bdaef [HUDI-2412] Add timestamp based partitioning for flink writer (#3638) 2021-09-11 13:17:16 +08:00
SteNicholas
512ca42d14 [MINOR] Correct the comment for the parallelism of tasks in FlinkOptions (#3634) 2021-09-10 13:42:11 +08:00
Danny Chan
db2ab9a150 [HUDI-2403] Add metadata table listing for flink query source (#3618) 2021-09-08 14:52:39 +08:00
Danny Chan
cf3a2ead32 [HUDI-2401] Load archived instants for flink streaming reader (#3610) 2021-09-08 10:43:54 +08:00
Danny Chan
79b896f071 [HUDI-2392] Do not send partition delete record when changelog mode enabled (#3586) 2021-09-02 20:58:12 +08:00
yuzhaojing
7a1bd225ca [HUDI-2376] Add pipeline for Append mode (#3573)
Co-authored-by: 喻兆靖 <yuzhaojing@bilibili.com>
2021-09-02 16:32:40 +08:00
Danny Chan
f66e1ce9bf [HUDI-2379] Include the pending compaction file groups for flink (#3567)
streaming reader
2021-09-01 16:47:52 +08:00
Danny Chan
57668d02a0 [HUDI-2371] Improvement flink streaming reader (#3552)
- Support reading empty table
- Fix filtering by partition path
- Support reading from earliest commit
2021-08-28 20:16:54 +08:00
mikewu
9850e90e2e [HUDI-2229] Refact HoodieFlinkStreamer to reuse the pipeline of HoodieTableSink (#3495)
Co-authored-by: mikewu <xingbo.wxb@alibaba-inc.com>
2021-08-27 10:14:04 +08:00
Danny Chan
0f39137ba8 [HUDI-2321] Use the caller classloader for ReflectionUtils (#3535)
Based on the discussion on stackoverflow:
https://stackoverflow.com/questions/1771679/difference-between-threads-context-class-loader-and-normal-classloader

Thread.currentThread().getContextClassLoader() should never be used
because the context classloader is not immutable: a user can overwrite it
when the thread switches, and it can also be null.

The objection here: https://stackoverflow.com/a/36228195 says the
Thread.currentThread().getContextClassLoader() is a JDK design error
and the context classloader is never suggested to be used. The API that
needs classloader should ask the user to set up the right classloader.
2021-08-26 21:00:30 +08:00
Danny Chan
a60fab3a5c [HUDI-2352] The upgrade downgrade action of flink writer should be singleton (#3531) 2021-08-25 10:56:14 +08:00
Danny Chan
05e6f44d53 [MINOR] Fix BatchBootstrapOperator initialization (#3520) 2021-08-22 13:03:22 +08:00
yuzhaojing
ab3fbb8895 [HUDI-2342] Optimize Bootstrap operator (#3516)
Co-authored-by: 喻兆靖 <yuzhaojing@bilibili.com>
2021-08-21 20:03:03 +08:00
Danny Chan
c7c517f14c [HUDI-2340] Merge the data set for flink bounded source when changelog mode turns off (#3513) 2021-08-21 07:21:35 +08:00
Udit Mehrotra
e39d0a2f28 Keep non-conflicting names for common configs between DataSourceOptions and HoodieWriteConfig (#3511) 2021-08-20 02:42:59 -07:00
Udit Mehrotra
c350d05dd3 Restore 0.8.0 config keys with deprecated annotation (#3506)
Co-authored-by: Sagar Sumit <sagarsumit09@gmail.com>
Co-authored-by: Vinoth Chandar <vinoth@apache.org>
2021-08-19 13:36:40 -07:00
Danny Chan
9762e4c08c [MINOR] Some cosmetic changes for Flink (#3503) 2021-08-19 23:21:20 +08:00
swuferhong
1fed44af84 [HUDI-2316] Support Flink batch upsert (#3494) 2021-08-19 17:15:26 +08:00
leiqiang
b7a0d76fc9 [HUDI-2167] HoodieCompactionConfig get HoodieCleaningPolicy NullPointerException
close apache/hudi#3402
2021-08-18 15:40:51 +08:00
Danny Chan
66f951322a [HUDI-2191] Bump flink version to 1.13.1 (#3291) 2021-08-16 18:14:05 +08:00
Udit Mehrotra
3e301196bf Moving to 0.10.0-SNAPSHOT on master branch. 2021-08-14 18:51:09 -07:00
Danny Chan
6a4100bb91 [MINOR] Tweak change log more as FULL for flink streaming source (#3466) 2021-08-13 16:31:16 +08:00
Sagar Sumit
0544d70d8f [MINOR] Deprecate older configs (#3464)
Rename and deprecate props in HoodieWriteConfig

Rename and deprecate older props
2021-08-12 20:31:04 -07:00
Danny Chan
29332498af [HUDI-2298] The HoodieMergedLogRecordScanner should set up the operation of the chosen record (#3456) 2021-08-11 22:55:43 +08:00
swuferhong
21db6d7a84 [HUDI-1771] Propagate CDC format for hoodie (#3285) 2021-08-10 20:23:23 +08:00
yuzhaojing
11ea74958d [HUDI-2247] Filter file where length less than parquet MAGIC length (#3363)
Co-authored-by: 喻兆靖 <yuzhaojing@bilibili.com>
2021-08-09 09:15:42 +08:00
Danny Chan
b7586a5632 [HUDI-2274] Allows INSERT duplicates for Flink MOR table (#3403) 2021-08-06 10:30:52 +08:00
yuzhaojing
b8b9d6db83 [HUDI-2087] Support Append only in Flink stream (#3390)
Co-authored-by: 喻兆靖 <yuzhaojing@bilibili.com>
2021-08-04 17:53:20 +08:00
Danny Chan
02331fc223 [HUDI-2258] Metadata table for flink (#3381) 2021-08-04 10:54:55 +08:00
wenningd
91bb0d1318 [HUDI-2255] Refactor Datasource options (#3373)
Co-authored-by: Wenning Ding <wenningd@amazon.com>
2021-08-03 17:50:30 -07:00
Danny Chan
bec23bda50 [HUDI-2269] Release the disk map resource for flink streaming reader (#3384) 2021-08-03 13:55:35 +08:00
swuferhong
f7f5d4cc6d [HUDI-2184] Support setting hive sync partition extractor class based on flink configuration (#3284) 2021-07-30 17:24:00 +08:00
Danny Chan
c4e45a0010 [HUDI-2254] Builtin sort operator for flink bulk insert (#3372) 2021-07-30 16:58:11 +08:00
swuferhong
8b19ec9ca0 [HUDI-2252] Default consumes from the latest instant for flink streaming reader (#3368) 2021-07-30 14:25:05 +08:00
Danny Chan
efbbb67420 [HUDI-2241] Explicit parallelism for flink bulk insert (#3357) 2021-07-29 09:57:37 +08:00
swuferhong
7739518879 [HUDI-2228] Add option 'hive_sync.mode' for flink writer (#3352) 2021-07-28 19:45:50 +08:00
Danny Chan
91c2213412 [HUDI-2245] BucketAssigner generates the fileId evenly to avoid data skew (#3362) 2021-07-28 19:26:37 +08:00
rmahindra123
8fef50e237 [HUDI-2044] Integrate consumers with rocksDB and compression within External Spillable Map (#3318) 2021-07-28 01:31:03 -04:00
mincwang
00cd35f90a [HUDI-2215] Add rateLimiter when Flink writes to hudi. (#3338)
Co-authored-by: wangminchao <wangminchao@asinking.com>
2021-07-28 08:23:23 +08:00
Danny Chan
60758b36ea [HUDI-2227] Only sync hive meta on successful commit for flink batch writer (#3351) 2021-07-27 20:10:08 +08:00
Danny Chan
9d2a65a6a6 [HUDI-2209] Bulk insert for flink writer (#3334) 2021-07-27 10:58:23 +08:00
xiang2102
024cf01f02 [MINOR] Correct the words accroding in the comments to according (#3343)
Correct the words 'accroding' in the comments to 'according'
2021-07-27 08:48:58 +08:00
rmahindra123
a14b19fdd5 [HUDI-1241] Automate the generation of configs webpage as configs are added to Hudi repo (#3302) 2021-07-23 21:33:34 -07:00
Xuedong Luan
b2f7fcb8c8 [MINOR] Replace deprecated method isDir with isDirectory (#3319) 2021-07-24 10:02:24 +08:00
pengzhiwei
2c910ee3af [HUDI-2212] Missing PrimaryKey In Hoodie Properties For CTAS Table (#3332) 2021-07-23 15:21:57 +08:00