105 Commits

Author SHA1 Message Date
Danny Chan
3354fac42f [HUDI-2449] Incremental read for Flink (#3686) 2021-09-19 09:06:46 +08:00
Danny Chan
627f20f9c5 [HUDI-2430] Make decimal compatible with hudi for flink writer (#3658) 2021-09-15 12:04:46 +08:00
Danny Chan
b30c5bdaef [HUDI-2412] Add timestamp based partitioning for flink writer (#3638) 2021-09-11 13:17:16 +08:00
Danny Chan
db2ab9a150 [HUDI-2403] Add metadata table listing for flink query source (#3618) 2021-09-08 14:52:39 +08:00
yuzhaojing
7a1bd225ca [HUDI-2376] Add pipeline for Append mode (#3573)
Co-authored-by: 喻兆靖 <yuzhaojing@bilibili.com>
2021-09-02 16:32:40 +08:00
Danny Chan
f66e1ce9bf [HUDI-2379] Include the pending compaction file groups for flink (#3567)
streaming reader
2021-09-01 16:47:52 +08:00
Danny Chan
57668d02a0 [HUDI-2371] Improvement flink streaming reader (#3552)
- Support reading empty table
- Fix filtering by partition path
- Support reading from earliest commit
2021-08-28 20:16:54 +08:00
mikewu
9850e90e2e [HUDI-2229] Refact HoodieFlinkStreamer to reuse the pipeline of HoodieTableSink (#3495)
Co-authored-by: mikewu <xingbo.wxb@alibaba-inc.com>
2021-08-27 10:14:04 +08:00
yuzhaojing
ab3fbb8895 [HUDI-2342] Optimize Bootstrap operator (#3516)
Co-authored-by: 喻兆靖 <yuzhaojing@bilibili.com>
2021-08-21 20:03:03 +08:00
Danny Chan
c7c517f14c [HUDI-2340] Merge the data set for flink bounded source when changelog mode turns off (#3513) 2021-08-21 07:21:35 +08:00
Udit Mehrotra
c350d05dd3 Restore 0.8.0 config keys with deprecated annotation (#3506)
Co-authored-by: Sagar Sumit <sagarsumit09@gmail.com>
Co-authored-by: Vinoth Chandar <vinoth@apache.org>
2021-08-19 13:36:40 -07:00
Danny Chan
9762e4c08c [MINOR] Some cosmetic changes for Flink (#3503) 2021-08-19 23:21:20 +08:00
swuferhong
1fed44af84 [HUDI-2316] Support Flink batch upsert (#3494) 2021-08-19 17:15:26 +08:00
Danny Chan
66f951322a [HUDI-2191] Bump flink version to 1.13.1 (#3291) 2021-08-16 18:14:05 +08:00
Danny Chan
29332498af [HUDI-2298] The HoodieMergedLogRecordScanner should set up the operation of the chosen record (#3456) 2021-08-11 22:55:43 +08:00
swuferhong
21db6d7a84 [HUDI-1771] Propagate CDC format for hoodie (#3285) 2021-08-10 20:23:23 +08:00
Danny Chan
b7586a5632 [HUDI-2274] Allows INSERT duplicates for Flink MOR table (#3403) 2021-08-06 10:30:52 +08:00
yuzhaojing
b8b9d6db83 [HUDI-2087] Support Append only in Flink stream (#3390)
Co-authored-by: 喻兆靖 <yuzhaojing@bilibili.com>
2021-08-04 17:53:20 +08:00
Danny Chan
02331fc223 [HUDI-2258] Metadata table for flink (#3381) 2021-08-04 10:54:55 +08:00
swuferhong
f7f5d4cc6d [HUDI-2184] Support setting hive sync partition extractor class based on flink configuration (#3284) 2021-07-30 17:24:00 +08:00
Danny Chan
c4e45a0010 [HUDI-2254] Builtin sort operator for flink bulk insert (#3372) 2021-07-30 16:58:11 +08:00
swuferhong
8b19ec9ca0 [HUDI-2252] Default consumes from the latest instant for flink streaming reader (#3368) 2021-07-30 14:25:05 +08:00
Danny Chan
91c2213412 [HUDI-2245] BucketAssigner generates the fileId evenly to avoid data skew (#3362) 2021-07-28 19:26:37 +08:00
rmahindra123
8fef50e237 [HUDI-2044] Integrate consumers with rocksDB and compression within External Spillable Map (#3318) 2021-07-28 01:31:03 -04:00
Danny Chan
9d2a65a6a6 [HUDI-2209] Bulk insert for flink writer (#3334) 2021-07-27 10:58:23 +08:00
pengzhiwei
2c910ee3af [HUDI-2212] Missing PrimaryKey In Hoodie Properties For CTAS Table (#3332) 2021-07-23 15:21:57 +08:00
Danny Chan
c89bf1de20 [HUDI-2205] Rollback inflight compaction for flink writer (#3320) 2021-07-22 22:56:51 +08:00
Danny Chan
858e84b5b2 [HUDI-2198] Clean and reset the bootstrap events for coordinator when task failover (#3304) 2021-07-21 10:13:05 +08:00
yuzhaojing
634163a990 [HUDI-2145] Create new bucket when NewFileAssignState filled (#3258)
Co-authored-by: 喻兆靖 <yuzhaojing@bilibili.com>
2021-07-20 17:46:45 +08:00
喻兆靖
2099bf41db [HUDI-2193] Remove state in BootstrapFunction 2021-07-19 18:14:06 +08:00
yuzhao.cyz
50c2b76d72 Revert "[HUDI-2087] Support Append only in Flink stream (#3252)"
This reverts commit 783c9cb3
2021-07-16 21:36:27 +08:00
yuzhao.cyz
c8aaf00819 [HUDI-2185] Remove the default parallelism of index bootstrap and bucket assigner 2021-07-16 15:44:15 +08:00
vinoyang
52524b659d [HUDI-2165] Support Transformer for HoodieFlinkStreamer (#3270)
* [HUDI-2165] Support Transformer for HoodieFlinkStreamer
2021-07-14 23:01:52 +08:00
yuzhaojing
783c9cb369 [HUDI-2087] Support Append only in Flink stream (#3252)
Co-authored-by: 喻兆靖 <yuzhaojing@bilibili.com>
2021-07-10 14:49:35 +08:00
vinoyang
7c6eebf98c [MINOR] Fix some wrong assert reasons (#3248) 2021-07-10 14:35:40 +08:00
vinoth chandar
b4562e86e4 Revert "[HUDI-2087] Support Append only in Flink stream (#3174)" (#3251)
This reverts commit 371526789d.
2021-07-09 11:20:09 -07:00
yuzhaojing
371526789d [HUDI-2087] Support Append only in Flink stream (#3174)
Co-authored-by: 喻兆靖 <yuzhaojing@bilibili.com>
2021-07-09 16:06:32 +08:00
wangxianghu
f2621da32f [HUDI-2093] Fix empty avro schema path caused by duplicate parameters (#3177)
* [HUDI-2093] Fix empty avro schema path caused by duplicate parameters

* rename shcmea option key

* fix doc

* rename var name
2021-07-06 15:14:30 +08:00
Danny Chan
32bd8ce088 [HUDI-2132] Make coordinator events as POJO for efficient serialization (#3223) 2021-07-06 09:02:38 +08:00
Danny Chan
e6ee7bdb51 [HUDI-2129] StreamerUtil.medianInstantTime should return a valid date time string (#3221) 2021-07-05 20:56:24 +08:00
Danny Chan
7462fdefc3 [HUDI-2112] Support reading pure logs file group for flink batch reader after compaction (#3202) 2021-07-02 16:29:22 +08:00
pengzhiwei
b34d53fa9c [HUDI-2088] Missing Partition Fields And PreCombineField In Hoodie Properties For Table Written By Flink (#3171) 2021-07-01 17:25:18 +08:00
yuzhaojing
07e93de8b4 [HUDI-2052] Support load logFile in BootstrapFunction (#3134)
Co-authored-by: 喻兆靖 <yuzhaojing@bilibili.com>
2021-06-30 20:37:00 +08:00
Danny Chan
b8a8f572d6 [HUDI-2094] Supports hive style partitioning for flink writer (#3178) 2021-06-29 15:34:26 +08:00
yuzhaojing
37b7c65d8a [HUDI-2084] Resend the uncommitted write metadata when start up (#3168)
Co-authored-by: 喻兆靖 <yuzhaojing@bilibili.com>
2021-06-29 08:53:52 +08:00
Danny Chan
cdb9b48170 [HUDI-2040] Make flink writer as exactly-once by default (#3106) 2021-06-18 13:55:23 +08:00
Danny Chan
aa6342c3c9 [HUDI-2036] Move the compaction plan scheduling out of flink writer coordinator (#3101)
Since HUDI-1955 was fixed, we can move the scheduling out if the
coordinator to make the coordinator more lightweight.
2021-06-18 09:35:09 +08:00
yuzhaojing
f97dd25d41 [HUDI-2019] Set up the file system view storage config for singleton embedded server write config every time (#3102)
Co-authored-by: 喻兆靖 <yuzhaojing@bilibili.com>
2021-06-17 20:28:03 +08:00
Danny Chan
6763b45dd4 [HUDI-2030] Add metadata cache to WriteProfile to reduce IO (#3090)
Keeps same number of instant metadata cache and refresh the cache on new
commits.
2021-06-17 19:10:34 +08:00
Danny Chan
cb642ceb75 [HUDI-1999] Refresh the base file view cache for WriteProfile (#3067)
Refresh the view to discover new small files.
2021-06-15 08:18:38 -07:00