Danny Chan
|
c4e45a0010
|
[HUDI-2254] Builtin sort operator for flink bulk insert (#3372)
|
2021-07-30 16:58:11 +08:00 |
|
swuferhong
|
8b19ec9ca0
|
[HUDI-2252] Default consumes from the latest instant for flink streaming reader (#3368)
|
2021-07-30 14:25:05 +08:00 |
|
Danny Chan
|
91c2213412
|
[HUDI-2245] BucketAssigner generates the fileId evenly to avoid data skew (#3362)
|
2021-07-28 19:26:37 +08:00 |
|
rmahindra123
|
8fef50e237
|
[HUDI-2044] Integrate consumers with rocksDB and compression within External Spillable Map (#3318)
|
2021-07-28 01:31:03 -04:00 |
|
Danny Chan
|
9d2a65a6a6
|
[HUDI-2209] Bulk insert for flink writer (#3334)
|
2021-07-27 10:58:23 +08:00 |
|
pengzhiwei
|
2c910ee3af
|
[HUDI-2212] Missing PrimaryKey In Hoodie Properties For CTAS Table (#3332)
|
2021-07-23 15:21:57 +08:00 |
|
Danny Chan
|
c89bf1de20
|
[HUDI-2205] Rollback inflight compaction for flink writer (#3320)
|
2021-07-22 22:56:51 +08:00 |
|
Danny Chan
|
858e84b5b2
|
[HUDI-2198] Clean and reset the bootstrap events for coordinator when task failover (#3304)
|
2021-07-21 10:13:05 +08:00 |
|
yuzhaojing
|
634163a990
|
[HUDI-2145] Create new bucket when NewFileAssignState filled (#3258)
Co-authored-by: 喻兆靖 <yuzhaojing@bilibili.com>
|
2021-07-20 17:46:45 +08:00 |
|
喻兆靖
|
2099bf41db
|
[HUDI-2193] Remove state in BootstrapFunction
|
2021-07-19 18:14:06 +08:00 |
|
yuzhao.cyz
|
50c2b76d72
|
Revert "[HUDI-2087] Support Append only in Flink stream (#3252)"
This reverts commit 783c9cb3
|
2021-07-16 21:36:27 +08:00 |
|
yuzhao.cyz
|
c8aaf00819
|
[HUDI-2185] Remove the default parallelism of index bootstrap and bucket assigner
|
2021-07-16 15:44:15 +08:00 |
|
vinoyang
|
52524b659d
|
[HUDI-2165] Support Transformer for HoodieFlinkStreamer (#3270)
|
2021-07-14 23:01:52 +08:00 |
|
yuzhaojing
|
783c9cb369
|
[HUDI-2087] Support Append only in Flink stream (#3252)
Co-authored-by: 喻兆靖 <yuzhaojing@bilibili.com>
|
2021-07-10 14:49:35 +08:00 |
|
vinoyang
|
7c6eebf98c
|
[MINOR] Fix some wrong assert reasons (#3248)
|
2021-07-10 14:35:40 +08:00 |
|
vinoth chandar
|
b4562e86e4
|
Revert "[HUDI-2087] Support Append only in Flink stream (#3174)" (#3251)
This reverts commit 371526789d.
|
2021-07-09 11:20:09 -07:00 |
|
yuzhaojing
|
371526789d
|
[HUDI-2087] Support Append only in Flink stream (#3174)
Co-authored-by: 喻兆靖 <yuzhaojing@bilibili.com>
|
2021-07-09 16:06:32 +08:00 |
|
wangxianghu
|
f2621da32f
|
[HUDI-2093] Fix empty avro schema path caused by duplicate parameters (#3177)
* [HUDI-2093] Fix empty avro schema path caused by duplicate parameters
* rename schema option key
* fix doc
* rename var name
|
2021-07-06 15:14:30 +08:00 |
|
Danny Chan
|
32bd8ce088
|
[HUDI-2132] Make coordinator events as POJO for efficient serialization (#3223)
|
2021-07-06 09:02:38 +08:00 |
|
Danny Chan
|
e6ee7bdb51
|
[HUDI-2129] StreamerUtil.medianInstantTime should return a valid date time string (#3221)
|
2021-07-05 20:56:24 +08:00 |
|
Danny Chan
|
7462fdefc3
|
[HUDI-2112] Support reading pure logs file group for flink batch reader after compaction (#3202)
|
2021-07-02 16:29:22 +08:00 |
|
pengzhiwei
|
b34d53fa9c
|
[HUDI-2088] Missing Partition Fields And PreCombineField In Hoodie Properties For Table Written By Flink (#3171)
|
2021-07-01 17:25:18 +08:00 |
|
yuzhaojing
|
07e93de8b4
|
[HUDI-2052] Support load logFile in BootstrapFunction (#3134)
Co-authored-by: 喻兆靖 <yuzhaojing@bilibili.com>
|
2021-06-30 20:37:00 +08:00 |
|
Danny Chan
|
b8a8f572d6
|
[HUDI-2094] Supports hive style partitioning for flink writer (#3178)
|
2021-06-29 15:34:26 +08:00 |
|
yuzhaojing
|
37b7c65d8a
|
[HUDI-2084] Resend the uncommitted write metadata when start up (#3168)
Co-authored-by: 喻兆靖 <yuzhaojing@bilibili.com>
|
2021-06-29 08:53:52 +08:00 |
|
Danny Chan
|
cdb9b48170
|
[HUDI-2040] Make flink writer as exactly-once by default (#3106)
|
2021-06-18 13:55:23 +08:00 |
|
Danny Chan
|
aa6342c3c9
|
[HUDI-2036] Move the compaction plan scheduling out of flink writer coordinator (#3101)
Since HUDI-1955 was fixed, we can move the scheduling out of the
coordinator to make the coordinator more lightweight.
|
2021-06-18 09:35:09 +08:00 |
|
yuzhaojing
|
f97dd25d41
|
[HUDI-2019] Set up the file system view storage config for singleton embedded server write config every time (#3102)
Co-authored-by: 喻兆靖 <yuzhaojing@bilibili.com>
|
2021-06-17 20:28:03 +08:00 |
|
Danny Chan
|
6763b45dd4
|
[HUDI-2030] Add metadata cache to WriteProfile to reduce IO (#3090)
Keeps the same number of instant metadata entries cached and refreshes
the cache on new commits.
|
2021-06-17 19:10:34 +08:00 |
|
Danny Chan
|
cb642ceb75
|
[HUDI-1999] Refresh the base file view cache for WriteProfile (#3067)
Refresh the view to discover new small files.
|
2021-06-15 08:18:38 -07:00 |
|
swuferhong
|
0c4f2fdc15
|
[HUDI-1984] Support independent flink hudi compaction function (#3046)
|
2021-06-13 15:04:46 +08:00 |
|
yuzhaojing
|
728089a888
|
delete duplicate bootstrap function (#3052)
Co-authored-by: 喻兆靖 <yuzhaojing@bilibili.com>
|
2021-06-09 19:29:57 +08:00 |
|
yuzhaojing
|
cf83f10f5b
|
add BootstrapFunction to support index bootstrap (#3024)
Co-authored-by: 喻兆靖 <yuzhaojing@bilibili.com>
|
2021-06-08 13:55:25 +08:00 |
|
Danny Chan
|
08464a6a5b
|
[HUDI-1931] BucketAssignFunction use ValueState instead of MapState (#3026)
Co-authored-by: 854194341@qq.com <loukey_7821>
|
2021-06-06 10:40:15 +08:00 |
|
Danny Chan
|
a658328001
|
[HUDI-1961] Add a debezium json integration test case for flink (#3030)
|
2021-06-04 15:15:32 +08:00 |
|
taylorliao
|
86007e9a13
|
[HUDI-1953] Fix NPE due to not set the output type of the operator (#3023)
Co-authored-by: enter58xuan <enter58xuan@zto.com>
|
2021-06-03 14:20:57 +08:00 |
|
Danny Chan
|
bf1cfb5635
|
[HUDI-1949] Refactor BucketAssigner to make it more efficient (#3017)
Add a process-singleton class WriteProfile; reconstructing the record and
small-file profiles is more efficient when they are reused within the same
checkpoint id.
|
2021-06-02 09:12:35 +08:00 |
|
yuzhaojing
|
bc18c39835
|
[FLINK-1923] Exactly-once write for flink writer (#3002)
Co-authored-by: 喻兆靖 <yuzhaojing@bilibili.com>
|
2021-05-28 14:58:21 +08:00 |
|
Town
|
aba1eadbfc
|
[HUDI-1919] Type mismatch when streaming read copy_on_write table using flink (#2986)
* [HUDI-1919] Type mismatch when streaming read copy_on_write table using flink #2976
* Update ParquetSplitReaderUtil.java
|
2021-05-25 11:36:43 +08:00 |
|
Danny Chan
|
9b01d2f864
|
[HUDI-1915] Fix the file id for write data buffer before flushing (#2966)
|
2021-05-20 10:20:08 +08:00 |
|
Danny Chan
|
7d2971d4e2
|
[HUDI-1911] Reuse the partition path and file group id for flink write data buffer (#2961)
Reuse to reduce memory footprint.
|
2021-05-18 17:47:22 +08:00 |
|
Danny Chan
|
46a2399a45
|
[HUDI-1902] Global index for flink writer (#2958)
Supports deduplication for record keys with different partition path.
|
2021-05-18 13:55:38 +08:00 |
|
Danny Chan
|
ad77cf42ba
|
[HUDI-1900] Always close the file handle for a flink mini-batch write (#2943)
Close the file handle eagerly to avoid corrupted files as much as
possible.
|
2021-05-14 10:25:18 +08:00 |
|
Danny Chan
|
b98c9ab439
|
[HUDI-1895] Close the file handles gracefully for flink write function to avoid corrupted files (#2938)
|
2021-05-12 18:44:10 +08:00 |
|
TeRS-K
|
be9db2c4f5
|
[HUDI-1055] Remove hardcoded parquet in tests (#2740)
* Remove hardcoded parquet in tests
* Use DataFileUtils.getInstance
* Renaming DataFileUtils to BaseFileUtils
Co-authored-by: Vinoth Chandar <vinoth@apache.org>
|
2021-05-11 10:01:45 -07:00 |
|
hiscat
|
7a5af806cf
|
[HUDI-1818] Validate required fields for Flink HoodieTable (#2930)
|
2021-05-11 11:11:19 +08:00 |
|
Danny Chan
|
c1b331bcff
|
[HUDI-1886] Avoid to generates corrupted files for flink sink (#2929)
|
2021-05-10 10:43:03 +08:00 |
|
Danny Chan
|
bfbf993cbe
|
[HUDI-1878] Add max memory option for flink writer task (#2920)
Also removes the rate limiter because it has similar functionality, and
modifies the create and merge handles to clean the retry files automatically.
|
2021-05-08 14:27:56 +08:00 |
|
Danny Chan
|
528f4ca988
|
[HUDI-1880] Support streaming read with compaction and cleaning (#2921)
|
2021-05-07 20:04:35 +08:00 |
|
hiscat
|
0a5863939b
|
[HUDI-1821] Remove legacy code for Flink writer (#2868)
|
2021-05-07 10:58:49 +08:00 |
|