267 Commits

Author SHA1 Message Date
Danny Chan
e8473b9a2b [HUDI-2951] Disable remote view storage config for flink (#4237) 2021-12-07 18:04:15 +08:00
Ron
a8fb69656f [HUDI-2877] Support flink catalog to help user use flink table conveniently (#4153)
* [HUDI-2877] Support flink catalog to help user use flink table conveniently

* Fix comment

* fix comment2
2021-12-05 10:14:29 +08:00
Danny Chan
0699521f83 [HUDI-2924] Refresh the fs view on successful checkpoints for write profile (#4199) 2021-12-03 16:12:59 +08:00
Danny Chan
f74b3d12aa [minor] Refactor write profile to always generate fs view (#4198) 2021-12-03 11:38:29 +08:00
Danny Chan
934fe54cc5 [HUDI-2914] Fix remote timeline server config for flink (#4191) 2021-12-03 08:59:10 +08:00
yuzhao.cyz
a1d0ff4209 Moving to 0.11.0-SNAPSHOT on master branch. 2021-11-27 17:22:10 +08:00
Danny Chan
e9efbdb63c [HUDI-2863] Rename option 'hoodie.parquet.page.size' to 'write.parquet.page.size' (#4128) 2021-11-26 16:40:53 +08:00
Alexey Kudinkin
6f5d8d04cd [HUDI-2840] Fixed DeltaStreamer to properly respect configuration passed through properties file (#4090)
* Rebased `DFSPropertiesConfiguration` to access Hadoop config in lieu of FS to avoid confusion

* Fixed `readConfig` to take Hadoop's `Configuration` instead of FS;
Fixing usages

* Added test for local FS access

* Rebase to use `FSUtils.getFs`

* Combine properties provided as a file along w/ overrides provided from the CLI

* Added helper utilities to `HoodieClusteringConfig`;
Make sure corresponding config methods fall back to defaults;

* Fixed DeltaStreamer usage to respect properly combined configuration;
Abstracted `HoodieClusteringConfig.from` convenience utility to init Clustering config from `Properties`

* Tidying up

* `lint`

* Reverting changes to `HoodieWriteConfig`

* Tidying up

* Fixed incorrect merge of the props

* Converted `HoodieConfig` to wrap around `Properties` into `TypedProperties`

* Fixed compilation

* Fixed compilation
2021-11-25 14:48:22 -08:00
Danny Chan
a2eb2b0b0a [HUDI-2480] FileSlice after pending compaction-requested instant-time… (#3703)
* [HUDI-2480] FileSlice after pending compaction-requested instant-time is ignored by MOR snapshot reader

* include file slice after a pending compaction for spark reader

Co-authored-by: garyli1019 <yanjia.gary.li@gmail.com>
2021-11-25 22:30:09 +08:00
Danny Chan
0bb506fa00 [HUDI-2847] Flink metadata table supports virtual keys (#4096) 2021-11-24 17:34:42 +08:00
Danny Chan
323be33f18 Revert "[HUDI-2799] Fix the classloader of flink write task (#4042)" (#4069)
This reverts commit 8281cbf762.
2021-11-24 12:01:18 +08:00
Sivabalan Narayanan
fc9ca6a07a [HUDI-2559] Converting commit timestamp format to millisecs (#4024)
- Adds support for generating commit timestamps with millisecs granularity. 
- Older commit timestamps (in secs granularity) will be suffixed with 999 and parsed with millisecs format.
2021-11-22 11:44:38 -05:00
Danny Chan
8281cbf762 [HUDI-2799] Fix the classloader of flink write task (#4042) 2021-11-22 11:05:05 +08:00
Danny Chan
520538b15d [HUDI-2392] Make flink parquet reader compatible with decimal BINARY encoding (#4057) 2021-11-21 13:27:18 +08:00
Danny Chan
0411f73c7d [HUDI-2804] Add option to skip compaction instants for streaming read (#4051) 2021-11-21 12:38:56 +08:00
Danny Chan
bf008762df [HUDI-2798] Fix flink query operation fields (#4041) 2021-11-19 23:39:37 +08:00
Danny Chan
7a00f867ae [HUDI-2791] Allows duplicate files for metadata commit (#4033) 2021-11-19 14:30:17 +08:00
wenningd
24def0b30d [HUDI-2362] Add external config file support (#3416)
Co-authored-by: Wenning Ding <wenningd@amazon.com>
2021-11-18 01:59:26 -08:00
Danny Chan
8772cec4bd [HUDI-2790] Fix the changelog mode of HoodieTableSource (#4029) 2021-11-18 16:40:48 +08:00
Danny Chan
71a2ae0fd6 [HUDI-2789] Flink batch upsert for non partitioned table does not work (#4028) 2021-11-18 13:59:03 +08:00
0x574C
aec5d11da2 Check --source-avro-schema-path parameter (#3987)
Co-authored-by: 0x3E6 <dragon1996>
2021-11-17 14:45:43 +08:00
Danny Chan
6f5e661010 [HUDI-2769] Fix StreamerUtil#medianInstantTime for very near instant time (#4005) 2021-11-16 13:46:34 +08:00
Danny Chan
c2f9094b49 [HUDI-2756] Fix flink parquet writer decimal type conversion (#3988) 2021-11-14 08:51:54 +08:00
Danny Chan
bc511edc85 [HUDI-2746] Do not bootstrap for flink insert overwrite (#3980) 2021-11-12 12:17:58 +08:00
yuzhaojing
6b93ccca9b [HUDI-2738] Remove the bucketAssignFunction useless context (#3972)
Co-authored-by: yuzhaojing <yuzhaojing@bytedance.com>
2021-11-11 21:03:01 +08:00
yuzhaojing
90f9b4562a [HUDI-2685] Support scheduling online compaction plan when there are no commit data (#3928)
Co-authored-by: yuzhaojing <yuzhaojing@bytedance.com>
2021-11-11 10:13:21 +08:00
Danny Chan
e057a10499 [HUDI-2715] The BitCaskDiskMap iterator may cause memory leak (#3951) 2021-11-09 15:40:00 +08:00
yuzhaojing
7aaf47e716 [HUDI-2698] Remove the table source options validation (#3940)
Co-authored-by: yuzhaojing <yuzhaojing@bytedance.com>
2021-11-08 16:56:03 +08:00
Danny Chan
c7bf2c7687 [HUDI-2709] Add more options when initializing table (#3939) 2021-11-08 15:08:49 +08:00
Danny Chan
9a8963d05e [HUDI-2702] Set up keygen class explicit for write config for flink table upgrade (#3931) 2021-11-06 12:23:15 +08:00
Prashant Wason
b7ee341e14 [HUDI-1794] Moved static COMMIT_FORMATTER to thread local variable as SimpleDateFormat is not thread safe. (#2819) 2021-11-05 09:31:42 -04:00
Danny Chan
3af6568d31 [HUDI-2696] Remove the aborted checkpoint notification from coordinator (#3926) 2021-11-05 16:37:23 +08:00
yuzhaojing
f67da0c7d0 [HUDI-2686] Process record after all bootstrap operator ready (#3925)
Co-authored-by: yuzhaojing <yuzhaojing@bytedance.com>
2021-11-05 14:36:22 +08:00
yuzhaojing
2c1e259329 [HUDI-2651] Sync all the missing sql options for HoodieFlinkStreamer (#3903)
Co-authored-by: yuzhaojing <yuzhaojing@bytedance.com>
2021-11-05 12:16:21 +08:00
Danny Chan
33436aa359 Revert "[HUDI-2677] Add DFS based message queue for flink writer (#3915)" (#3923)
This reverts commit dbf8c44bdb.
2021-11-04 20:48:57 +08:00
Danny Chan
dbf8c44bdb [HUDI-2677] Add DFS based message queue for flink writer (#3915) 2021-11-04 18:09:00 +08:00
Danny Chan
689020f303 [HUDI-2684] Use DefaultHoodieRecordPayload when precombine field is specified specifically (#3922) 2021-11-04 16:23:36 +08:00
Danny Chan
7fc7e9b2bc [HUDI-2660] Delete the view storage properties first before creation (#3899) 2021-11-03 14:30:20 +08:00
Danny Chan
87c6f9cd07 [HUDI-2654] Add compaction failed event(part2) (#3896) 2021-10-31 17:51:11 +08:00
Danny Chan
92a3c458bd [HUDI-2654] Schedules the compaction from earliest for flink (#3891) 2021-10-30 08:37:30 +08:00
Danny Chan
e5b6b8602c [HUDI-2633] Make precombine field optional for flink (#3874) 2021-10-28 13:52:06 +08:00
Danny Chan
909c3ba45e [HUDI-2632] Schema evolution for flink parquet reader (#3872) 2021-10-27 20:00:24 +08:00
mincwang
91845e241d [MINOR] Show source table operator details on the flink web when reading hudi table (#3842) 2021-10-24 23:18:01 +08:00
Y Ethan Guo
5ed35bff83 [HUDI-2501] Add HoodieData abstraction and refactor compaction actions in hudi-client module (#3741) 2021-10-22 15:58:51 -04:00
Danny Chan
aa3c4ecda5 [HUDI-2583] Refactor TestWriteCopyOnWrite test cases (#3832) 2021-10-21 12:36:41 +08:00
Danny Chan
e355ab52db [HUDI-2578] Support merging small files for flink insert operation (#3822) 2021-10-20 21:10:07 +08:00
Danny Chan
3a78be9203 [HUDI-2572] Strengthen flink compaction rollback strategy (#3819)
* make the events of commit task distinct by file id
* fix the existence check for inflight state file
* make the compaction task fail-safe
2021-10-19 10:47:38 +08:00
Danny Chan
3025f4d796 [HUDI-2568] Simplify the view storage config properties (#3815) 2021-10-18 14:42:33 +08:00
Danny Chan
2eda3de7f9 [HUDI-2562] Embedded timeline server on JobManager (#3812) 2021-10-18 10:45:39 +08:00
Danny Chan
2c370cbae0 [HUDI-2556] Tweak some default config options for flink (#3800)
* rename write.insert.drop.duplicates to write.precombine and set it as true for COW table
* set index.global.enabled default as true
* set compaction.target_io default as 500GB
2021-10-14 19:42:56 +08:00