Raymond Xu
46de9e0f3f
[HUDI-1810] Fix azure setting for integ tests ( #2889 )
2021-04-30 11:17:36 -07:00
Raymond Xu
faf3785a2d
[HUDI-1811] Fix TestHoodieRealtimeRecordReader ( #2873 )
...
Pass basePath with scheme 'file://' to HoodieRealtimeFileSplit
2021-04-30 11:16:55 -07:00
xiarixiaoyao
929eca43fe
[HUDI-1817] Fix getting incorrect partition path while using incr query by spark-sql ( #2858 )
2021-04-30 14:57:52 +08:00
Danny Chan
6848a683bd
[HUDI-1867] Streaming read for Flink COW table ( #2895 )
...
Supports streaming read for Copy On Write table.
2021-04-29 20:44:45 +08:00
Danny Chan
6e9c5dd765
[HUDI-1863] Add rate limiter to Flink writer to avoid OOM for bootstrap ( #2891 )
2021-04-29 20:32:10 +08:00
pengzhiwei
c9bcb5e33f
[HUDI-1845] Exception Throws When Sync Non-Partitioned Table To Hive With MultiPartKeysValueExtractor ( #2876 )
2021-04-28 19:11:46 -07:00
dijie
3ca9030256
[HUDI-1858] Fix cannot create table due to jar conflict ( #2886 )
...
Co-authored-by: 狄杰 <shenjinxin@accesscorporate.com.cn>
2021-04-28 14:10:04 +08:00
satishkotha
386767693d
[HUDI-1833] rollback pending clustering even if there is greater commit ( #2863 )
...
* [HUDI-1833] rollback pending clustering even if there are greater commits
2021-04-27 14:21:42 -07:00
Roc Marshal
e4fd195d9f
[MINOR] Refactor method up to parent-class ( #2822 )
2021-04-27 21:32:32 +08:00
satishkotha
2999586509
[HUDI-1690] use jsc union instead of rdd union ( #2872 )
2021-04-26 23:35:01 -07:00
hiscat
63fa2b6186
[HUDI-1836] Logging consuming instant to StreamReadOperator#processSplits ( #2867 )
2021-04-27 14:00:59 +08:00
Danny Chan
5be3997f70
[HUDI-1841] Tweak the min max commits to keep when setting up cleaning retain commits for Flink ( #2875 )
2021-04-27 10:58:06 +08:00
Roc Marshal
9bbb458e88
[MINOR] Remove redundant method-calling. ( #2881 )
2021-04-27 09:34:09 +08:00
Nick Young
f4e3b94971
[HUDI-1742] Improve table level config priority for HoodieMultiTableDeltaStreamer ( #2744 )
2021-04-26 22:05:06 +08:00
Danny Chan
d047e91d86
[HUDI-1837] Add optional instant range to log record scanner for log ( #2870 )
2021-04-26 16:53:18 +08:00
Sivabalan Narayanan
3e4fa170cf
[HUDI-1835] Fixing kafka native config param for auto offset reset ( #2864 )
2021-04-25 12:16:09 -04:00
Danny Chan
1b27259b53
[HUDI-1844] Add option to flush when total buckets memory exceeds the threshold ( #2877 )
...
Current code supports flushing based on per-bucket memory usage, but the
buckets together may still take too much memory when bootstrapping from
history data. When the threshold is hit, flush out the half of the buckets
with the larger buffer sizes.
2021-04-25 23:06:53 +08:00
Danny Chan
a5789c4067
[HUDI-1829] Use while loop instead of recursive call in MergeOnReadInputFormat to avoid StackOverflow ( #2862 )
...
Recursive call is risky for StackOverflow when there are too many.
2021-04-23 09:59:36 +08:00
Chanh Le
a1e636dc6b
[HUDI-1551] Add support for BigDecimal and Integer when partitioning based on time. ( #2851 )
...
Co-authored-by: trungchanh.le <trungchanh.le@bybit.com>
2021-04-22 21:56:20 +08:00
jsbali
4a3431866d
[HUDI-1746] Added support for replace commits in commit showpartitions, commit show_write_stats, commit showfiles ( #2678 )
...
* Added support for replace commits in commit showpartitions, commit show_write_stats, commit showfiles
* Adding CR changes
* [HUDI-1746] Code review changes
2021-04-21 10:31:35 -07:00
jsbali
b31c520c66
[HUDI-1714] Added tests to TestHoodieTimelineArchiveLog for the archival of compl… ( #2677 )
...
* Added tests to TestHoodieTimelineArchiveLog for the archival of completed clean and rollback actions.
* Adding code review changes
* [HUDI-1714] Minor Fixes
2021-04-21 10:27:43 -07:00
vinoyang
c24d90d25a
[MINOR] Expose the detailed exception object ( #2861 )
2021-04-21 22:41:42 +08:00
hiscat
cc81ddde01
[HUDI-1812] Add explicit index state TTL option for Flink writer ( #2853 )
2021-04-21 20:13:30 +08:00
Danny Chan
ac3589f006
[HUDI-1814] Non partitioned table for Flink writer ( #2859 )
2021-04-21 20:07:27 +08:00
pengzhiwei
aacb8be521
[HUDI-1415] Read Hoodie Table As Spark DataSource Table ( #2283 )
2021-04-20 14:21:38 -07:00
Jintao Guan
3253079507
[HUDI-1764] Add Hudi-CLI support for clustering ( #2773 )
...
* tmp base
* update
* update unit test
* update
* update
* update CLI parameters
* linting
* update doSchedule in HoodieClusteringJob
* update
* update diff according to comments
2021-04-20 09:46:42 -07:00
Danny Chan
d6d52c6063
[HUDI-1809] Flink merge on read input split uses wrong base file path for default merge type ( #2846 )
2021-04-20 21:27:09 +08:00
Sebastian Bernauer
9a288ccbeb
[MINOR] Added metric reporter Prometheus to HoodieBackedTableMetadataWriter ( #2842 )
2021-04-19 16:04:59 -07:00
li36909
6b4b878d08
[HUDI-1744] rollback fails on mor table when the partition path hasn't any files ( #2749 )
...
Co-authored-by: lrz <lrz@lrzdeMacBook-Pro.local>
2021-04-19 15:44:11 -07:00
Thinking Chen
d21753d903
[HUDI-1802] Timeline Server Bundle need to include com.esotericsoftware package ( #2835 )
2021-04-19 09:27:58 -07:00
Aditya Tiwari
ec2334ceac
[HUDI-1716]: Resolving default values for schema from dataframe ( #2765 )
...
- Adding default values and setting null as first entry in UNION data types in avro schema.
Co-authored-by: Aditya Tiwari <aditya.tiwari@flipkart.com>
2021-04-19 10:05:20 -04:00
Danny Chan
dab5114f16
[HUDI-1804] Continue to write when Flink write task restart because of container killing ( #2843 )
...
The `FlinkMergeHandle` creates a marker file under the metadata path
each time it initializes. When a write task restarts after being killed,
it tries to create the already existing file and reports an error.
To solve this problem, skip the creation and use the original data file
as base file to merge.
2021-04-19 19:43:41 +08:00
Roc Marshal
f7b6b68063
[MINOR][hudi-sync] Fix typos ( #2844 )
2021-04-19 16:27:13 +08:00
satishkotha
4e050cc2ba
[MINOR] Add jackson module to presto bundle ( #2816 )
2021-04-17 13:26:07 -07:00
Xu Guang Lv
1d53d6e6c2
[HUDI-1803] Support BAIDU AFS storage format in hudi ( #2836 )
2021-04-16 16:43:14 +08:00
hj2016
62b8a341dd
[HUDI-1792] flink-client query error when processing files larger than 128mb ( #2814 )
...
Co-authored-by: huangjing <huangjing@clinbrain.com>
2021-04-16 13:59:19 +08:00
Danny Chan
b6d949b48a
[HUDI-1801] FlinkMergeHandle rolling over may miss to rename the latest file handle ( #2831 )
...
The FlinkMergeHandle may rename the (N-1)th file handle instead of the
latest one, thus causing data duplication.
2021-04-16 11:40:53 +08:00
MINCWANG
191470d1fc
[HUDI-1797] Remove the com.google.guava jar from hudi-flink-bundle to avoid conflicts. ( #2828 )
...
Co-authored-by: wangminchao <wangminchao@asinking.com>
2021-04-15 15:16:33 +08:00
hiscat
6d1aec604f
[HUDI-1798] Flink streaming reader should always monitor the delta commits files ( #2825 )
...
The streaming reader should only monitor the delta log files. If there are parquet commits but we recognize them as logs, the reader reports a FileNotFound exception.
2021-04-15 13:50:17 +08:00
Roc Marshal
62bb9e10d9
[Hotfix][utilities] Optimized codes ( #2821 )
2021-04-15 09:40:14 +08:00
Sivabalan Narayanan
8d29863c86
[HUDI-1615] Fixing usage of NULL schema for delete operation in HoodieSparkSqlWriter ( #2777 )
2021-04-14 15:35:39 +08:00
Danny Chan
ab4a7b0b4a
[HUDI-1788] Insert overwrite (table) for Flink writer ( #2808 )
...
Supports `INSERT OVERWRITE` and `INSERT OVERWRITE TABLE` for Flink
writer.
2021-04-14 10:23:37 +08:00
xiarixiaoyao
65844a8d29
[HUDI-1720] Fix RealtimeCompactedRecordReader StackOverflowError ( #2721 )
2021-04-13 18:23:26 +08:00
hiscat
e16d31dce2
[HUDI-1787] Remove the rocksdb jar from hudi-flink-bundle ( #2807 )
...
Remove the RocksDB jar from hudi-flink-bundle to avoid conflicts.
2021-04-13 10:31:16 +08:00
Danny Chan
1ff99ca7d7
[HUDI-1786] Add option for merge max memory ( #2805 )
2021-04-12 17:03:58 +08:00
wangxianghu
040756d8c0
[HUDI-1785] Move OperationConverter to hudi-client-common for code reuse ( #2798 )
2021-04-12 16:22:33 +08:00
hj2016
1da16dfd2e
[HUDI-1784] Added print detailed stack log when hbase connection error ( #2799 )
2021-04-12 13:46:06 +08:00
wangxianghu
f3777f44fe
[MINOR] Remove unused imports and some other checkstyle issues ( #2800 )
2021-04-11 21:42:34 +08:00
Roc Marshal
b554835053
[MINOR] fix typo. ( #2804 )
2021-04-11 10:31:07 +08:00
xiarixiaoyao
8d4a7fe33e
[HUDI-1783] Support Huawei Cloud Object Storage ( #2796 )
2021-04-10 13:02:11 +08:00