li36909
2c5a661a64
[HUDI-1759] Save one connection retry to hive metastore when hiveSyncTool run with useJdbc=false ( #2759 )
...
* [HUDI-1759] Save one connection retry to hive metastore when hiveSyncTool run with useJdbc=false
* Fix review comment
2021-05-07 15:30:26 -07:00
Danny Chan
528f4ca988
[HUDI-1880] Support streaming read with compaction and cleaning ( #2921 )
2021-05-07 20:04:35 +08:00
hiscat
0a5863939b
[HUDI-1821] Remove legacy code for Flink writer ( #2868 )
2021-05-07 10:58:49 +08:00
Sivabalan Narayanan
0284cdecce
[HUDI-1876] wiring in Hadoop Conf with AvroSchemaConverters instantiation ( #2914 )
2021-05-05 21:31:44 -07:00
xiarixiaoyao
1db904a12e
[HUDI-1718] When query incr view of mor table which has Multi level partitions, the query failed ( #2716 )
2021-05-05 00:34:20 -04:00
dijie
c5220b96e9
[HUDI-1781] Fix Flink streaming reader throws ClassCastException ( #2900 )
2021-05-01 19:13:15 +08:00
Nick Young
ea14d687da
[HUDI-1852] Add SCHEMA_REGISTRY_SOURCE_URL_SUFFIX and SCHEMA_REGISTRY_TARGET_URL_SUFFIX property ( #2884 )
2021-05-01 10:02:00 +08:00
Raymond Xu
3418a92de8
[HUDI-1620] Fix Metrics UT ( #2894 )
...
Make sure to shut down Metrics between unit test cases to ensure isolation
2021-04-30 11:20:41 -07:00
Raymond Xu
46de9e0f3f
[HUDI-1810] Fix azure setting for integ tests ( #2889 )
2021-04-30 11:17:36 -07:00
Raymond Xu
faf3785a2d
[HUDI-1811] Fix TestHoodieRealtimeRecordReader ( #2873 )
...
Pass basePath with scheme 'file://' to HoodieRealtimeFileSplit
2021-04-30 11:16:55 -07:00
xiarixiaoyao
929eca43fe
[HUDI-1817] Fix getting incorrect partition path while using incr query by spark-sql ( #2858 )
2021-04-30 14:57:52 +08:00
Danny Chan
6848a683bd
[HUDI-1867] Streaming read for Flink COW table ( #2895 )
...
Supports streaming read for Copy On Write table.
2021-04-29 20:44:45 +08:00
Danny Chan
6e9c5dd765
[HUDI-1863] Add rate limiter to Flink writer to avoid OOM for bootstrap ( #2891 )
2021-04-29 20:32:10 +08:00
pengzhiwei
c9bcb5e33f
[HUDI-1845] Exception Throws When Sync Non-Partitioned Table To Hive With MultiPartKeysValueExtractor ( #2876 )
2021-04-28 19:11:46 -07:00
dijie
3ca9030256
[HUDI-1858] Fix cannot create table due to jar conflict ( #2886 )
...
Co-authored-by: 狄杰 <shenjinxin@accesscorporate.com.cn>
2021-04-28 14:10:04 +08:00
satishkotha
386767693d
[HUDI-1833] rollback pending clustering even if there is greater commit ( #2863 )
...
* [HUDI-1833] rollback pending clustering even if there are greater commits
2021-04-27 14:21:42 -07:00
Roc Marshal
e4fd195d9f
[MINOR] Refactor method up to parent-class ( #2822 )
2021-04-27 21:32:32 +08:00
satishkotha
2999586509
[HUDI-1690] use jsc union instead of rdd union ( #2872 )
2021-04-26 23:35:01 -07:00
hiscat
63fa2b6186
[HUDI-1836] Logging consuming instant to StreamReadOperator#processSplits ( #2867 )
2021-04-27 14:00:59 +08:00
Danny Chan
5be3997f70
[HUDI-1841] Tweak the min max commits to keep when setting up cleaning retain commits for Flink ( #2875 )
2021-04-27 10:58:06 +08:00
Roc Marshal
9bbb458e88
[MINOR] Remove redundant method-calling. ( #2881 )
2021-04-27 09:34:09 +08:00
Nick Young
f4e3b94971
[HUDI-1742] Improve table level config priority for HoodieMultiTableDeltaStreamer ( #2744 )
2021-04-26 22:05:06 +08:00
Danny Chan
d047e91d86
[HUDI-1837] Add optional instant range to log record scanner for log ( #2870 )
2021-04-26 16:53:18 +08:00
Sivabalan Narayanan
3e4fa170cf
[HUDI-1835] Fixing kafka native config param for auto offset reset ( #2864 )
2021-04-25 12:16:09 -04:00
Danny Chan
1b27259b53
[HUDI-1844] Add option to flush when total buckets memory exceeds the threshold ( #2877 )
...
The current code flushes based on per-bucket memory usage, but the
buckets together may still take too much memory when bootstrapping from
history data. When the total threshold is hit, flush out the half of the
buckets with the larger buffer sizes.
2021-04-25 23:06:53 +08:00
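The flush policy described in HUDI-1844 can be illustrated with a minimal sketch. This is not the Flink writer's actual code: the function name `buckets_to_flush`, the dict-of-sizes representation, the strict `>` comparison against the threshold, and rounding "half" up are all assumptions made for illustration.

```python
def buckets_to_flush(bucket_sizes, total_threshold):
    """Pick the buckets to flush once the total buffered size is too large.

    bucket_sizes: dict mapping a bucket ID to its buffered size in bytes
                  (hypothetical representation).
    total_threshold: limit on the sum of all bucket buffers (hypothetical name).
    Returns a list of bucket IDs, largest buffer first, or [] if under the limit.
    """
    total = sum(bucket_sizes.values())
    if total <= total_threshold:
        return []
    # Order buckets by buffered size, largest first; break ties by ID so the
    # result is deterministic.
    ordered = sorted(bucket_sizes, key=lambda b: (-bucket_sizes[b], b))
    # Assumption: "half" is rounded up, so at least one bucket is flushed.
    half = (len(ordered) + 1) // 2
    return ordered[:half]


if __name__ == "__main__":
    sizes = {"a": 10, "b": 40, "c": 30, "d": 20}
    print(buckets_to_flush(sizes, total_threshold=50))
```

Flushing the larger half first frees the most memory per flush while leaving small buffers to keep accumulating records.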
Danny Chan
a5789c4067
[HUDI-1829] Use while loop instead of recursive call in MergeOnReadInputFormat to avoid StackOverflow ( #2862 )
...
Recursive calls risk a StackOverflow when there are too many of them.
2021-04-23 09:59:36 +08:00
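The idea behind HUDI-1829 can be shown with a small sketch. It is not the code from `MergeOnReadInputFormat`: both function names are hypothetical, and the assumption here is that the reader skips over entries (represented as `None`) until it finds one to emit.

```python
def next_valid_recursive(records, pos=0):
    """Recursive form: one stack frame per skipped entry."""
    if pos >= len(records):
        return None
    if records[pos] is not None:
        return pos
    return next_valid_recursive(records, pos + 1)


def next_valid_iterative(records, pos=0):
    """Loop form: constant stack depth however many entries are skipped."""
    while pos < len(records):
        if records[pos] is not None:
            return pos
        pos += 1
    return None


if __name__ == "__main__":
    small = [None, None, "row"]
    # Both forms agree on small inputs.
    print(next_valid_recursive(small), next_valid_iterative(small))
    # A long run of skipped entries is only safe with the loop form.
    large = [None] * 50000 + ["row"]
    print(next_valid_iterative(large))
```

The recursive form uses stack depth proportional to the number of consecutive skipped entries, which is what makes it fragile on large inputs.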
Chanh Le
a1e636dc6b
[HUDI-1551] Add support for BigDecimal and Integer when partitioning based on time. ( #2851 )
...
Co-authored-by: trungchanh.le <trungchanh.le@bybit.com>
2021-04-22 21:56:20 +08:00
jsbali
4a3431866d
[HUDI-1746] Added support for replace commits in commit showpartitions, commit show_write_stats, commit showfiles ( #2678 )
...
* Added support for replace commits in commit showpartitions, commit show_write_stats, commit showfiles
* Adding CR changes
* [HUDI-1746] Code review changes
2021-04-21 10:31:35 -07:00
jsbali
b31c520c66
[HUDI-1714] Added tests to TestHoodieTimelineArchiveLog for the archival of compl… ( #2677 )
...
* Added tests to TestHoodieTimelineArchiveLog for the archival of completed clean and rollback actions.
* Adding code review changes
* [HUDI-1714] Minor Fixes
2021-04-21 10:27:43 -07:00
vinoyang
c24d90d25a
[MINOR] Expose the detailed exception object ( #2861 )
2021-04-21 22:41:42 +08:00
hiscat
cc81ddde01
[HUDI-1812] Add explicit index state TTL option for Flink writer ( #2853 )
2021-04-21 20:13:30 +08:00
Danny Chan
ac3589f006
[HUDI-1814] Non partitioned table for Flink writer ( #2859 )
2021-04-21 20:07:27 +08:00
pengzhiwei
aacb8be521
[HUDI-1415] Read Hoodie Table As Spark DataSource Table ( #2283 )
2021-04-20 14:21:38 -07:00
Jintao Guan
3253079507
[HUDI-1764] Add Hudi-CLI support for clustering ( #2773 )
...
* tmp base
* update
* update unit test
* update
* update
* update CLI parameters
* linting
* update doSchedule in HoodieClusteringJob
* update
* update diff according to comments
2021-04-20 09:46:42 -07:00
Danny Chan
d6d52c6063
[HUDI-1809] Flink merge on read input split uses wrong base file path for default merge type ( #2846 )
2021-04-20 21:27:09 +08:00
Sebastian Bernauer
9a288ccbeb
[MINOR] Added metric reporter Prometheus to HoodieBackedTableMetadataWriter ( #2842 )
2021-04-19 16:04:59 -07:00
li36909
6b4b878d08
[HUDI-1744] rollback fails on mor table when the partition path hasn't any files ( #2749 )
...
Co-authored-by: lrz <lrz@lrzdeMacBook-Pro.local>
2021-04-19 15:44:11 -07:00
Thinking Chen
d21753d903
[HUDI-1802] Timeline Server Bundle need to include com.esotericsoftware package ( #2835 )
2021-04-19 09:27:58 -07:00
Aditya Tiwari
ec2334ceac
[HUDI-1716]: Resolving default values for schema from dataframe ( #2765 )
...
- Adding default values and setting null as first entry in UNION data types in avro schema.
Co-authored-by: Aditya Tiwari <aditya.tiwari@flipkart.com>
2021-04-19 10:05:20 -04:00
Danny Chan
dab5114f16
[HUDI-1804] Continue to write when Flink write task restart because of container killing ( #2843 )
...
The `FlinkMergeHandle` creates a marker file under the metadata path
each time it initializes. When a write task restarts after being killed,
it tries to create the already existing file and reports an error.
To solve this problem, skip the creation and use the original data file
as the base file to merge.
2021-04-19 19:43:41 +08:00
Roc Marshal
f7b6b68063
[MINOR][hudi-sync] Fix typos ( #2844 )
2021-04-19 16:27:13 +08:00
satishkotha
4e050cc2ba
[MINOR] Add jackson module to presto bundle ( #2816 )
2021-04-17 13:26:07 -07:00
Xu Guang Lv
1d53d6e6c2
[HUDI-1803] Support BAIDU AFS storage format in hudi ( #2836 )
2021-04-16 16:43:14 +08:00
hj2016
62b8a341dd
[HUDI-1792] flink-client query error when processing files larger than 128mb ( #2814 )
...
Co-authored-by: huangjing <huangjing@clinbrain.com>
2021-04-16 13:59:19 +08:00
Danny Chan
b6d949b48a
[HUDI-1801] FlinkMergeHandle rolling over may miss to rename the latest file handle ( #2831 )
...
The FlinkMergeHandle may rename the (N-1)th file handle instead of the
latest one, causing data duplication.
2021-04-16 11:40:53 +08:00
MINCWANG
191470d1fc
[HUDI-1797] Remove the com.google.guava jar from hudi-flink-bundle to avoid conflicts. ( #2828 )
...
Co-authored-by: wangminchao <wangminchao@asinking.com>
2021-04-15 15:16:33 +08:00
hiscat
6d1aec604f
[HUDI-1798] Flink streaming reader should always monitor the delta commits files ( #2825 )
...
The streaming reader should only monitor the delta log files; if parquet commits are mistakenly recognized as logs, the reader reports a FileNotFound exception.
2021-04-15 13:50:17 +08:00
Roc Marshal
62bb9e10d9
[Hotfix][utilities] Optimized codes ( #2821 )
2021-04-15 09:40:14 +08:00
Sivabalan Narayanan
8d29863c86
[HUDI-1615] Fixing usage of NULL schema for delete operation in HoodieSparkSqlWriter ( #2777 )
2021-04-14 15:35:39 +08:00
Danny Chan
ab4a7b0b4a
[HUDI-1788] Insert overwrite (table) for Flink writer ( #2808 )
...
Supports `INSERT OVERWRITE` and `INSERT OVERWRITE TABLE` for Flink
writer.
2021-04-14 10:23:37 +08:00