
1471 Commits

Author SHA1 Message Date
satishkotha
4e050cc2ba [MINOR] Add jackson module to presto bundle (#2816) 2021-04-17 13:26:07 -07:00
Xu Guang Lv
1d53d6e6c2 [HUDI-1803] Support BAIDU AFS storage format in hudi (#2836) 2021-04-16 16:43:14 +08:00
hj2016
62b8a341dd [HUDI-1792] flink-client query error when processing files larger than 128mb (#2814)
Co-authored-by: huangjing <huangjing@clinbrain.com>
2021-04-16 13:59:19 +08:00
Danny Chan
b6d949b48a [HUDI-1801] FlinkMergeHandle rolling over may miss to rename the latest file handle (#2831)
The FlinkMergeHandle may rename the (N-1)th file handle instead of the
latest one, which causes data duplication.
2021-04-16 11:40:53 +08:00
MINCWANG
191470d1fc [HUDI-1797] Remove the com.google.guave jar from hudi-flink-bundle to avoid conflicts. (#2828)
Co-authored-by: wangminchao <wangminchao@asinking.com>
2021-04-15 15:16:33 +08:00
hiscat
6d1aec604f [HUDI-1798] Flink streaming reader should always monitor the delta commits files (#2825)
The streaming reader should only monitor the delta log files. If there are parquet commits that are recognized as logs, the reader reports a FileNotFound exception.
2021-04-15 13:50:17 +08:00
Roc Marshal
62bb9e10d9 [Hotfix][utilities] Optimized codes (#2821) 2021-04-15 09:40:14 +08:00
Sivabalan Narayanan
8d29863c86 [HUDI-1615] Fixing usage of NULL schema for delete operation in HoodieSparkSqlWriter (#2777) 2021-04-14 15:35:39 +08:00
Danny Chan
ab4a7b0b4a [HUDI-1788] Insert overwrite (table) for Flink writer (#2808)
Supports `INSERT OVERWRITE` and `INSERT OVERWRITE TABLE` for Flink
writer.
2021-04-14 10:23:37 +08:00
xiarixiaoyao
65844a8d29 [HUDI-1720] Fix RealtimeCompactedRecordReader StackOverflowError (#2721) 2021-04-13 18:23:26 +08:00
hiscat
e16d31dce2 [HUDI-1787] Remove the rocksdb jar from hudi-flink-bundle (#2807)
Remove the RocksDB jar from hudi-flink-bundle to avoid conflicts.
2021-04-13 10:31:16 +08:00
Danny Chan
1ff99ca7d7 [HUDI-1786] Add option for merge max memory (#2805) 2021-04-12 17:03:58 +08:00
wangxianghu
040756d8c0 [HUDI-1785] Move OperationConverter to hudi-client-common for code reuse (#2798) 2021-04-12 16:22:33 +08:00
hj2016
1da16dfd2e [HUDI-1784] Added print detailed stack log when hbase connection error (#2799) 2021-04-12 13:46:06 +08:00
wangxianghu
f3777f44fe [MINOR] Remove unused imports and some other checkstyle issues (#2800) 2021-04-11 21:42:34 +08:00
Roc Marshal
b554835053 [MINOR] fix typo. (#2804) 2021-04-11 10:31:07 +08:00
xiarixiaoyao
8d4a7fe33e [HUDI-1783] Support Huawei Cloud Object Storage (#2796) 2021-04-10 13:02:11 +08:00
Danny Chan
6786581c48 [HUDI-1775] Add option for compaction parallelism (#2785) 2021-04-09 13:46:19 +08:00
Vinoth Govindarajan
08e82c469c [HUDI-1762] Added HiveStylePartitionExtractor to support Hive style partitions (#2769) 2021-04-09 01:00:11 -04:00
Gary Li
cf3d2e21eb [MINOR] Update doap with 0.8.0 release (#2772) 2021-04-08 11:06:13 -04:00
hiscat
5b3608f149 [HUDI-1778] Add setter to CompactionPlanEvent and CompactionCommitEvent to have better SE/DE performance for Flink (#2789) 2021-04-08 19:40:37 +08:00
hongdd
ecdbd2517f [HUDI-699] Fix CompactionCommand and add unit test for CompactionCommand (#2325) 2021-04-08 15:35:33 +08:00
Simon
18459d4045 [MINOR] Some unit test code optimize (#2782)
* Optimized code
2021-04-08 13:35:03 +08:00
hiscat
3a926aacf6 [HUDI-1773] HoodieFileGroup code optimize (#2781) 2021-04-07 18:16:03 +08:00
hiscat
f4f9dd9d83 [HUDI-1772] HoodieFileGroupId compareTo logical error(fileId self compare) (#2780) 2021-04-07 18:10:38 +08:00
li36909
dadd081d45 [HUDI-1751] DeltaStreamer print many unnecessary warn log (#2754) 2021-04-07 00:47:03 -07:00
hiscat
d035fcbb3c [HUDI-1767] Add setter to HoodieKey and HoodieRecordLocation to have better SE/DE performance for Flink (#2779) 2021-04-07 14:13:31 +08:00
li36909
8527590772 [HUDI-1750] Fail to load user's class if user move hudi-spark-bundle jar into spark classpath (#2753) 2021-04-06 22:33:32 -04:00
Harshit Mittal
e692c704da [MINOR] Fix deprecated build link for travis (#2778) 2021-04-07 08:57:10 +08:00
Danny Chan
9c369c607d [HUDI-1757] Assigns the buckets by record key for Flink writer (#2757)
Currently we assign the buckets by record partition path, which could
cause hotspots if the partition field is a datetime type. Change to assign
buckets by first grouping the records by their key; the assignment is
valid only if there is no conflict (two tasks writing to the same bucket).

This patch also changes the coordinator execution to be asynchronous.
2021-04-06 19:06:41 +08:00
li36909
920537cac8 [HUDI-1749] Clean/Compaction/Rollback command maybe never exit when operation fail (#2752) 2021-04-05 23:23:15 -07:00
Harshit Mittal
e970e1f483 [HUDI-1696] add apache commons-codec dependency to flink-bundle explicitly (#2758) 2021-04-01 23:07:30 -07:00
Roc Marshal
94a5e72f16 [HUDI-1737][hudi-client] Code Cleanup: Extract common method in HoodieCreateHandle & FlinkCreateHandle (#2745) 2021-04-02 11:39:05 +08:00
pengzhiwei
684622c7c9 [HUDI-1591] Implement Spark's FileIndex for Hudi to support queries via Hudi DataSource using non-globbed table path and partition pruning (#2651) 2021-04-01 11:12:28 -07:00
Danny Chan
9804662bc8 [HUDI-1738] Emit deletes for flink MOR table streaming read (#2742)
Currently we do a soft delete for DELETE row data when writing into a hoodie
table. For streaming read of a MOR table, the Flink reader detects the
delete records and still emits them if the record key semantics are still
kept.

This is useful and actually a must for streaming ETL pipeline
incremental computation.
2021-04-01 15:25:31 +08:00
vinoyang
fe16d0de7c [MINOR] Delete useless UpsertPartitioner for flink integration (#2746) 2021-03-31 16:36:42 +08:00
Sebastian Bernauer
aa0da72c59 Preparation for Avro update (#2650) 2021-03-30 21:50:17 -07:00
leo-Iamok
8bc65b9318 [HUDI-1731] Rename UpsertPartitioner in hudi-java-client (#2734)
Co-authored-by: lei.zhu <lei.zhu@envisioncn.com>
2021-03-31 11:06:04 +08:00
vinoyang
3cab928b50 [HUDI-1735] Add hive-exec dependency for hudi-examples (#2737) 2021-03-30 21:35:16 +08:00
Gary Li
050626ad6c [MINOR] Add Missing Apache License to test files (#2736) 2021-03-29 07:17:23 -07:00
garyli1019
e069b64e10 [HOTFIX] fix deploy staging jars script 2021-03-29 06:04:48 -07:00
Gary Li
4db970dc8a [HOTFIX] Disable ITs for Spark3 and scala2.12 (#2733) 2021-03-29 06:04:48 -07:00
Gary Li
452f5e2d66 [HOTFIX] close spark session in functional test suite and disable spark3 test for spark2 (#2727) 2021-03-29 06:04:48 -07:00
Danny Chan
d415d45416 [HUDI-1729] Asynchronous Hive sync and commits cleaning for Flink writer (#2732) 2021-03-29 10:47:29 +08:00
Shen Hong
ecbd389a3f [HUDI-1478] Introduce HoodieBloomIndex to hudi-java-client (#2608) 2021-03-28 20:28:40 +08:00
n3nash
bec70413c0 [HUDI-1728] Fix MethodNotFound for HiveMetastore Locks (#2731) 2021-03-27 10:07:10 -07:00
Danny Chan
8b774fe331 [HUDI-1495] Bump Flink version to 1.12.2 (#2718) 2021-03-26 14:25:57 +08:00
garyli1019
6e803e08b1 Moving to 0.9.0-SNAPSHOT on master branch. 2021-03-24 21:37:14 +08:00
Danny Chan
29b79c99b0 [hotfix] Log the error message for creating table source first (#2711) 2021-03-24 18:25:37 +08:00
n3nash
01a1d7997b [HUDI-1712] Rename & standardize config to match other configs (#2708) 2021-03-24 17:24:02 +08:00