164 Commits

Author SHA1 Message Date
wangxianghu
003c6ee73e [MINODR] Remove repeated kafka-clients dependencies (#5034) 2022-03-14 18:24:06 +04:00
从大数据到人工智能
01cbddef78 Add hive-standalone-metastore dependency to hudi-flink-bundle module (#4870) 2022-02-23 09:16:21 +08:00
Danny Chan
b87e95d621 [HUDI-3476] Remove the shade pattern for parquet for flink bundle jar (#4869) 2022-02-22 19:21:57 +08:00
Danny Chan
2844a77b43 [HUDI-3439] Remove the hive shade pattern for flink bundle jar (#4833) 2022-02-17 22:42:39 +08:00
Alexey Kudinkin
464027ec37 [HUDI-3239] Convert BaseHoodieTableFileIndex to Java (#4669)
Converting BaseHoodieTableFileIndex to Java, removing Scala as a dependency from "hudi-common"
2022-02-09 18:42:08 -05:00
Sivabalan Narayanan
e72553accf [HUDI-3262] Fixing utilities and integ test suite bundle to include hudi spark datasource (#4670) 2022-01-23 08:46:37 -05:00
Danny Chan
64b1426005 [minor] Fix hive-exec scope of flink bundle jar (#4664) 2022-01-23 10:28:41 +08:00
Alexey Kudinkin
4bea758738 [HUDI-3191] Rebasing Hive's FileInputFormat onto AbstractHoodieTableFileIndex (#4531) 2022-01-18 14:54:51 -08:00
EchoLee5
3b56320bd8 [HUDI-3261] Read rt table by hive cli throw NoSuchMethodError (#4624) 2022-01-18 16:58:08 +08:00
leesf
5ce45c440b [HUDI-3172] Refactor hudi existing modules to make more code reuse in V2 Implementation (#4514)
* Introduce hudi-spark3-common and hudi-spark2-common modules to place classes that would be reused in different spark versions, also introduce hudi-spark3.1.x to support spark 3.1.x.
* Introduce hudi format under hudi-spark2, hudi-spark3, hudi-spark3.1.x modules and change the hudi format in original hudi-spark module to hudi_v1 format.
* Manually tested on Spark 3.1.2 and Spark 3.2.0 SQL.
* Added a README.md file under hudi-spark-datasource module.
2022-01-14 13:42:35 +08:00
Sagar Sumit
209f91cb33 [HUDI-3010] Unbundle parquet-avro and shade other dependencies in prsto bundle (#4551) 2022-01-12 20:00:24 -08:00
RexAn
977d3c6dad [HUDI-3157] Remove aws jars from hudi bundles (#4542)
Co-authored-by: Hui An <hui.an@shopee.com>
2022-01-09 02:23:46 -08:00
Sagar Sumit
46bb00e4df [HUDI-3139] Shade htrace and parquet-avro in presto bundle (#4495)
Filter out unnecessary classes
2022-01-08 10:29:36 -05:00
Udit Mehrotra
9412281cb1 [HUDI-2983] Remove Log4j2 transitive dependencies (#4281) 2021-12-28 07:15:05 -08:00
Danny Chan
2dcb3f0062 [HUDI-2985] Shade jackson for hudi flink bundle jar (#4284) 2021-12-11 14:40:57 +08:00
Y Ethan Guo
72901a33a1 [HUDI-2784] Add a hudi-trino-bundle for Trino (#4279) 2021-12-10 14:27:22 -08:00
Danny Chan
bd08470421 [HUDI-2957] Shade kryo jar for flink bundle jar (#4251) 2021-12-09 10:16:42 +08:00
wenningd
4a437f25d3 [MINOR] Use maven-shade-plugin version for hudi-timeline-server-bundle from main pom.xml (#4209)
Co-authored-by: Wenning Ding <wenningd@amazon.com>
2021-12-06 12:29:18 -08:00
Sivabalan Narayanan
52aae36b53 [MINOR] Fixing integ test suite for hudi-aws and archival validation (#4142) 2021-11-28 20:11:50 -05:00
yuzhao.cyz
a1d0ff4209 Moving to 0.11.0-SNAPSHOT on master branch. 2021-11-27 17:22:10 +08:00
xiarixiaoyao
780a2ac5b2 [HUDI-2102] Support hilbert curve for hudi (#3952)
Co-authored-by: Y Ethan Guo <ethan.guoyihua@gmail.com>
2021-11-26 23:20:19 -08:00
rmahindra123
9028e6e1e4 [HUDI-2864] Fix README and scripts with current limitations of hive sync (#4129)
* Fix README with current limitations of hive sync

* Fix README with current limitations of hive sync

* Fix dep issue

* Fix Copy on Write flow

Co-authored-by: Rajesh Mahindra <rmahindra@Rajeshs-MacBook-Pro.local>
2021-11-26 15:09:32 -08:00
Danny Chan
f5da9b50fa [MINOR] Include hudi-aws in flink bundle jar (#4127)
HUDI-2801 makes this jar as required.
2021-11-26 14:36:44 +08:00
Ron
38585e4e57 [HUDI-2851] Shade org.apache.hadoop.hive.ql.optimizer package for flink bundle jar (#4104) 2021-11-26 11:27:21 +08:00
rmahindra123
7286b56d30 [HUDI-2853] Add JMX deps in hudi utilities and kafka connect bundles (#4108)
Co-authored-by: Rajesh Mahindra <rmahindra@Rajeshs-MacBook-Pro.local>
2021-11-24 19:03:01 -05:00
rmahindra123
fbff0799b9 [HUDI-2325] Add hive sync support to kafka connect (#3660)
Co-authored-by: Rajesh Mahindra <rmahindra@Rajeshs-MacBook-Pro.local>
2021-11-23 15:48:06 -08:00
zhangyue19921010
9ed28b1570 [HUDI-2409] Using HBase shaded jars in Hudi presto bundle (#3623)
* using hbase-shaded-jars-in-hudi-presto-hundle

* test

* add hudi-common-bundle

* code review

* code review

* code review

* code review

* test

* test

Co-authored-by: yuezhang <yuezhang@freewheel.tv>
2021-11-23 11:25:12 +05:30
Ron
6cc97cc0c9 Remove the aws packages from hudi flink bundle jar (#4050) 2021-11-20 11:55:12 +08:00
wenningd
1ee12cfa6f [HUDI-2314] Add support for DynamoDb based lock provider (#3486)
- Co-authored-by: Wenning Ding <wenningd@amazon.com>
- Co-authored-by: Sivabalan Narayanan <n.siva.b@gmail.com>
2021-11-17 12:09:31 -05:00
Danny Chan
689020f303 [HUDI-2684] Use DefaultHoodieRecordPayload when precombine field is specified specifically (#3922) 2021-11-04 16:23:36 +08:00
Alexey Kudinkin
b12a25b0b1 [MINOR] Fixed RAT config for "hudi-utilities-bundle" to ignore transient build-bound artifiacts (#3909) 2021-11-02 23:06:26 -04:00
vinoyang
13b637ddc3 [HUDI-2643] Remove duplicated hbase-common with tests classifier exists in bundles (#3886) 2021-11-01 20:11:00 +08:00
vinoyang
b1c4acf0ae [HUDI-2614] Remove duplicated hadoop-hdfs with tests classifier exists in bundles (#3864) 2021-10-26 22:36:10 +08:00
rmahindra123
3686c25fae [HUDI-2469] [Kafka Connect] Replace json based payload with protobuf for Transaction protocol. (#3694)
* Substitue Control Event with protobuf

* Fix tests

* Fix unit tests

* Add javadocs

* Add javadocs

* Address reviewer comments

Co-authored-by: Rajesh Mahindra <rmahindra@Rajeshs-MacBook-Pro.local>
2021-10-19 14:29:48 -07:00
Danny Chan
588a34aa95 [HUDI-2571] Remove include-flink-sql-connector-hive profile from flink bundle (#3818) 2021-10-18 17:34:49 +08:00
yiduwangkai
dfdfbbedae HUDI-2569 shaded hive (#3816)
Co-authored-by: wangkai9 <wangkai9@tuhu.cn>
2021-10-18 17:12:13 +08:00
yiduwangkai
5276850415 [HUDI-2557] Shade javax.servlet for flink bundle jar (#3807)
Co-authored-by: wangkai9 <wangkai9@tuhu.cn>
2021-10-18 11:26:21 +08:00
Danny Chan
ad63938890 [HUDI-2537] Fix metadata table for flink (#3774) 2021-10-10 09:30:39 +08:00
Sarah Witt
4deaa30c8d [HUDI-2404] Add metrics-jmx to spark and flink bundles (#3632) 2021-09-16 09:53:16 -04:00
rmahindra123
9735f4b8ef [HUDI-2428] Fix protocol and other issues after stress testing Hudi Kafka Connect (#3656)
* Fixes based on tests and some improvements
* Fix the issues after running stress tests
* Fixing checkstyle issues and updating README

Co-authored-by: Rajesh Mahindra <rmahindra@Rajeshs-MacBook-Pro.local>
Co-authored-by: Vinoth Chandar <vinoth@apache.org>
2021-09-14 07:14:58 -07:00
rmahindra123
e528dd798a [HUDI-2394] Implement Kafka Sink Protocol for Hudi for Ingesting Immutable Data (#3592)
- Fixing packaging, naming of classes
 - Use of log4j over slf4j for uniformity
- More follow-on fixes
 - Added a version to control/coordinator events.
 - Eliminated the config added to write config
 - Fixed fetching of checkpoints based on table type
 - Clean up of naming, code placement

Co-authored-by: Rajesh Mahindra <rmahindra@Rajeshs-MacBook-Pro.local>
Co-authored-by: Vinoth Chandar <vinoth@apache.org>
2021-09-10 18:20:26 -07:00
liujinhui
eb5e7eec0a MINOR_CHECKSTYLE (#3616)
Fix checkstyle
2021-09-07 18:19:39 +08:00
Udit Mehrotra
3e301196bf Moving to 0.10.0-SNAPSHOT on master branch. 2021-08-14 18:51:09 -07:00
pengzhiwei
0dcd6a8fca [HUDI-2233] Use HMS To Sync Hive Meta For Spark Sql (#3387) 2021-08-05 09:57:22 -04:00
pengzhiwei
151f22e43a [HUDI-2195] Sync Hive Failed When Execute CTAS In Spark2 And Spark3 (#3299) 2021-07-22 15:33:38 +08:00
swuferhong
047d956e01 [HUDI-2136] Fix conflict when flink-sql-connector-hive and hudi-flink-bundle are both in flink lib (#3227) 2021-07-09 10:10:21 +08:00
Randal Boyle
60e0254e67 [HUDI-1996] Adding functionality to allow the providing of basic auth creds for confluent cloud schema registry (#3097)
* adding support for basic auth with confluent cloud schema registry
2021-07-05 23:40:23 -07:00
swuferhong
0bd20827ab [HUDI-2133] Support hive1 metadata sync for flink writer (#3225) 2021-07-06 11:01:57 +08:00
pengzhiwei
f760ec543e [HUDI-1659] Basic Implement Of Spark Sql Support For Hoodie (#2645)
Main functions:
Support create table for hoodie.
Support CTAS.
Support Insert for hoodie. Including dynamic partition and static partition insert.
Support MergeInto for hoodie.
Support DELETE
Support UPDATE
Both support spark2 & spark3 based on DataSourceV1.

Main changes:
Add sql parser for spark2.
Add HoodieAnalysis for sql resolve and logical plan rewrite.
Add commands implementation for CREATE TABLE、INSERT、MERGE INTO & CTAS.
In order to push down the update&insert logical to the HoodieRecordPayload for MergeInto, I make same change to the
HoodieWriteHandler and other related classes.
1、Add the inputSchema for parser the incoming record. This is because the inputSchema for MergeInto is different from writeSchema as there are some transforms in the update& insert expression.
2、Add WRITE_SCHEMA to HoodieWriteConfig to pass the write schema for merge into.
3、Pass properties to HoodieRecordPayload#getInsertValue to pass the insert expression and table schema.


Verify this pull request
Add TestCreateTable for test create hoodie tables and CTAS.
Add TestInsertTable for test insert hoodie tables.
Add TestMergeIntoTable for test merge hoodie tables.
Add TestUpdateTable for test update hoodie tables.
Add TestDeleteTable for test delete hoodie tables.
Add TestSqlStatement for test supported ddl/dml currently.
2021-06-07 23:24:32 -07:00
vinoth chandar
d02c0e5387 [MINOR] Resolve build issue arising from inaccessible pentaho jar (#3034)
- Fixes #160 #2479
2021-06-04 15:28:44 -04:00