Files

leesf 5ce45c440b [HUDI-3172] Refactor hudi existing modules to make more code reuse in V2 Implementation (#4514)

* Introduce hudi-spark3-common and hudi-spark2-common modules to place classes that would be reused in different spark versions, also introduce hudi-spark3.1.x to support spark 3.1.x.
* Introduce hudi format under hudi-spark2, hudi-spark3, hudi-spark3.1.x modules and change the hudi format in original hudi-spark module to hudi_v1 format.
* Manually tested on Spark 3.1.2 and Spark 3.2.0 SQL.
* Added a README.md file under hudi-spark-datasource module.

2022-01-14 13:42:35 +08:00

hudi-flink-bundle

[HUDI-2983] Remove Log4j2 transitive dependencies (#4281)

2021-12-28 07:15:05 -08:00

hudi-hadoop-mr-bundle

Moving to 0.11.0-SNAPSHOT on master branch.

2021-11-27 17:22:10 +08:00

hudi-hive-sync-bundle

Moving to 0.11.0-SNAPSHOT on master branch.

2021-11-27 17:22:10 +08:00

hudi-integ-test-bundle

[HUDI-3172] Refactor hudi existing modules to make more code reuse in V2 Implementation (#4514 )

2022-01-14 13:42:35 +08:00

hudi-kafka-connect-bundle

Moving to 0.11.0-SNAPSHOT on master branch.

2021-11-27 17:22:10 +08:00

hudi-presto-bundle

[HUDI-3010] Unbundle parquet-avro and shade other dependencies in prsto bundle (#4551 )

2022-01-12 20:00:24 -08:00

hudi-spark-bundle

[HUDI-3172] Refactor hudi existing modules to make more code reuse in V2 Implementation (#4514 )

2022-01-14 13:42:35 +08:00

hudi-timeline-server-bundle

[MINOR] Use maven-shade-plugin version for hudi-timeline-server-bundle from main pom.xml (#4209 )

2021-12-06 12:29:18 -08:00

hudi-trino-bundle

[HUDI-2784] Add a hudi-trino-bundle for Trino (#4279 )

2021-12-10 14:27:22 -08:00

hudi-utilities-bundle

[HUDI-3172] Refactor hudi existing modules to make more code reuse in V2 Implementation (#4514 )

2022-01-14 13:42:35 +08:00

README.md

HUDI-121 : Address comments during RC2 voting

2019-09-30 15:42:15 -07:00

README.md

Overview

This folder contains several modules that build bundles (i.e., fat/uber jars) that enable Hudi integration with various systems.

Here are the key principles applied in designing these bundles:

As much as possible, make the bundle work with the target system's jars and classes (e.g., it is better to make Hudi work with Hive's parquet version than to bundle parquet with Hudi). This lets us evolve Hudi as a lighter-weight component and also provides flexibility for changing these jar versions in target systems.
A bundle's pom only needs to depend on the required Hudi modules and any other modules that are declared "provided" in parent poms (e.g., parquet-avro).
Such other modules should be declared as "compile" dependencies in the bundle pom so that the shade plugin actually pulls them into the bundle. By default, provided-scoped dependencies are not included.
Any other runtime dependencies needed by the bundle should be specified in the <include> whitelist. New bundles should also follow the same style of explicitly whitelisting modules and shading as needed.
Leave abundant comments on why something is being included, shaded, or even left out.

Please follow these principles when adding new bundles or changing existing ones.
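
As a rough illustration of the principles above, a bundle pom fragment might look like the sketch below. It is not copied from any existing Hudi bundle: the choice of hudi-common as the packaged module, the relocation target package, and the assumption that the parent pom manages the parquet-avro version are all made up for the example. Only the maven-shade-plugin elements themselves (artifactSet, includes, relocations) are standard.

```xml
<dependencies>
  <!-- Required Hudi module (hudi-common is picked here only for illustration) -->
  <dependency>
    <groupId>org.apache.hudi</groupId>
    <artifactId>hudi-common</artifactId>
    <version>${project.version}</version>
  </dependency>
  <!-- Declared "provided" in the parent pom; re-declared as "compile" so the
       shade plugin can pull it into the bundle. The version is assumed to be
       managed by the parent pom. -->
  <dependency>
    <groupId>org.apache.parquet</groupId>
    <artifactId>parquet-avro</artifactId>
    <scope>compile</scope>
  </dependency>
</dependencies>

<build>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-shade-plugin</artifactId>
      <executions>
        <execution>
          <phase>package</phase>
          <goals>
            <goal>shade</goal>
          </goals>
          <configuration>
            <artifactSet>
              <!-- Explicit whitelist: only artifacts listed here end up in the jar -->
              <includes>
                <include>org.apache.hudi:hudi-common</include>
                <!-- Included because the target system is assumed not to ship it -->
                <include>org.apache.parquet:parquet-avro</include>
              </includes>
            </artifactSet>
            <relocations>
              <!-- Shaded to avoid clashing with a parquet-avro copy on the target
                   system's classpath (the shaded package name is hypothetical) -->
              <relocation>
                <pattern>org.apache.parquet.avro.</pattern>
                <shadedPattern>org.apache.hudi.org.apache.parquet.avro.</shadedPattern>
              </relocation>
            </relocations>
          </configuration>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>
```

Anything not named in an include entry is left out of the bundle, so each entry and relocation carries a comment stating why it is there.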

Resources

The classes needed for Hive2 JDBC are documented here