
Vinoth Chandar d58ddbd999 Reworking the deltastreamer tool

 - Standardize version of Jackson
 - DFSPropertiesConfiguration replaces usage of commons PropertiesConfiguration
 - Remove dependency on ConstructorUtils
 - Throw error if ordering value is not present during key generation
 - Switch to shade plugin for hoodie-utilities
 - Added support for consuming Confluent Avro Kafka serdes
 - Support for Confluent schema registry
 - KafkaSource now deals with skews nicely, by doing round-robin allocation of the source limit across partitions
 - Added support for BULK_INSERT operations as well
 - Pass in the payload class config properly into HoodieWriteClient
 - Fix documentation based on new usage
 - Adding tests on deltastreamer, sources and all new util classes
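
The round-robin allocation mentioned in the KafkaSource item can be illustrated with a minimal sketch. This is not the code from the commit: the class name `RoundRobinAllocation`, the method `allocate`, and the `pending` array are hypothetical names chosen for illustration, and the sketch assumes the source limit is a total message count to be split across partitions. In each round the remaining budget is divided evenly among partitions that still have unread messages, so a short partition takes only what it has and the leftover goes to the busier ones.

```java
import java.util.Arrays;

public class RoundRobinAllocation {

  // pending[i] = number of unread messages in partition i (hypothetical input).
  // Returns how many messages to read from each partition; the total never
  // exceeds sourceLimit.
  public static long[] allocate(long[] pending, long sourceLimit) {
    long[] allocated = new long[pending.length];
    long remaining = sourceLimit;
    while (remaining > 0) {
      // Count partitions that still have unread messages.
      int active = 0;
      for (int i = 0; i < pending.length; i++) {
        if (allocated[i] < pending[i]) {
          active++;
        }
      }
      if (active == 0) {
        break; // every partition is fully consumed
      }
      // Even share for this round; at least 1 so each round makes progress.
      long share = Math.max(1, remaining / active);
      for (int i = 0; i < pending.length && remaining > 0; i++) {
        long take = Math.min(Math.min(share, pending[i] - allocated[i]), remaining);
        allocated[i] += take;
        remaining -= take;
      }
    }
    return allocated;
  }

  public static void main(String[] args) {
    // One skewed partition with 100 messages, two small ones with 5 each.
    System.out.println(Arrays.toString(allocate(new long[] {100, 5, 5}, 30)));
    // prints [20, 5, 5]
  }
}
```

With a naive even split, the skewed example would read 10 messages from each partition's quota and leave 10 of the 30 unused; redistributing the leftover lets the large partition take 20.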

2018-09-08 10:24:32 +08:00

deploy

Add ossrh profile to publish maven artifacts to oss.sonatype.org (synced with maven central)

2016-12-21 14:17:35 -08:00

docs

Reworking the deltastreamer tool

2018-09-08 10:24:32 +08:00

hoodie-cli

Reworking the deltastreamer tool

2018-09-08 10:24:32 +08:00

hoodie-client

Reworking the deltastreamer tool

2018-09-08 10:24:32 +08:00

hoodie-common

Reworking the deltastreamer tool

2018-09-08 10:24:32 +08:00

hoodie-hadoop-mr

[maven-release-plugin] prepare for next development iteration

2018-08-22 21:30:05 -07:00

hoodie-hive

1. Small file size handling for inserts into log files. In summary, the total size of the log file is compared with the parquet max file size, and if there is room for more inserts, they are added.

2018-09-06 08:52:08 +08:00

hoodie-spark

Reworking the deltastreamer tool

2018-09-08 10:24:32 +08:00

hoodie-utilities

Reworking the deltastreamer tool

2018-09-08 10:24:32 +08:00

style

CodeStyle formatting to conform to basic Checkstyle rules.

2018-03-30 11:09:40 -07:00

_config.yml

Set theme jekyll-theme-minimal

2016-12-29 16:53:39 -08:00

.gitignore

Importing Hoodie Client from internal repo

2016-12-16 14:34:42 -08:00

.travis.yml

Update java version to 8 in travis.yml

2017-05-17 13:43:11 -07:00

CHANGELOG.md

Added CHANGELOG.md and updated community contributions guideline

2017-06-16 10:48:37 -07:00

LICENSE.txt

Importing Hoodie Client from internal repo

2016-12-16 14:34:42 -08:00

pom.xml

Reworking the deltastreamer tool

2018-09-08 10:24:32 +08:00

README.md

Update README.md

2017-12-10 07:50:37 -08:00

RELEASE_NOTES.md

Update Release notes for 0.4.3 release

2018-08-22 21:11:43 -07:00

README.md

Hudi

Hudi (pronounced Hoodie) stands for Hadoop Upserts anD Incrementals. Hudi manages storage of large analytical datasets on HDFS and serves them out via two types of tables:

 - Read Optimized Table - Provides excellent query performance via purely columnar storage (e.g. Parquet)
 - Near-Real-Time Table (WIP) - Provides queries on real-time data, using a combination of columnar and row-based storage (e.g. Parquet + Avro)

For more, head over here

Languages

Java 81.4%

Scala 16.7%

ANTLR 0.9%

Shell 0.8%

Dockerfile 0.2%