vinothchandar cf7f7aabb9 Nicer handling of timeline archival for Cloud storage

- When append() is not supported, rollover to new file always (instead of failing)
- Provide way to configure archive log folder (avoids small files inside .hoodie)
- Datasets written via Spark datasource archive to .hoodie/archived
- HoodieClientExample will now retain only 2,3 commits to exercise archival path during dev cycles
- Few tweaks to code structure around CommitArchiveLog

2018-01-17 23:34:21 -08:00

deploy

Add ossrh profile to publish maven artifacts to oss.sonatype.org (synced with maven central)

2016-12-21 14:17:35 -08:00

docs

Update docs on code style setup

2017-11-12 23:19:02 -08:00

hoodie-cli

Multi FS Support

2018-01-17 23:34:21 -08:00

hoodie-client

Nicer handling of timeline archival for Cloud storage

2018-01-17 23:34:21 -08:00

hoodie-common

Nicer handling of timeline archival for Cloud storage

2018-01-17 23:34:21 -08:00

hoodie-hadoop-mr

Nicer handling of timeline archival for Cloud storage

2018-01-17 23:34:21 -08:00

hoodie-hive

Multi FS Support

2018-01-17 23:34:21 -08:00

hoodie-spark

Nicer handling of timeline archival for Cloud storage

2018-01-17 23:34:21 -08:00

hoodie-utilities

Multi FS Support

2018-01-17 23:34:21 -08:00

_config.yml

Set theme jekyll-theme-minimal

2016-12-29 16:53:39 -08:00

.gitignore

Importing Hoodie Client from internal repo

2016-12-16 14:34:42 -08:00

.travis.yml

Update java version to 8 in travis.yml

2017-05-17 13:43:11 -07:00

CHANGELOG.md

Added CHANGELOG.md and updated community contributions guideline

2017-06-16 10:48:37 -07:00

LICENSE.txt

Importing Hoodie Client from internal repo

2016-12-16 14:34:42 -08:00

pom.xml

Reformatting code per Google Code Style all over

2017-11-12 23:19:02 -08:00

README.md

Update README.md

2017-12-10 07:50:37 -08:00

RELEASE_NOTES.md

Release notes for 0.4.0

2017-10-02 22:26:22 -07:00

README.md

Hudi

Hudi (pronounced Hoodie) stands for Hadoop Upserts anD Incrementals. Hudi manages the storage of large analytical datasets on HDFS and serves them out via two types of tables:

Read Optimized Table - Provides excellent query performance via purely columnar storage (e.g. Parquet)
Near-Real-Time Table (WIP) - Provides queries on real-time data, using a combination of columnar and row-based storage (e.g. Parquet + Avro)

For more, head over here

Languages

Java 81.4%

Scala 16.7%

ANTLR 0.9%

Shell 0.8%

Dockerfile 0.2%