Vinoth Chandar 0cd186c899 Multi FS Support

 - Reviving PR 191, to base FileSystem creation on the actual path
 - Streamline all filesystem access to HoodieTableMetaClient
 - Hadoop Conf from Spark Context serialized & passed to executor code too
 - Pick up env vars prefixed with HOODIE_ENV_ into Configuration object
 - Cleanup usage of FSUtils.getFS, piggybacking off HoodieTableMetaClient.getFS
 - Adding s3a to supported schemes & support escaping "." in env vars
 - Tests use HoodieTestUtils.getDefaultHadoopConf

2018-01-17 23:34:21 -08:00
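
The commit message above mentions picking up environment variables prefixed with `HOODIE_ENV_` and escaping "." in their names. A minimal sketch of that idea follows, not the project's actual code: the class name `EnvConfSketch`, the method `propsFromEnv`, and the `_DOT_` escape token are assumptions made for illustration, and a plain `Map` stands in for the Hadoop `Configuration` object so the example runs without Hadoop on the classpath.

```java
import java.util.HashMap;
import java.util.Map;

public class EnvConfSketch {
  // Prefix named in the commit message.
  static final String PREFIX = "HOODIE_ENV_";
  // Assumed escape token for "." (the commit message does not spell it out).
  static final String DOT_ESCAPE = "_DOT_";

  // Copies every prefixed variable into a property map, stripping the prefix
  // and turning the escape token back into ".". Other variables are ignored.
  static Map<String, String> propsFromEnv(Map<String, String> env) {
    Map<String, String> props = new HashMap<>();
    for (Map.Entry<String, String> e : env.entrySet()) {
      if (e.getKey().startsWith(PREFIX)) {
        String key = e.getKey().substring(PREFIX.length()).replace(DOT_ESCAPE, ".");
        props.put(key, e.getValue());
      }
    }
    return props;
  }

  public static void main(String[] args) {
    // A hand-built map replaces System.getenv() to keep the run deterministic.
    Map<String, String> env = new HashMap<>();
    env.put("HOODIE_ENV_fs_DOT_s3a_DOT_access_DOT_key", "example-key");
    env.put("PATH", "/usr/bin");
    System.out.println(propsFromEnv(env)); // prints {fs.s3a.access.key=example-key}
  }
}
```

In real use, the input map would presumably be `System.getenv()` and each resulting pair would be set on the Hadoop configuration. The escape step is needed because most shells do not allow "." in environment variable names.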

deploy

Add ossrh profile to publish maven artifacts to oss.sonatype.org (synced with maven central)

2016-12-21 14:17:35 -08:00

docs

Update docs on code style setup

2017-11-12 23:19:02 -08:00

hoodie-cli

Multi FS Support

2018-01-17 23:34:21 -08:00

hoodie-client

Multi FS Support

2018-01-17 23:34:21 -08:00

hoodie-common

Multi FS Support

2018-01-17 23:34:21 -08:00

hoodie-hadoop-mr

Multi FS Support

2018-01-17 23:34:21 -08:00

hoodie-hive

Multi FS Support

2018-01-17 23:34:21 -08:00

hoodie-spark

Multi FS Support

2018-01-17 23:34:21 -08:00

hoodie-utilities

Multi FS Support

2018-01-17 23:34:21 -08:00

_config.yml

Set theme jekyll-theme-minimal

2016-12-29 16:53:39 -08:00

.gitignore

Importing Hoodie Client from internal repo

2016-12-16 14:34:42 -08:00

.travis.yml

Update java version to 8 in travis.yml

2017-05-17 13:43:11 -07:00

CHANGELOG.md

Added CHANGELOG.md and updated community contributions guideline

2017-06-16 10:48:37 -07:00

LICENSE.txt

Importing Hoodie Client from internal repo

2016-12-16 14:34:42 -08:00

pom.xml

Reformatting code per Google Code Style all over

2017-11-12 23:19:02 -08:00

README.md

Update README.md

2017-12-10 07:50:37 -08:00

RELEASE_NOTES.md

Release notes for 0.4.0

2017-10-02 22:26:22 -07:00

README.md

Hudi

Hudi (pronounced Hoodie) stands for Hadoop Upserts anD Incrementals. Hudi manages storage of large analytical datasets on HDFS and serves them out via two types of tables:

 - Read Optimized Table - Provides excellent query performance via purely columnar storage (e.g. Parquet)
 - Near-Real time Table (WIP) - Provides queries on real-time data, using a combination of columnar and row-based storage (e.g. Parquet + Avro)

For more, head over here

Languages

Java 81.4%

Scala 16.7%

ANTLR 0.9%

Shell 0.8%

Dockerfile 0.2%