vinothchandar 57a8b9cc8c Making DataSource/DeltaStreamer use defaults for combining

- Addresses issue where insert will combine and remove duplicates within batch
 - Setting default insert combining to false (write client default)
 - Set to true if filtering duplicates on insert/bulk_insert

2019-05-01 13:21:21 -07:00

deploy

Add ossrh profile to publish maven artifacts to oss.sonatype.org (synced with maven central)

2016-12-21 14:17:35 -08:00

docker

Move to apachehudi dockerhub repository & use openjdk docker containers

2019-04-17 16:37:58 -07:00

hoodie-cli

FileSystem View must treat same fileIds present in different partitions as different file-groups and handle pending compaction correctly

2019-03-01 10:49:04 -08:00

hoodie-client

Introduce config to control interval tree pruning

2019-04-29 11:38:23 -07:00

hoodie-common

Removing OLD MAGIC header since a) it's no longer used b) causes issues when the data actually has OLD MAGIC

2019-04-25 20:47:16 -07:00

hoodie-hadoop-mr

[maven-release-plugin] prepare for next development iteration

2019-02-27 07:16:27 -08:00

hoodie-hive

run_hive_sync tool must be able to handle case where there are multiple standalone jdbc jars in hive installation dir

2019-03-21 09:58:20 -07:00

hoodie-integ-test

[maven-release-plugin] prepare for next development iteration

2019-02-27 07:16:27 -08:00

hoodie-spark

Making DataSource/DeltaStreamer use defaults for combining

2019-05-01 13:21:21 -07:00

hoodie-utilities

Making DataSource/DeltaStreamer use defaults for combining

2019-05-01 13:21:21 -07:00

packaging

Fix Hive RT query failure in hoodie demo

2019-04-17 16:36:32 -07:00

style

General enhancements

2018-12-18 12:52:39 -08:00

_config.yml

Set theme jekyll-theme-minimal

2016-12-29 16:53:39 -08:00

.gitignore

Importing Hoodie Client from internal repo

2016-12-16 14:34:42 -08:00

.travis.yml

Add m2 directory to Travis cache

2018-12-31 10:31:12 -08:00

CHANGELOG.md

Added CHANGELOG.md and updated community contributions guideline

2017-06-16 10:48:37 -07:00

KEYS

HUDI-75: Add KEYS

2019-03-18 07:46:25 -07:00

LICENSE.txt

Importing Hoodie Client from internal repo

2016-12-16 14:34:42 -08:00

pom.xml

Fix hive sync (libfb version mismatch) and deltastreamer issue (missing cmdline argument) in demo

2019-03-13 16:14:32 -07:00

README.md

Update site url in README

2019-02-15 21:28:39 -08:00

RELEASE_NOTES.md

Update RELEASE_NOTES for 0.4.5

2019-02-27 06:47:56 -08:00

README.md

Hudi

Hudi (pronounced Hoodie) stands for Hadoop Upserts anD Incrementals. Hudi manages the storage of large analytical datasets on HDFS and serves them out via two types of tables:

Read Optimized Table - Provides excellent query performance via purely columnar storage (e.g. Parquet)
Near-Real-Time Table (WIP) - Provides queries on real-time data, using a combination of columnar and row-based storage (e.g. Parquet + Avro)
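
The difference between the two table types can be pictured with a small, self-contained sketch. This is not Hudi's API: the class `TableViewsSketch`, its methods, and the idea of modelling storage as in-memory maps keyed by record key are all assumptions made for illustration. It only shows the general idea that one view reads the columnar data alone, while the other overlays more recent row-based data on top of it.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative only: hypothetical names and in-memory maps, not Hudi's API. */
public class TableViewsSketch {

    /** Read-optimized view: only the columnar base data is consulted. */
    static Map<String, String> readOptimized(Map<String, String> columnarBase) {
        return new LinkedHashMap<>(columnarBase);
    }

    /**
     * Near-real-time view: row-based updates are overlaid on the columnar
     * base by record key (assumed merge rule: the row-based value wins).
     */
    static Map<String, String> nearRealTime(Map<String, String> columnarBase,
                                            Map<String, String> rowBasedUpdates) {
        Map<String, String> merged = new LinkedHashMap<>(columnarBase);
        merged.putAll(rowBasedUpdates);
        return merged;
    }

    public static void main(String[] args) {
        // Stand-in for data already written in columnar form (e.g. Parquet).
        Map<String, String> base = new LinkedHashMap<>();
        base.put("key1", "v1");
        base.put("key2", "v1");

        // Stand-in for recent changes kept in row-based form (e.g. Avro).
        Map<String, String> updates = new LinkedHashMap<>();
        updates.put("key2", "v2");
        updates.put("key3", "v1");

        System.out.println(readOptimized(base));         // {key1=v1, key2=v1}
        System.out.println(nearRealTime(base, updates)); // {key1=v1, key2=v2, key3=v1}
    }
}
```

In this sketch the read-optimized view is cheaper to produce but lags behind the latest changes, while the near-real-time view pays for a merge to return fresher data.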

For more, head over here

Languages

Java 81.4%

Scala 16.7%

ANTLR 0.9%

Shell 0.8%

Dockerfile 0.2%