lanyuanxiaoyao/hudi

Gurudatt Kulkarni 71ac2c0d5e [HUDI-324] TimestampKeyGenerator should support milliseconds (#993)

2019-11-05 04:22:14 -08:00

README.md

Hudi

Apache Hudi (Incubating) (pronounced Hoodie) stands for Hadoop Upserts anD Incrementals. Hudi manages the storage of large analytical datasets on DFS (Cloud stores, HDFS or any Hadoop FileSystem compatible storage).

Features

Upsert support with fast, pluggable indexing
Atomically publish data with rollback support
Snapshot isolation between writer & queries
Savepoints for data recovery
Manages file sizes and layout using statistics
Async compaction of row & columnar data
Timeline metadata to track lineage
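
To make the upsert, atomic publish, and rollback features above more concrete, here is a minimal in-memory sketch. It is a toy model, not Hudi's API: the class `ToyUpsertTable`, its methods, and the instant times `"001"` and `"002"` are all hypothetical names chosen for illustration, and copying the full table state on every commit is a simplification that a real storage layer would avoid.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy in-memory model of keyed upserts with a commit timeline. Hypothetical; not Hudi's API. */
final class ToyUpsertTable {
    /** One committed batch: its instant time and the table state after applying it. */
    private static final class Commit {
        final String instantTime;
        final Map<String, String> snapshot;

        Commit(String instantTime, Map<String, String> snapshot) {
            this.instantTime = instantTime;
            this.snapshot = snapshot;
        }
    }

    private final List<Commit> timeline = new ArrayList<>();

    /** Applies a batch as one commit: existing keys are updated, new keys are inserted. */
    void upsert(String instantTime, Map<String, String> batch) {
        Map<String, String> next = new LinkedHashMap<>(latest());
        next.putAll(batch);
        // The batch becomes visible only once the commit is appended to the timeline.
        timeline.add(new Commit(instantTime, next));
    }

    /** Drops the most recent commit, so readers see the previous snapshot again. */
    void rollbackLast() {
        if (!timeline.isEmpty()) {
            timeline.remove(timeline.size() - 1);
        }
    }

    /** Latest committed state, as a read-only view. */
    Map<String, String> latest() {
        if (timeline.isEmpty()) {
            return Collections.<String, String>emptyMap();
        }
        return Collections.unmodifiableMap(timeline.get(timeline.size() - 1).snapshot);
    }

    /** Instant times of all commits currently on the timeline, oldest first. */
    List<String> instants() {
        List<String> out = new ArrayList<>();
        for (Commit c : timeline) {
            out.add(c.instantTime);
        }
        return out;
    }

    public static void main(String[] args) {
        ToyUpsertTable table = new ToyUpsertTable();

        Map<String, String> first = new LinkedHashMap<>();
        first.put("a", "1");
        first.put("b", "2");
        table.upsert("001", first);

        Map<String, String> second = new LinkedHashMap<>();
        second.put("b", "20"); // update of an existing key
        second.put("c", "3");  // insert of a new key
        table.upsert("002", second);

        System.out.println(table.latest());   // {a=1, b=20, c=3}
        table.rollbackLast();
        System.out.println(table.latest());   // {a=1, b=2}
        System.out.println(table.instants()); // [001]
    }
}
```

In this sketch a reader calling `latest()` never observes a half-applied batch, because the new snapshot is built off to the side and published by a single append to the timeline.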

Hudi provides the ability to query via three types of views:

Read Optimized View - Provides excellent snapshot query performance via purely columnar storage (e.g. Parquet)
Incremental View - Provides a change stream with records inserted or updated after a point in time.
Real-time View - Provides snapshot queries on real-time data, using a combination of columnar & row-based storage (e.g. Parquet + Avro)
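
The difference between the three views can be sketched with a small in-memory model. This is an illustration under stated assumptions, not Hudi's API: `ToyViews`, `Rec`, and the instant times are hypothetical, the `base` map stands in for compacted columnar data, the `log` list stands in for newer row-based changes, and computing the incremental view over the merged data is a simplification.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model of three views over base data plus a change log. Hypothetical; not Hudi's API. */
final class ToyViews {
    /** A record with its key, value, and the instant time of the commit that wrote it. */
    static final class Rec {
        final String key;
        final String value;
        final String commitTime;

        Rec(String key, String value, String commitTime) {
            this.key = key;
            this.value = value;
            this.commitTime = commitTime;
        }
    }

    final Map<String, Rec> base = new LinkedHashMap<>(); // stands in for compacted columnar data
    final List<Rec> log = new ArrayList<>();             // stands in for newer row-based changes

    /** Read optimized view: only the compacted base data. */
    Map<String, String> readOptimized() {
        return values(base);
    }

    /** Real-time view: base data with the newer log records merged on top. */
    Map<String, String> realTime() {
        return values(merged());
    }

    /** Incremental view: records written by commits strictly after the given instant. */
    Map<String, String> incremental(String afterInstant) {
        Map<String, Rec> changed = new LinkedHashMap<>();
        for (Rec r : merged().values()) {
            if (r.commitTime.compareTo(afterInstant) > 0) {
                changed.put(r.key, r);
            }
        }
        return values(changed);
    }

    private Map<String, Rec> merged() {
        Map<String, Rec> out = new LinkedHashMap<>(base);
        for (Rec r : log) {
            out.put(r.key, r); // later log entries win per key
        }
        return out;
    }

    private static Map<String, String> values(Map<String, Rec> recs) {
        Map<String, String> out = new LinkedHashMap<>();
        for (Rec r : recs.values()) {
            out.put(r.key, r.value);
        }
        return out;
    }

    public static void main(String[] args) {
        ToyViews t = new ToyViews();
        t.base.put("a", new Rec("a", "1", "001"));
        t.base.put("b", new Rec("b", "2", "001"));
        t.log.add(new Rec("b", "20", "002")); // update not yet compacted
        t.log.add(new Rec("c", "3", "002"));  // insert not yet compacted

        System.out.println(t.readOptimized());    // {a=1, b=2}
        System.out.println(t.realTime());         // {a=1, b=20, c=3}
        System.out.println(t.incremental("001")); // {b=20, c=3}
    }
}
```

The sketch shows the trade-off the views describe: the read optimized view skips the merge step and so may lag behind recent writes, while the real-time view pays for the merge to return the latest data.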

Learn more about Hudi at https://hudi.apache.org

Building Apache Hudi from source

Building Hudi requires Java 8 on a *nix system. Check out the code and build the Maven project from the command line:

```
# Check out the code and build
git clone https://github.com/apache/incubator-hudi.git && cd incubator-hudi
mvn clean package -DskipTests -DskipITs
```

Quickstart

Try https://hudi.apache.org/quickstart.html to quickly explore Hudi's capabilities using spark-shell.
