
[DOCS] Update Hudi Readme (#1058)

- Add build status 
- Clean up layout
lamber-ken
2019-12-03 01:25:43 +08:00
committed by vinoth chandar
parent 784e3ad0b6
commit ff688107fa


@@ -15,11 +15,17 @@
 limitations under the License.
 -->
-# Hudi
+# Apache Hudi (Incubating)
 Apache Hudi (Incubating) (pronounced Hoodie) stands for `Hadoop Upserts Deletes and Incrementals`.
 Hudi manages the storage of large analytical datasets on DFS (Cloud stores, HDFS or any Hadoop FileSystem compatible storage).
-### Features
+<http://hudi.apache.org/>
+[![Build Status](https://travis-ci.org/apache/incubator-hudi.svg?branch=master)](https://travis-ci.org/apache/incubator-hudi)
+[![License](https://img.shields.io/badge/license-Apache%202-4EB1BA.svg)](https://www.apache.org/licenses/LICENSE-2.0.html)
+[![Maven Central](https://maven-badges.herokuapp.com/maven-central/org.apache.hudi/hudi/badge.svg)](http://search.maven.org/#search%7Cga%7C1%7Cg%3A%22org.apache.hudi%22)
+## Features
 * Upsert support with fast, pluggable indexing
 * Atomically publish data with rollback support
 * Snapshot isolation between writer & queries
@@ -29,16 +35,20 @@ Hudi manages the storage of large analytical datasets on DFS (Cloud stores, HDFS
 * Timeline metadata to track lineage
 Hudi provides the ability to query via three types of views:
-* **Read Optimized View** - Provides excellent snapshot query performance via purely columnar storage (e.g. [Parquet](https://parquet.apache.org/))
+* **Read Optimized View** - Provides excellent snapshot query performance via purely columnar storage (e.g. [Parquet](https://parquet.apache.org/)).
 * **Incremental View** - Provides a change stream with records inserted or updated after a point in time.
-* **Real-time View** - Provides snapshot queries on real-time data, using a combination of columnar & row-based storage (e.g Parquet + [Avro](http://avro.apache.org/docs/current/mr.html))
+* **Real-time View** - Provides snapshot queries on real-time data, using a combination of columnar & row-based storage (e.g [Parquet](https://parquet.apache.org/) + [Avro](http://avro.apache.org/docs/current/mr.html)).
 Learn more about Hudi at [https://hudi.apache.org](https://hudi.apache.org)
-### Building Apache Hudi from source {#building-hudi}
+## Building Apache Hudi from source {#building-hudi}
-Hudi requires Java 8 to be installed on a *nix system. Check out [code](https://github.com/apache/incubator-hudi) and
-normally build the maven project, from command line:
+Prerequisites for building Apache Hudi:
+* Unix-like system (like Linux, Mac OS X)
+* Java 8 (Java 9 or 10 may work)
+* Git
+* Maven
 ```
 # Checkout code and build
@@ -46,6 +56,6 @@ git clone https://github.com/apache/incubator-hudi.git && cd incubator-hudi
 mvn clean package -DskipTests -DskipITs
 ```
-### Quickstart
+## Quickstart
-Try [https://hudi.apache.org/quickstart.html](https://hudi.apache.org/quickstart.html) to quickly explore Hudi's capabilities using spark-shell.
+Please visit [https://hudi.apache.org/quickstart.html](https://hudi.apache.org/quickstart.html) to quickly explore Hudi's capabilities using spark-shell.