n3nash 1a29d46a57 - Fix realtime queries by removing COLUMN_ID and COLUMN_NAME cache in inputformat (#814)

- Hive on Spark will NOT work for RT tables after this patch

2019-08-02 16:06:34 -07:00

deploy

HUDI-125 : Change License for all source files and update RAT configurations

2019-06-09 11:41:55 -07:00

docker

HUDI-125 : Change License for all source files and update RAT configurations

2019-06-09 11:41:55 -07:00

hoodie-cli

HUDI-197 Hive Sync and othe CLIs using bundle picking sources jar instead of binary jar

2019-08-02 09:07:45 -07:00

hoodie-client

Cache RDD to avoid recomputing data ingestion. Return result RDD after updating index so that this step is not skipped by chained actions on the same RDD

2019-08-02 12:40:14 -07:00

hoodie-common

Allow HoodieWrapperFileSystem to wrap other proxy file-system implementations with no getScheme implementation (#793)

2019-07-24 21:31:46 -07:00

hoodie-hadoop-mr

- Fix realtime queries by removing COLUMN_ID and COLUMN_NAME cache in inputformat (#814)

2019-08-02 16:06:34 -07:00

hoodie-hive

HUDI-197 Hive Sync and othe CLIs using bundle picking sources jar instead of binary jar

2019-08-02 09:07:45 -07:00

hoodie-integ-test

HUDI-125 : Change License for all source files and update RAT configurations

2019-06-09 11:41:55 -07:00

hoodie-spark

HUDI-197 Hive Sync and othe CLIs using bundle picking sources jar instead of binary jar

2019-08-02 09:07:45 -07:00

hoodie-timeline-service

Ensure TableMetaClient and FileSystem instances have exclusive copy of Configuration

2019-06-20 14:05:00 -07:00

hoodie-utilities

HUDI-92 : Making deltastreamer with DistributedTestSource also run locally

2019-07-30 16:30:47 -07:00

packaging

Fix typo in hoodie-presto-bundle (#818)

2019-08-01 08:51:57 -07:00

release/config

- Ugrading to Hive 2.x

2019-06-13 12:46:14 -07:00

style

General enhancements

2018-12-18 12:52:39 -08:00

_config.yml

Set theme jekyll-theme-minimal

2016-12-29 16:53:39 -08:00

.gitignore

HUDI-92 : Making deltastreamer with DistributedTestSource also run locally

2019-07-30 16:30:47 -07:00

.travis.yml

Auto generated Slack Channel Notifications setup

2019-06-07 06:46:00 -07:00

CHANGELOG.md

Added CHANGELOG.md and updated community contributions guideline

2017-06-16 10:48:37 -07:00

KEYS

HUDI-178 : Add keys for vinoth to KEYS file

2019-08-02 05:25:44 -07:00

LICENSE.txt

Importing Hoodie Client from internal repo

2016-12-16 14:34:42 -08:00

NOTICE.txt

Changes related to Licensing work

2019-06-07 17:58:57 -07:00

pom.xml

HUDI-70 : Making DeltaStreamer run in continuous mode with concurrent compaction

2019-06-18 17:48:14 -07:00

README.md

[HUDI-181] Fix the Bold markdown grammar issue of README file (#808)

2019-07-30 03:47:53 -07:00

RELEASE_NOTES.md

Release notes for 0.4.7

2019-05-28 18:28:59 -07:00

README.md

Hudi

Apache Hudi (pronounced Hoodie) stands for Hadoop Upserts anD Incrementals. Hudi manages the storage of large analytical datasets on DFS (cloud stores, HDFS, or any Hadoop FileSystem-compatible storage) and provides the ability to query them through three types of views:

- Read Optimized View - Provides excellent query performance via purely columnar storage (e.g. Parquet).
- Incremental View - Provides a change stream with records inserted or updated after a point in time.
- Real time View - Provides queries on real-time data, using a combination of columnar and row-based storage (e.g. Parquet + Avro).
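
To make the Incremental View idea concrete, here is a minimal sketch in plain Java of "records inserted or updated after a point in time". It does not use the Hudi API: the class `IncrementalViewSketch`, the type `VersionedRecord`, the method `changedAfter`, and the sample data are all hypothetical. It also assumes commit times are fixed-width `yyyyMMddHHmmss` strings, so that comparing them as text matches chronological order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration only; none of these names come from the Hudi API. */
public class IncrementalViewSketch {

    /** A record tagged with the commit time at which it was written. */
    static final class VersionedRecord {
        final String key;
        final String value;
        final String commitTime; // assumed format: yyyyMMddHHmmss, sortable as text

        VersionedRecord(String key, String value, String commitTime) {
            this.key = key;
            this.value = value;
            this.commitTime = commitTime;
        }
    }

    /** Returns the records written strictly after the given commit time, in input order. */
    static List<VersionedRecord> changedAfter(List<VersionedRecord> records, String sinceCommitTime) {
        List<VersionedRecord> changed = new ArrayList<>();
        for (VersionedRecord r : records) {
            // Fixed-width digit strings compare lexicographically in time order.
            if (r.commitTime.compareTo(sinceCommitTime) > 0) {
                changed.add(r);
            }
        }
        return changed;
    }

    public static void main(String[] args) {
        List<VersionedRecord> table = Arrays.asList(
            new VersionedRecord("k1", "a", "20190730033000"),
            new VersionedRecord("k2", "b", "20190801085157"),
            new VersionedRecord("k1", "c", "20190802160634"));

        // Ask only for changes made after the start of 2019-07-31.
        for (VersionedRecord r : changedAfter(table, "20190731000000")) {
            System.out.println(r.key + " -> " + r.value + " @ " + r.commitTime);
        }
    }
}
```

A consumer that remembers the last commit time it processed can pass it as `sinceCommitTime` on the next run and receive only newer changes. How Hudi itself serves this view is not shown here.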

For more, head over here

Languages

Java 81.4%

Scala 16.7%

ANTLR 0.9%

Shell 0.8%

Dockerfile 0.2%