Commit Graph

615 Commits

Author SHA1 Message Date
lamber-ken
045fa87a3d [HUDI-330] add EmptyStatement java checkstyle rule 2019-11-13 14:11:11 -08:00
Balaji Varadarajan
8ff06ddb0f [HUDI-80] Leverage Commit metadata to figure out partitions to be cleaned for Cleaning by commits mode (#1008) 2019-11-12 06:12:44 -08:00
Udit Mehrotra
0bb5999f79 [HUDI-306] Support Glue catalog and other hive metastore implementations (#961)
- Support Glue catalog and other metastore implementations
- Remove shading from hudi utilities bundle
- Add maven profile to optionally shade hive in utilities bundle
2019-11-11 17:27:31 -08:00
Balaji Varadarajan
1032fc3e54 [HUDI-137] Hudi cleaning state changes should be consistent with compaction actions
Before this change, Cleaner performs cleaning of old file versions and then stores the deleted files in .clean files.
With this setup, we will not be able to track file deletions if a cleaner fails after deleting files but before writing .clean metadata.
This is fine for regular file-system view generation but Incremental timeline syncing relies on clean/commit/compaction metadata to keep a consistent file-system view.

Cleaner state transitions are now similar to those of compaction.

1. Requested : HoodieWriteClient.scheduleClean() selects the list of files that need to be deleted and stores them in metadata
2. Inflight : HoodieWriteClient marks the state to be inflight before it starts deleting
3. Completed : HoodieWriteClient marks the state as completed after finishing the deletion according to the cleaner plan
2019-11-11 10:40:16 -08:00
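
The three cleaner states described in the HUDI-137 message above can be illustrated with a small state machine. This is only a sketch in Python and not Hudi's implementation: `CleanState`, `CleanAction`, and the `delete_file` callback are hypothetical names introduced for illustration, and the plan is kept in memory rather than written to timeline metadata. It shows why recording the plan before deleting helps: if a deletion fails, the action stays inflight and the intended file list is still available.

```python
from enum import Enum


class CleanState(Enum):
    REQUESTED = "requested"
    INFLIGHT = "inflight"
    COMPLETED = "completed"


# Allowed transitions, mirroring the three steps in the commit message.
_NEXT = {
    CleanState.REQUESTED: CleanState.INFLIGHT,
    CleanState.INFLIGHT: CleanState.COMPLETED,
}


class CleanAction:
    """Hypothetical model of one clean action; not a Hudi class."""

    def __init__(self, files_to_delete):
        # Requested: the plan (files to delete) is recorded before any deletion.
        self.plan = list(files_to_delete)
        self.deleted = []
        self.state = CleanState.REQUESTED

    def _advance(self, target):
        # Reject any transition that skips or repeats a state.
        if _NEXT.get(self.state) is not target:
            raise ValueError(
                f"illegal transition {self.state.name} -> {target.name}"
            )
        self.state = target

    def run(self, delete_file):
        # Inflight: marked before the first file is deleted.
        self._advance(CleanState.INFLIGHT)
        for path in self.plan:
            delete_file(path)  # may raise; the state then stays INFLIGHT
            self.deleted.append(path)
        # Completed: marked only after every planned deletion succeeded.
        self._advance(CleanState.COMPLETED)
```

If `delete_file` raises partway through, `state` remains `INFLIGHT` while `plan` and `deleted` record what was intended and what was actually removed, which is the information a later recovery or timeline sync would need.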
Sivabalan Narayanan
23b303e4b1 [HUDI-218] Adding Presto support to Integration Test (#1003) 2019-11-11 06:21:49 -08:00
Pratyaksh Sharma
5f1309407a [HUDI-253]: added validations for schema provider class (#995) 2019-11-11 06:03:44 -08:00
vinoth chandar
1483b97018 [DOCS] Change Hudi acronyms to plural 2019-11-10 12:39:58 -08:00
Jeff G
1ce3d891ce [DOCS] Update to align with original Uber whitepaper (#999) 2019-11-10 12:38:13 -08:00
pratyakshsharma
0863b1cfd9 [HUDI-245]: replaced instances of getInstants() and reverse() with getReverseOrderedInstants() (#1000) 2019-11-07 08:42:48 -08:00
pratyakshsharma
20871a17b2 [HUDI-302]: simplified countInstants() method in HoodieDefaultTimeline (#997) 2019-11-06 12:56:09 -08:00
Gurudatt Kulkarni
71ac2c0d5e [HUDI-324] TimestampKeyGenerator should support milliseconds (#993) 2019-11-05 04:22:14 -08:00
Bhavani Sudha Saktheeswaran
04834817c8 [MINOR] Add features and instructions to build Hudi in README (#992) 2019-11-03 01:48:06 -08:00
Raymond Xu
91740635b2 [HUDI-321] Support bulkinsert in HDFSParquetImporter (#987)
- Add bulk insert feature
- Fix some minor issues
2019-11-02 23:12:44 -07:00
Wenning Ding
bd77dc792c Add MOR integration testing 2019-11-02 19:49:04 -07:00
Wenning Ding
b6057c5e0e [HUDI-314] Fix multi partition keys error when querying a realtime table 2019-11-02 19:49:04 -07:00
Balaji Varadarajan
a6390aefc4 [HUDI-312] Make docker hdfs cluster ephemeral. This is needed to fix flakiness in integration tests. Also, Fix DeltaStreamer hanging issue due to uncaught exception 2019-11-01 11:49:59 -07:00
dependabot[bot]
144ea4eedf Bump httpclient from 4.3.2 to 4.3.6 (#980)
Bumps httpclient from 4.3.2 to 4.3.6.

Signed-off-by: dependabot[bot] <support@github.com>
2019-11-01 05:22:31 -07:00
dependabot[bot]
74d8e625c5 Bump checkstyle from 8.8 to 8.18 (#981)
Bumps [checkstyle](https://github.com/checkstyle/checkstyle) from 8.8 to 8.18.
- [Release notes](https://github.com/checkstyle/checkstyle/releases)
- [Commits](https://github.com/checkstyle/checkstyle/compare/checkstyle-8.8...checkstyle-8.18)

Signed-off-by: dependabot[bot] <support@github.com>
2019-11-01 05:06:03 -07:00
Wenning Ding
ee0fd06de7 synchronized lock on conf object instead of class 2019-10-31 21:54:27 -07:00
Wenning Ding
3251d62bd3 [HUDI-313] Fix select count star error when querying a realtime table 2019-10-31 21:54:27 -07:00
Guru107
eda472adb0 [MINOR] Fix avro schema warnings in build 2019-10-31 21:49:38 -07:00
leesf
7c7403a59d [MINOR] fix annotation in teardown (#990) 2019-10-31 07:59:35 -07:00
leesf
b0838d25f7 [MINOR] Fix no output in travis (#984) 2019-10-29 21:17:45 -07:00
leesf
ef5001e432 [MINOR] Fix vm crashes (#979) 2019-10-28 16:25:07 -07:00
Balaji Varadarajan
c23da694cc [HUDI-169] Speed up rolling back of instants (#968) 2019-10-24 19:34:00 -07:00
Balaji Varadarajan
d8be818ac9 [HUDI-130] Paths written in compaction plan needs to be relative to base-path 2019-10-23 02:52:24 -07:00
vinoth chandar
e4c91ed13f [HUDI-290] Normalize test class name of all test classes (#951) 2019-10-22 20:19:11 -07:00
Gurudatt Kulkarni
031b067a3a [MINOR] Move all repository declarations to parent pom (#966) 2019-10-22 20:17:13 -07:00
Amit Prabhu
4529f535b2 [MINOR] Add backtick escape while syncing partition fields (#967) 2019-10-22 20:16:16 -07:00
Balaji Varadarajan
14dd649d06 [MINOR] Remove release notes and move confluent repository to hoodie parent pom 2019-10-21 14:16:05 -07:00
vinoth chandar
dfdc0e40e1 [HUDI-283] : Ensure a sane minimum for merge buffer memory (#964)
- Some environments e.g spark-shell provide 0 for memory size
- This causes unnecessary performance degradation
2019-10-20 21:00:04 -07:00
YanJia-Gary-Li
ed745dfdbf [HUDI-40] Add parquet support for the Delta Streamer (#949) 2019-10-16 21:11:59 -07:00
Balaji Varadarajan
7381b66194 [HUDI-121] Fix issues in release scripts 2019-10-16 03:33:57 -07:00
Balaji Varadarajan
603df66938 Update RELEASE Notes in master 2019-10-16 02:19:29 -07:00
Balaji Varadarajan
77f4e73615 [HUDI-121] Fix licensing issues found during RC voting by general incubator group 2019-10-16 02:09:02 -07:00
Mehrotra
8c13340062 Shade and relocate Avro dependency in hadoop-mr-bundle 2019-10-16 02:08:12 -07:00
Wenning Ding
1c09f5b055 [HUDI-301] fix path error when update a non-partition MOR table 2019-10-16 02:07:33 -07:00
Udit Mehrotra
12523c379f [HUDI-298] Fix issue with incorrect column mapping causing bad data, during on-the-fly merge of Real Time tables (#956)
* Fix issue with incorrect column mapping causing bad data, during on-the-fly merge of Real Time tables
2019-10-16 02:05:53 -07:00
Anurag870
c052167c06 [Docs] Update README.md (#955) 2019-10-13 21:02:25 -07:00
leesf
e10e06918e [HUDI-292] Avoid consuming more entries from kafka than specified sourceLimit. (#947)
- Special handling when allocedEvents > numEvents
- Added unit tests
2019-10-11 05:28:45 -07:00
leesf
b19bed442d [HUDI-296] Explore use of spotless to auto fix formatting errors (#945)
- Add spotless format fixing to project
- One time reformatting for conformity
- Build fails for formatting changes and mvn spotless:apply autofixes them
2019-10-10 05:19:40 -07:00
Balaji Varadarajan
834c591955 [MINOR] Add incubating to NOTICE and README.md
2019-10-09 21:42:29 -07:00
vinoth chandar
e655cfba30 [HOTFIX] Move to openjdk to get travis passing (#944) 2019-10-08 12:06:47 -07:00
leesf
d050d98071 [HUDI-232] Implement sealing/unsealing for HoodieRecord class (#938) 2019-10-07 10:56:46 -07:00
Balaji Varadarajan
8a55938ca1 [HUDI-293] Remove KEYS file from github repository 2019-10-04 22:39:07 -07:00
Balaji Varadarajan
eb4cc05c1b [HUDI-121] Prepare for 0.5.0-incubating-rc5 2019-10-04 09:22:36 -07:00
Balaji Varadarajan
9b66ea41fd [HUDI-121] Remove leftover notice file and replace com.uber.hoodie with org.apache.hudi in log4j properties 2019-10-04 09:18:57 -07:00
leesf
7dd9c74b1b [HUDI-285] Implement HoodieStorageWriter based on actual file type (#936) 2019-10-04 07:45:16 -07:00
leesf
3dedc7e5fd [HUDI-265] Failed to delete tmp dirs created in unit tests (#928) 2019-10-03 09:48:13 -07:00
Balaji Varadarajan
cef06c1e48 [HUDI-121] Fix bug in validation in deploy_staging_jars.sh 2019-10-03 09:42:59 -07:00