1
0
Commit Graph

12 Commits

Author SHA1 Message Date
Satish Kotha
6dc03b65bf [HUDI-1075] Implement simple clustering strategies to create ClusteringPlan and to run the plan 2020-12-21 17:34:15 -08:00
satishkotha
33ec88fc38 [HUDI-1352] Add FileSystemView APIs to query pending clustering operations (#2202) 2020-11-05 08:49:58 -08:00
satishkotha
a99e93bed5 [HUDI-1072] Introduce REPLACE top level action. Implement insert_overwrite operation on top of replace action (#2048) 2020-09-29 17:04:25 -07:00
wenningd
8b928e9bca [HUDI-808] Support cleaning bootstrap source data (#1870)
Co-authored-by: Wenning Ding <wenningd@amazon.com>
Co-authored-by: Balaji Varadarajan <vbalaji@apache.org>
2020-08-11 01:43:46 -07:00
vinoth chandar
539621bd33 [HUDI-242] Support for RFC-12/Bootstrapping of external datasets to hudi (#1876)
- [HUDI-418] Bootstrap Index Implementation using HFile with unit-test
 - [HUDI-421] FileSystem View Changes to support Bootstrap with unit-tests
 - [HUDI-424] Implement Query Side Integration for querying tables containing bootstrap file slices
 - [HUDI-423] Implement upsert functionality for handling updates to these bootstrap file slices
 - [HUDI-421] Bootstrap Write Client with tests
 - [HUDI-425] Added HoodieDeltaStreamer support
 - [HUDI-899] Add a knob to change partition-path style while performing metadata bootstrap
 - [HUDI-900] Metadata Bootstrap Key Generator needs to handle complex keys correctly
 - [HUDI-424] Simplify Record reader implementation
 - [HUDI-423] Implement upsert functionality for handling updates to these bootstrap file slices
 - [HUDI-420] Hoodie Demo working with hive and sparkSQL. Also, Hoodie CLI working with bootstrap tables

Co-authored-by: Mehrotra <uditme@amazon.com>
Co-authored-by: Vinoth Chandar <vinoth@apache.org>
Co-authored-by: Balaji Varadarajan <varadarb@uber.com>
2020-08-03 20:19:21 -07:00
hongdd
8306205d7a [HUDI-332]Add operation type (insert/upsert/bulkinsert/delete) to HoodieCommitMetadata (#1157)
[HUDI-332]Add operation type (insert/upsert/bulkinsert/delete) to HoodieCommitMetadata (#1157)
2020-03-03 10:10:29 -08:00
Balaji Varadarajan
9a1f698eef [HUDI-308] Avoid Renames for tracking state transitions of all actions on dataset 2019-12-15 21:26:30 -08:00
Balaji Varadarajan
1032fc3e54 [HUDI-137] Hudi cleaning state changes should be consistent with compaction actions
Before this change, Cleaner performs cleaning of old file versions and then stores the deleted files in .clean files.
With this setup, we will not be able to track file deletions if a cleaner fails after deleting files but before writing .clean metadata.
This is fine for regular file-system view generation but Incremental timeline syncing relies on clean/commit/compaction metadata to keep a consistent file-system view.

Cleaner state transitions is now similar to that of compaction.

1. Requested : HoodieWriteClient.scheduleClean() selects the list of files that needs to be deleted and stores them in metadata
2. Inflight : HoodieWriteClient marks the state to be inflight before it starts deleting
3. Completed : HoodieWriteClient marks the state after completing the deletion according to the cleaner plan
2019-11-11 10:40:16 -08:00
Guru107
eda472adb0 [MINOR] Fix avro schema warnings in build 2019-10-31 21:49:38 -07:00
Balaji Varadarajan
d8be818ac9 [HUDI-130] Paths written in compaction plan needs to be relative to base-path 2019-10-23 02:52:24 -07:00
Balaji Varadarajan
58623631d4 [HUDI-249] Update Release-notes. Add sign-artifacts to POM and release related scripts. Add missing license headers 2019-09-13 08:41:29 -07:00
Balaji Varadarajan
a4f9d7575f HUDI-123 Rename code packages/constants to org.apache.hudi (#830)
- Rename com.uber.hoodie to org.apache.hudi
- Flag to pass com.uber.hoodie Input formats for hoodie-sync
- Works with HUDI demo. 
- Also tested for backwards compatibility with datasets built by com.uber.hoodie packages
- Migration guide : https://cwiki.apache.org/confluence/display/HUDI/Migration+Guide+From+com.uber.hoodie+to+org.apache.hudi
2019-08-11 17:48:17 -07:00