- Add a new action called INDEX, whose state transition is described in the RFC.
- Changes in timeline to support the new action.
- Add an index planner in ScheduleIndexActionExecutor.
- Add index plan executor in RunIndexActionExecutor.
- Add 3 APIs in HoodieTableMetadataWriter:
  a) scheduleIndex: generates an index plan based on the latest completed instant, initializes file groups, and adds a requested INDEX instant.
  b) index: executes the index plan and also takes care of writes that happened after indexing was requested.
  c) dropIndex: drops the index by removing the given metadata partition.
- Add 2 new table configs to serve as the source of truth for inflight and completed indexes.
- Support upgrade/downgrade taking care of the newly added configs.
- Add tool to trigger indexing in HoodieIndexer.
- Handle corner cases related to partial failures.
- Abort gracefully after deleting partition and instant.
- Handle other timeline actions that need to be considered before catching up
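The scheduleIndex/index lifecycle above can be sketched as a small state machine. This is a minimal illustration only; `IndexAction`, its `State` enum, and the method bodies are stand-ins for the description above, not the actual Hudi classes:

```java
import java.util.ArrayList;
import java.util.List;

public class IndexAction {
    public enum State { NONE, REQUESTED, INFLIGHT, COMPLETED }

    private State state = State.NONE;
    private final List<String> partitionsToIndex = new ArrayList<>();

    // scheduleIndex: build a plan from the latest completed instant and
    // record a requested INDEX instant on the timeline.
    public State scheduleIndex(List<String> partitions) {
        if (state != State.NONE) {
            throw new IllegalStateException("index already scheduled");
        }
        partitionsToIndex.addAll(partitions);
        state = State.REQUESTED;
        return state;
    }

    // index: execute the plan, then catch up on writes that landed after
    // indexing was requested, before marking the instant complete.
    public State index() {
        if (state != State.REQUESTED) {
            throw new IllegalStateException("no index plan to execute");
        }
        state = State.INFLIGHT;
        // ... build index file groups, then replay post-schedule commits ...
        state = State.COMPLETED;
        return state;
    }

    // dropIndex: drop the index by removing the given metadata partition.
    public State dropIndex(String partition) {
        partitionsToIndex.remove(partition);
        return state;
    }
}
```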
As of now, delete-partition operations ensure all file groups are deleted, but the partition itself is not deleted. So listing all partitions may return the deleted partitions as well, although no data will be served since all their file groups are gone. This patch fixes that: we let the cleaner take care of deleting a partition once all file groups pertaining to it are deleted.
- Fixed the CleanPlanActionExecutor to return meta info about list of partitions to be deleted. If there are no valid file groups for a partition, clean planner will include the partition to be deleted.
- Fixed HoodieCleanPlan avro schema to include the list of partitions to be deleted
- CleanActionExecutor is fixed to delete partitions if any (as per clean plan)
- Same info is added to HoodieCleanMetadata
- When applying clean metadata, the metadata table will check for partitions to be deleted and update the "all_partitions" record for the deleted partitions.
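The clean-planner change above can be sketched in a few lines. This is an illustrative model, not the actual `CleanPlanActionExecutor` code: a partition whose surviving file-group list is empty gets flagged for deletion in the plan:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class CleanPlanSketch {
    // Input maps each partition to the file groups that remain valid after
    // the clean. Any partition with no surviving file groups is returned so
    // the CleanActionExecutor can delete the partition path itself.
    public static List<String> partitionsToDelete(Map<String, List<String>> survivingFileGroups) {
        List<String> toDelete = new ArrayList<>();
        for (Map.Entry<String, List<String>> e : survivingFileGroups.entrySet()) {
            if (e.getValue().isEmpty()) {
                toDelete.add(e.getKey()); // no valid file groups left
            }
        }
        return toDelete;
    }
}
```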
Co-authored-by: sivabalan <n.siva.b@gmail.com>
- This adds a restore plan and serializes it to the restore.requested meta file in the timeline. This also means we are introducing schedule and execution phases for restore, which were not present before.
* [HUDI-1295] Metadata Index - Bloom filter and Column stats index to speed up index lookups
- Today, base files store their bloom filter in the footer, so index lookups
have to load the base file to perform any bloom checks. Although we have
interval-tree-based file pruning, we still end up with a significant amount
of base file reads just to fetch the bloom filters for the final key
lookups. This index lookup operation can be made more performant by keeping
all the bloom filters in a new metadata partition and doing pointed
lookups based on keys.
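The contrast above can be illustrated with a toy model; the class below is not the Hudi API, and the bloom filter is modeled as a plain key set for brevity. The point is that the filters live in one keyed metadata partition, so a membership check is a pointed read instead of opening each base file's footer:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

public class BloomMetaIndexSketch {
    // The "bloom_filters" metadata partition, modeled as a map from
    // (partitionPath, fileName) key to the keys the filter may contain.
    private final Map<String, Set<String>> bloomPartition = new HashMap<>();

    private static String indexKey(String partitionPath, String fileName) {
        return partitionPath + "/" + fileName;
    }

    public void putFilter(String partitionPath, String fileName, Set<String> mayContain) {
        bloomPartition.put(indexKey(partitionPath, fileName), mayContain);
    }

    // Pointed lookup by key: no base-file footer read is needed.
    public boolean mightContain(String partitionPath, String fileName, String recordKey) {
        Set<String> filter = bloomPartition.get(indexKey(partitionPath, fileName));
        return filter != null && filter.contains(recordKey);
    }
}
```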
* [HUDI-1295] Metadata Index - Bloom filter and Column stats index to speed up index lookups
- Adding indexing support for clean, restore and rollback operations.
Each of these operations will now be converted to index records for
bloom filter and column stats additionally.
* [HUDI-1295] Metadata Index - Bloom filter and Column stats index to speed up index lookups
- Making hoodie key consistent for both column stats and bloom index by
including fileId instead of fileName, in both read and write paths.
- Performance optimization for looking up records in the metadata table.
- Avoiding multi column sorting needed for HoodieBloomMetaIndexBatchCheckFunction
* [HUDI-1295] Metadata Index - Bloom filter and Column stats index to speed up index lookups
- HoodieBloomMetaIndexBatchCheckFunction cleanup to remove unused classes
- Base file checking before reading the file footer for bloom or column stats
* [HUDI-1295] Metadata Index - Bloom filter and Column stats index to speed up index lookups
- Updating the bloom index and column stats index to have full file name
included in the key instead of just file id.
- Minor test fixes.
* [HUDI-1295] Metadata Index - Bloom filter and Column stats index to speed up index lookups
- Fixed flink commit method to handle metadata table all partition update records
- TestBloomIndex fixes
* [HUDI-1295] Metadata Index - Bloom filter and Column stats index to speed up index lookups
- SparkHoodieBloomIndexHelper code simplification for various config modes
- Signature change for getBloomFilters() and getColumnStats(). Callers just
pass in the partition and file names of interest; the index key is then
constructed internally from the passed-in parameters.
- KeyLookupHandle and KeyLookupResults code refactoring
- Metadata schema changes - removed the reserved field
* [HUDI-1295] Metadata Index - Bloom filter and Column stats index to speed up index lookups
- Removing HoodieColumnStatsMetadata and using HoodieColumnRangeMetadata instead.
Fixed the users of the removed class.
* [HUDI-1295] Metadata Index - Bloom filter and Column stats index to speed up index lookups
- Extending meta index test to cover deletes, compactions, clean
and restore table operations. Also, fixed the getBloomFilters()
and getColumnStats() to account for deleted entries.
* [HUDI-1295] Metadata Index - Bloom filter and Column stats index to speed up index lookups
- Addressing review comments: javadoc for new classes, key sorting for
lookups, index method renames.
* [HUDI-1295] Metadata Index - Bloom filter and Column stats index to speed up index lookups
- Consolidated the bloom filter checking for keys into one
HoodieMetadataBloomIndexCheckFunction instead of separate batch
and lazy modes. Removed all the configs around it.
- Made the metadata table partition file group count configurable.
- Fixed the HoodieKeyLookupHandle to have auto closable file reader
when checking bloom filter and range keys.
- Config property renames. Test fixes.
* [HUDI-1295] Metadata Index - Bloom filter and Column stats index to speed up index lookups
- Enabling column stats indexing for all columns by default
- Handling column stat generation errors and test update
* [HUDI-1295] Metadata Index - Bloom filter and Column stats index to speed up index lookups
- Metadata table partition file group count taken from the slices when
the table is bootstrapped.
- Prep records for the commit refactored to the base class
- HoodieFileReader interface changes for filtering keys
- Multi-column and data type support for the column stats index
* [HUDI-1295] Metadata Index - Bloom filter and Column stats index to speed up index lookups
- Rebased to latest master and merged fixes for the build and test failures
* [HUDI-1295] Metadata Index - Bloom filter and Column stats index to speed up index lookups
- Extending the metadata column stats type payload schema to include
more statistics about the column ranges to help query integration.
* [HUDI-1295] Metadata Index - Bloom filter and Column stats index to speed up index lookups
- Addressing review comments
This change addresses issues where the Metadata Table ingested duplicated records, leading it to persist incorrect file sizes for the files referred to in those records.
There were multiple issues leading to that:
- [HUDI-3322] Incorrect Rollback Plan generation: Rollback Plan generated for MOR tables was overly expansively listing all log-files with the latest base-instant as the ones that have been affected by the rollback, leading to invalid MT records being ingested referring to those.
- [HUDI-3343] Metadata Table including Uncommitted Log Files during Bootstrap: Since MT is bootstrapped at the end of the commit operation execution (after FS activity, but before committing to the timeline), it was actually incorrectly ingesting some files that were part of the intermediate state of the operation being committed.
This change unblocks the stack of PRs based off #4556
- This patch introduces rollback plan and rollback.requested instant. Rollback will be done in two phases, namely rollback plan and rollback action. In planning, we prepare the rollback plan and serialize it to rollback.requested. In the rollback action phase, we fetch details from the plan and just delete the files as per the plan. This will ensure final rollback commit metadata will contain all files that got rolled back even if rollback failed midway and retried again.
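The two-phase flow above can be sketched as follows; `RollbackSketch` and its method names are illustrative, not actual Hudi classes. The key property being demonstrated is that execution deletes exactly what the plan lists, so a retried rollback still reports the full file set even if a prior attempt already deleted some files:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

public class RollbackSketch {
    // Phase 1 (planning): decide which files to delete; the result is what
    // gets serialized into the rollback.requested instant.
    public static List<String> planRollback(List<String> filesWrittenByFailedCommit) {
        return new ArrayList<>(filesWrittenByFailedCommit);
    }

    // Phase 2 (action): delete per the plan. Deleting an already-missing file
    // is a no-op, so a retry after a mid-way failure is safe, and the final
    // rollback metadata still lists every planned file.
    public static List<String> executeRollback(List<String> plan, Set<String> fs) {
        List<String> rolledBack = new ArrayList<>();
        for (String f : plan) {
            fs.remove(f);      // idempotent delete
            rolledBack.add(f); // recorded regardless of whether it still existed
        }
        return rolledBack;
    }
}
```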
- Fix problem of archiving replace commits
- Fix problem when getting an empty replacecommit.requested
- Improved the logic of handling empty and non-empty requested/inflight commit files. Added unit tests to cover both empty and non-empty inflight files cases and cleaned up some unused test util methods
Co-authored-by: yorkzero831 <yorkzero8312@gmail.com>
Co-authored-by: zheren.yu <zheren.yu@paypay-corp.co.jp>
- Adds a field to RollbackMetadata that captures the log files written for rollback blocks
- Adds a field to RollbackMetadata that captures new log files written by unsynced deltacommits
Co-authored-by: Vinoth Chandar <vinoth@apache.org>
- Introduced an internal metadata table that stores file listings.
- Metadata table is kept up to date with each instant on the data timeline.
- Fixed handling of CleanerPlan.
- [HUDI-842] Reduce parallelism to speed up the test.
- [HUDI-842] Implementation of CLI commands for metadata operations and lookups.
- [HUDI-842] Extend rollback metadata to include the files which have been appended to.
- [HUDI-842] Support for rollbacks in MOR Table.
- MarkerBasedRollbackStrategy needs to correctly provide the list of files for which rollback blocks were appended.
- [HUDI-842] Added unit test for rollback of partial commits (inflight but not completed yet).
- [HUDI-842] Handled the error case where metadata update succeeds but dataset commit fails.
- [HUDI-842] Schema evolution strategy for Metadata Table. Each type of metadata saved (FilesystemMetadata, ColumnIndexMetadata, etc.) will be a separate field with default null. The type of the record will identify the valid field. This way, we can grow the schema when new types of information are saved, while still keeping it backward compatible.
- [HUDI-842] Fix non-partitioned case and speed up initial creation of metadata table. Choose only 1 partition for the jsc as the number of records is low (hundreds to thousands). There is more overhead in creating a large number of partitions for the JavaRDD, and it slows down operations like WorkloadProfile.
For the non-partitioned case, use "." as the name of the partition to prevent empty keys in HFile.
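The non-partitioned fix above amounts to a one-line key normalization; the helper below is a sketch of the idea (names are illustrative), mapping an empty partition path to "." so no HFile key is ever empty:

```java
public class PartitionKeySketch {
    // Non-partitioned tables have an empty partition path; store it as "."
    // so the resulting HFile key is never empty.
    public static String toMetadataKey(String partitionPath) {
        return (partitionPath == null || partitionPath.isEmpty()) ? "." : partitionPath;
    }
}
```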
- [HUDI-842] Reworked metrics publishing.
- Code has been split into reader and writer side. HoodieMetadata code is accessed via HoodieTable.metadata(), which returns the metadata instance for the table.
Code is serializable to allow executors to use the functionality.
- [RFC-15] Add metrics to track the time for each file system call.
- [RFC-15] Added a distributed metrics registry for spark which can be used to collect metrics from executors. This helps create a stats dashboard which shows the metadata table improvements in real-time for production tables.
- [HUDI-1321] Created HoodieMetadataConfig to specify configuration for the metadata table. This is safer than full-fledged properties for the metadata table (like HoodieWriteConfig), which would make tuning the metadata table burdensome. With a limited configuration surface, we can control the performance of the metadata table closely.
[HUDI-1319][RFC-15] Adding interfaces for HoodieMetadata, HoodieMetadataWriter (apache#2266)
- moved MetadataReader to HoodieBackedTableMetadata, under the HoodieTableMetadata interface
- moved MetadataWriter to HoodieBackedTableMetadataWriter, under the HoodieTableMetadataWriter
- Pulled all the metrics into HoodieMetadataMetrics
- Writer now wraps the metadata, instead of extending it
- New enum for MetadataPartitionType
- Streamlined code flow inside HoodieBackedTableMetadataWriter w.r.t initializing metadata state
- [HUDI-1319] Make async operations work with metadata table (apache#2332)
- Changes the syncing model to only move over completed instants on data timeline
- Syncing happens postCommit and on writeClient initialization
- Latest delta commit on the metadata table is sufficient as the watermark for data timeline archival
- Cleaning/Compaction use a suffix to the last instant written to the metadata table, such that we keep the 1-1 mapping between data and metadata timelines.
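The suffixing scheme above can be sketched as simple instant-time manipulation. This is a toy model: the fixed 3-digit "001" suffix and the class name are assumptions for illustration, not the exact Hudi convention:

```java
public class InstantSuffixSketch {
    private static final String TABLE_SERVICE_SUFFIX = "001"; // illustrative value

    // A metadata-table compaction reuses the last synced data instant time
    // plus a suffix, so it never collides with a data-timeline instant.
    public static String compactionInstant(String lastSyncedDataInstant) {
        return lastSyncedDataInstant + TABLE_SERVICE_SUFFIX;
    }

    // Stripping the suffix recovers the originating data instant, which
    // preserves the 1-1 mapping between the two timelines.
    public static String dataInstantFor(String metadataInstant) {
        return metadataInstant.length() > TABLE_SERVICE_SUFFIX.length()
            ? metadataInstant.substring(0, metadataInstant.length() - TABLE_SERVICE_SUFFIX.length())
            : metadataInstant;
    }
}
```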
- Got rid of a lot of the complexity around checking for valid commits during open of base/log files
- Tests now use local FS, to simulate more failure scenarios
- Some failure scenarios exposed HUDI-1434, which is needed for MOR to work correctly
co-authored by: Vinoth Chandar <vinoth@apache.org>
- [HUDI-418] Bootstrap Index Implementation using HFile with unit-test
- [HUDI-421] FileSystem View Changes to support Bootstrap with unit-tests
- [HUDI-424] Implement Query Side Integration for querying tables containing bootstrap file slices
- [HUDI-423] Implement upsert functionality for handling updates to these bootstrap file slices
- [HUDI-421] Bootstrap Write Client with tests
- [HUDI-425] Added HoodieDeltaStreamer support
- [HUDI-899] Add a knob to change partition-path style while performing metadata bootstrap
- [HUDI-900] Metadata Bootstrap Key Generator needs to handle complex keys correctly
- [HUDI-424] Simplify Record reader implementation
- [HUDI-423] Implement upsert functionality for handling updates to these bootstrap file slices
- [HUDI-420] Hoodie Demo working with hive and sparkSQL. Also, Hoodie CLI working with bootstrap tables
Co-authored-by: Mehrotra <uditme@amazon.com>
Co-authored-by: Vinoth Chandar <vinoth@apache.org>
Co-authored-by: Balaji Varadarajan <varadarb@uber.com>
Before this change, the Cleaner performed cleaning of old file versions and then stored the deleted files in .clean files.
With this setup, we could not track file deletions if the cleaner failed after deleting files but before writing the .clean metadata.
This is fine for regular file-system view generation but Incremental timeline syncing relies on clean/commit/compaction metadata to keep a consistent file-system view.
Cleaner state transitions are now similar to those of compaction.
1. Requested : HoodieWriteClient.scheduleClean() selects the list of files that needs to be deleted and stores them in metadata
2. Inflight : HoodieWriteClient marks the state to be inflight before it starts deleting
3. Completed : HoodieWriteClient marks the state after completing the deletion according to the cleaner plan
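The three steps above can be sketched as a tiny state machine; the enum and class names below are illustrative, not the actual Hudi timeline classes:

```java
public class CleanerStateSketch {
    public enum State { REQUESTED, INFLIGHT, COMPLETED }

    // Valid transitions mirror compaction: REQUESTED -> INFLIGHT -> COMPLETED.
    // A completed clean has no further transition.
    public static State next(State s) {
        switch (s) {
            case REQUESTED: return State.INFLIGHT;   // deletion is starting
            case INFLIGHT:  return State.COMPLETED;  // plan fully executed
            default: throw new IllegalStateException("clean already completed");
        }
    }
}
```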