
[HUDI-3356][HUDI-3203] HoodieData for metadata index records; BloomFilter construction from index based on the type param (#4848)

Rework of #4761 
This diff introduces the following changes:

- Write stats are converted to metadata index records during the commit. These records now use the HoodieData type so that record generation scales with the workload.
- Metadata index initialization support for the bloom filter and column stats partitions.
- When building the BloomFilter from the index records, the type param stored in the payload is used instead of a hardcoded type.
- Delta writes can change column ranges, and the column stats index needs to be updated with the new ranges to stay consistent with the table dataset. This fix adds column stats index update support for delta writes.
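
To illustrate the last point, here is a minimal sketch of how a delta write can widen an indexed column range. It is not the code from this commit: `ColumnRangeMergeSketch`, `ColumnRange`, and `applyDelta` are hypothetical names, and the merge rule (take the smaller min, the larger max, and sum the null counts) is an assumption about what a consistent update looks like.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified stand-in for column stats index handling.
// None of these names come from the Hudi code base.
class ColumnRangeMergeSketch {

  // Simplified per-column statistics: value range plus null count.
  static final class ColumnRange {
    final long min;
    final long max;
    final long nullCount;

    ColumnRange(long min, long max, long nullCount) {
      this.min = min;
      this.max = max;
      this.nullCount = nullCount;
    }

    // Assumed merge rule: the combined range covers both inputs.
    ColumnRange merge(ColumnRange other) {
      return new ColumnRange(
          Math.min(min, other.min),
          Math.max(max, other.max),
          nullCount + other.nullCount);
    }
  }

  // Folds the ranges observed in a delta write into the indexed ranges.
  // Columns seen only in the delta are added as-is.
  static Map<String, ColumnRange> applyDelta(Map<String, ColumnRange> indexed,
                                             Map<String, ColumnRange> delta) {
    Map<String, ColumnRange> merged = new HashMap<>(indexed);
    delta.forEach((column, range) -> merged.merge(column, range, ColumnRange::merge));
    return merged;
  }
}
```

Without such an update, a reader pruning files by the stale range could skip a file that now contains matching values written by the delta commit.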

Co-authored-by: Manoj Govindassamy <manoj.govindassamy@gmail.com>
Sagar Sumit
2022-03-08 21:09:04 +05:30
committed by GitHub
parent ed26c5265c
commit 575bc63468
24 changed files with 1051 additions and 533 deletions


@@ -97,7 +97,7 @@ public class HoodieWriteableTestTable extends HoodieMetadataTestTable {
return (HoodieWriteableTestTable) super.forCommit(instantTime);
}
-public HoodieWriteableTestTable withInserts(String partition, String fileId, List<HoodieRecord> records, TaskContextSupplier contextSupplier) throws Exception {
+public Path withInserts(String partition, String fileId, List<HoodieRecord> records, TaskContextSupplier contextSupplier) throws Exception {
FileCreateUtils.createPartitionMetaFile(basePath, partition);
String fileName = baseFileName(currentInstantTime, fileId);
@@ -151,7 +151,7 @@ public class HoodieWriteableTestTable extends HoodieMetadataTestTable {
}
}
-return this;
+return baseFilePath;
}
public Map<String, List<HoodieLogFile>> withLogAppends(List<HoodieRecord> records) throws Exception {