Fixes HUDI-38: Reduce memory overhead of WriteStatus

 - For implicit indexes (e.g. BloomIndex), don't buffer up written records
 - By default, only collect 10% of failing records to avoid OOMs
 - Sampling failures this way also improves debuggability, since data errors can now show up in collect()
 - Add unit tests, fix subclasses, and adjust existing tests
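
The two behaviors described above (skipping the buffer of successfully written records, and keeping only a sampled fraction of failures) can be sketched roughly as follows. This is a minimal illustration and not the actual WriteStatus code from this commit: the class name `SampledWriteStatus`, the use of plain `String` keys instead of Hudi record types, and the explicit `seed` parameter are all assumptions made for the example. Only the parameter names `trackSuccessRecords` and `failureFraction` come from the diff below.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical, simplified stand-in for a write-status tracker; not the actual Hudi class.
class SampledWriteStatus {
  private final boolean trackSuccessRecords; // false for implicit indexes: nothing is buffered
  private final double failureFraction;      // e.g. 0.1 to keep roughly 10% of failures
  private final Random random;

  private final List<String> writtenRecords = new ArrayList<>();
  private final List<String> failedRecords = new ArrayList<>();
  private long totalRecords = 0;
  private long totalErrorRecords = 0;

  SampledWriteStatus(boolean trackSuccessRecords, double failureFraction, long seed) {
    this.trackSuccessRecords = trackSuccessRecords;
    this.failureFraction = failureFraction;
    this.random = new Random(seed); // seeded only to make the sketch reproducible
  }

  void markSuccess(String recordKey) {
    // Buffer the record only when the caller needs it back; always keep the count.
    if (trackSuccessRecords) {
      writtenRecords.add(recordKey);
    }
    totalRecords++;
  }

  void markFailure(String recordKey) {
    // Always keep the first failure so at least one error is visible, then sample the rest.
    if (failedRecords.isEmpty() || random.nextDouble() < failureFraction) {
      failedRecords.add(recordKey);
    }
    totalRecords++;
    totalErrorRecords++;
  }

  List<String> getWrittenRecords() { return writtenRecords; }
  List<String> getFailedRecords() { return failedRecords; }
  long getTotalRecords() { return totalRecords; }
  long getTotalErrorRecords() { return totalErrorRecords; }
}
```

In this sketch the counters stay exact even though the buffered lists are bounded, so memory use no longer grows with the number of written records while totals remain available for reporting.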
Vinoth Chandar
2019-03-26 14:31:19 -07:00
committed by vinoth chandar
parent e56c1612e4
commit f1410bfdcd
9 changed files with 112 additions and 23 deletions

@@ -144,6 +144,10 @@ public class TestRawTripPayload implements HoodieRecordPayload<TestRawTripPayloa
    private Map<String, String> mergedMetadataMap = new HashMap<>();

    public MetadataMergeWriteStatus(Boolean trackSuccessRecords, Double failureFraction) {
      super(trackSuccessRecords, failureFraction);
    }

    public static Map<String, String> mergeMetadataForWriteStatuses(List<WriteStatus> writeStatuses) {
      Map<String, String> allWriteStatusMergedMetadataMap = new HashMap<>();
      for (WriteStatus writeStatus : writeStatuses) {