[HUDI-2194] Skip the latest N partitions when choosing partitions to create ClusteringPlan (#3300)

* skip the latest N partitions based on hoodie.clustering.plan.strategy.daybased.skipfromlatest.partitions (default is 0, which skips nothing)

* change config version

* add ut

Co-authored-by: yuezhang <yuezhang@freewheel.tv>
zhangyue19921010
2021-08-10 01:10:15 +08:00
committed by GitHub
parent 41a9986a76
commit b4441abcf7
4 changed files with 83 additions and 0 deletions

@@ -51,8 +51,10 @@ public class SparkRecentDaysClusteringPlanStrategy<T extends HoodieRecordPayload
  protected List<String> filterPartitionPaths(List<String> partitionPaths) {
    int targetPartitionsForClustering = getWriteConfig().getTargetPartitionsForClustering();
    int skipPartitionsFromLatestForClustering = getWriteConfig().getSkipPartitionsFromLatestForClustering();
    return partitionPaths.stream()
        .sorted(Comparator.reverseOrder())
        .skip(Math.max(skipPartitionsFromLatestForClustering, 0))
        .limit(targetPartitionsForClustering > 0 ? targetPartitionsForClustering : partitionPaths.size())
        .collect(Collectors.toList());
  }
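
To illustrate the effect of the added skip step, here is a minimal standalone sketch of the same stream pipeline. It assumes day-based partition paths that sort lexicographically by date. The class name PartitionFilterSketch and the explicit targetPartitions / skipFromLatest parameters are hypothetical stand-ins for the values the strategy reads from the write config; this is not the Hudi class itself.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

public class PartitionFilterSketch {

  // Hypothetical standalone version of the filter in the hunk above.
  // targetPartitions and skipFromLatest are passed in directly here
  // instead of being read from the write config.
  public static List<String> filterPartitionPaths(
      List<String> partitionPaths, int targetPartitions, int skipFromLatest) {
    return partitionPaths.stream()
        // Newest partition first (reverse lexicographic order).
        .sorted(Comparator.reverseOrder())
        // Drop the latest N partitions; a negative value is treated as 0.
        .skip(Math.max(skipFromLatest, 0))
        // Keep at most targetPartitions; a non-positive value keeps everything left.
        .limit(targetPartitions > 0 ? targetPartitions : partitionPaths.size())
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    List<String> partitions = Arrays.asList(
        "2021/08/07", "2021/08/10", "2021/08/08", "2021/08/09", "2021/08/06");

    // Skip the latest partition, then take the next two.
    System.out.println(filterPartitionPaths(partitions, 2, 1));
    // prints [2021/08/09, 2021/08/08]

    // Default skip of 0 and no target limit: all partitions, newest first.
    System.out.println(filterPartitionPaths(partitions, 0, 0));
    // prints [2021/08/10, 2021/08/09, 2021/08/08, 2021/08/07, 2021/08/06]
  }
}
```

Because skip runs before limit, the target count applies to the partitions that remain after the latest ones are dropped, so the two settings compose rather than compete.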