[HUDI-1757] Assigns the buckets by record key for Flink writer (#2757)
Currently we assign the buckets by record partition path, which can cause hotspots if the partition field is a datetime type. This change assigns the buckets by first grouping the records by their key; the assignment is valid only if there is no conflict (no two tasks write to the same bucket). This patch also changes the coordinator execution to be asynchronous.
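The routing idea described above can be illustrated with a minimal sketch. It is not the code from this patch: the class `KeyBucketSketch`, the methods `taskFor` and `ownsBucket`, and the use of `String.hashCode()` as the hash are all hypothetical, chosen only to show why keying by record key spreads load and how a "no two tasks write to the same bucket" rule could be checked.

```java
/** Hypothetical sketch; names and hashing are assumptions, not Hudi's actual classes. */
public class KeyBucketSketch {

    /** Maps a string (record key, partition path, or bucket id) to a task index in [0, parallelism). */
    public static int taskFor(String value, int parallelism) {
        return Math.floorMod(value.hashCode(), parallelism);
    }

    /**
     * Assumed conflict rule: a task may write to a bucket only if the bucket id
     * routes to that same task, so each bucket has exactly one writer.
     */
    public static boolean ownsBucket(String bucketId, int taskId, int parallelism) {
        return taskFor(bucketId, parallelism) == taskId;
    }

    public static void main(String[] args) {
        int parallelism = 4;
        String partitionPath = "2021-04-01"; // datetime partition: same value for all of today's records

        for (int i = 0; i < 8; i++) {
            String recordKey = "id-" + i;
            // Routing by partition path sends every record to the same task (a hotspot);
            // routing by record key lets records of one partition go to different tasks.
            System.out.println(recordKey
                + " by-partition task=" + taskFor(partitionPath, parallelism)
                + " by-key task=" + taskFor(recordKey, parallelism));
        }
    }
}
```

Because the ownership check is a pure function of the bucket id, each bucket resolves to exactly one task under this assumed rule, which is the property the commit message calls a valid assignment.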
@@ -41,7 +41,6 @@ public class HoodieIndexUtils {
   * Fetches Pair of partition path and {@link HoodieBaseFile}s for interested partitions.
   *
   * @param partition   Partition of interest
   * @param context     Instance of {@link HoodieEngineContext} to use
   * @param hoodieTable Instance of {@link HoodieTable} of interest
   * @return the list of {@link HoodieBaseFile}
   */