1
0
Files
hudi/hudi-common
Manoj Govindassamy 383d5edc16 [HUDI-2894][HUDI-2905] Metadata table - avoiding key lookup failures on base files over S3 (#4185)
- Fetching partition files or all partitions from the metadata table is failing
   when run over S3. Metadata table uses HFile format for the base files and the
   record lookup uses HFile.Reader and HFileScanner interfaces to get records by
   partition keys. When the backing storage is S3, this record lookup from HFiles
   is failing with IOException, in turn failing the caller commit/update operations.

 - Metadata table looks up HFile records with positional read enabled so as to
   perform better for random lookups. But this positional read key lookup is
   returning with partial read sizes over S3 leading to HFile scanner throwing
   IOException. This doesn't happen over HDFS. Metadata table though uses the HFile
   for random key lookups, the positional read is not mandatory as we sort the keys
   when doing a lookup for multiple keys.

 - The fix is to disable HFile positional read for all HFile scanner based
   key lookups.
2021-12-03 14:18:10 -05:00
..