- Fetching partition files or all partitions from the metadata table fails
when run over S3. The metadata table uses the HFile format for its base files,
and record lookup uses the HFile.Reader and HFileScanner interfaces to get
records by partition key. When the backing storage is S3, this record lookup
from HFiles fails with an IOException, which in turn fails the calling
commit/update operations.
- The metadata table looks up HFile records with positional read enabled, so
that random lookups perform better. Over S3, however, these positional reads
return fewer bytes than requested, which causes the HFile scanner to throw an
IOException. This doesn't happen over HDFS. Although the metadata table uses
HFiles for random key lookups, positional read is not mandatory, because we
sort the keys when looking up multiple keys.
- The fix is to disable HFile positional read for all HFileScanner-based
key lookups.
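
The partial-read failure described above can be illustrated with a small JDK-only simulation. This is a sketch and not Hudi or HBase code: `ChunkedStore`, `readOnce`, and `readLooping` are hypothetical names, and it assumes that the positional path issues a single read and treats a short result as an error, while the non-positional path keeps reading until the requested size is filled.

```java
import java.io.IOException;
import java.util.Arrays;

public class ShortReadSketch {

    /** Hypothetical store that hands back at most maxChunk bytes per read call. */
    static final class ChunkedStore {
        private final byte[] data;
        private final int maxChunk;

        ChunkedStore(byte[] data, int maxChunk) {
            if (maxChunk <= 0) {
                throw new IllegalArgumentException("maxChunk must be positive");
            }
            this.data = data;
            this.maxChunk = maxChunk;
        }

        /** Reads up to len bytes starting at pos; may return fewer, or -1 at end of data. */
        int read(long pos, byte[] buf, int off, int len) {
            if (pos >= data.length) {
                return -1;
            }
            int n = Math.min(len, Math.min(maxChunk, data.length - (int) pos));
            System.arraycopy(data, (int) pos, buf, off, n);
            return n;
        }
    }

    /** Assumed positional-read behavior: one read call, a short result is an error. */
    static byte[] readOnce(ChunkedStore store, long pos, int size) throws IOException {
        byte[] buf = new byte[size];
        int n = store.read(pos, buf, 0, size);
        if (n < size) {
            throw new IOException("Positional read of " + size
                + " bytes failed at offset " + pos + " (returned " + n + ")");
        }
        return buf;
    }

    /** Assumed non-positional behavior: keep reading until the buffer is full. */
    static byte[] readLooping(ChunkedStore store, long pos, int size) throws IOException {
        byte[] buf = new byte[size];
        int done = 0;
        while (done < size) {
            int n = store.read(pos + done, buf, done, size - done);
            if (n < 0) {
                throw new IOException("Unexpected end of data at offset " + (pos + done));
            }
            done += n;
        }
        return buf;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[10];
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) i;
        }
        // At most 4 bytes per call, standing in for a store that returns partial reads.
        ChunkedStore store = new ChunkedStore(data, 4);

        System.out.println("looping read: " + Arrays.toString(readLooping(store, 2, 6)));
        try {
            readOnce(store, 2, 6);
        } catch (IOException e) {
            System.out.println("single read failed: " + e.getMessage());
        }
    }
}
```

In the simulation, the same 6-byte request succeeds when the reader loops and fails when it relies on a single call, which mirrors the difference seen between HDFS and S3 with positional read enabled.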