[HUDI-1296] Support Metadata Table in Spark Datasource (#4789)
* Bootstrapping initial support for Metadata Table in Spark Datasource

- Consolidated Avro/Row conversion utilities to center around Spark's AvroDeserializer; removed duplication
- Bootstrapped HoodieBaseRelation
- Updated HoodieMergeOnReadRDD to be able to handle Metadata Table
- Modified MOR relations to be able to read different Base File formats (Parquet, HFile)
@@ -80,6 +80,23 @@ public class RawTripTestPayload implements HoodieRecordPayload<RawTripTestPayloa
     this.isDeleted = false;
   }
 
+  /**
+   * @deprecated PLEASE READ THIS CAREFULLY
+   *
+   * Converting properly typed schemas into JSON leads to inevitable information loss, since JSON
+   * encodes only representation of the record (with no schema accompanying it), therefore occasionally
+   * losing nuances of the original data-types provided by the schema (for ex, with 1.23 literal it's
+   * impossible to tell whether original type was Double or Decimal).
+   *
+   * Multiplied by the fact that Spark 2 JSON schema inference has substantial gaps in it (see below),
+   * it's **NOT RECOMMENDED** to use this method. Instead please consider using {@link AvroConversionUtils#createDataframe()}
+   * method accepting list of {@link HoodieRecord} (as produced by the {@link HoodieTestDataGenerator}
+   * to create Spark's {@code Dataframe}s directly.
+   *
+   * REFs
+   * https://medium.com/swlh/notes-about-json-schema-handling-in-spark-sql-be1e7f13839d
+   */
+  @Deprecated
   public static List<String> recordsToStrings(List<HoodieRecord> records) {
     return records.stream().map(RawTripTestPayload::recordToString).filter(Option::isPresent).map(Option::get)
         .collect(Collectors.toList());
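The information loss described in the deprecation note above can be illustrated with a minimal, standalone sketch. It does not use any Hudi or Spark API: the class `JsonTypeLossSketch`, the helper `toJson`, and the field name `fare` are all hypothetical and exist only for this illustration. It assumes a naive JSON rendering in which a number is written as its plain textual form.

```java
import java.math.BigDecimal;

// Standalone illustration only; not part of Hudi. All names here are hypothetical.
public class JsonTypeLossSketch {

  // Renders a single numeric field as a JSON object literal.
  // The value's Java type is not recorded anywhere in the output.
  public static String toJson(String field, Number value) {
    return "{\"" + field + "\": " + value + "}";
  }

  public static void main(String[] args) {
    String fromDouble = toJson("fare", 1.23d);
    String fromDecimal = toJson("fare", new BigDecimal("1.23"));

    System.out.println(fromDouble);
    System.out.println(fromDecimal);

    // Both renderings are the same text, so a reader of the JSON alone
    // cannot tell whether the original type was Double or Decimal.
    System.out.println("indistinguishable: " + fromDouble.equals(fromDecimal));
  }
}
```

Because the two strings are identical, any schema inferred from the JSON has to guess the numeric type, which is the gap the Javadoc warns about.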
@@ -20,7 +20,43 @@
   "type": "record",
   "name": "User",
   "fields": [
-    {"name": "field1", "type": ["null", "string"], "default": null},
-    {"name": "createTime", "type": ["null", "long"], "default": null}
+    {
+      "name": "field1",
+      "type": [
+        "null",
+        "string"
+      ],
+      "default": null
+    },
+    {
+      "name": "createTime",
+      "type": [
+        "null",
+        "long"
+      ],
+      "default": null
+    },
+    {
+      "name": "createTimeString",
+      "type": [
+        "null",
+        "string"
+      ],
+      "default": null
+    },
+    {
+      "name": "createTimeDecimal",
+      "type": [
+        "null",
+        {
+          "name": "decimalFixed",
+          "type": "fixed",
+          "logicalType": "decimal",
+          "precision": 20,
+          "scale": 4,
+          "size": 10
+        }
+      ]
+    }
   ]
 }
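The `createTimeDecimal` field added above uses Avro's `decimal` logical type over a 10-byte `fixed`. Per the Avro specification, such a value is stored as the two's-complement, big-endian unscaled integer, sign-extended to the fixed size. The following is a rough sketch of that encoding using only the JDK; it is not the Avro library's implementation, and the class and method names (`DecimalFixedSketch`, `encode`, `decode`) are hypothetical.

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.util.Arrays;

// Standalone illustration only; mirrors precision=20, scale=4, size=10 from the schema.
public class DecimalFixedSketch {
  static final int PRECISION = 20;
  static final int SCALE = 4;
  static final int SIZE = 10;

  // Encodes a decimal as a SIZE-byte, big-endian, two's-complement unscaled value.
  public static byte[] encode(BigDecimal value) {
    // setScale without a rounding mode throws ArithmeticException if digits would be lost.
    BigDecimal scaled = value.setScale(SCALE);
    if (scaled.precision() > PRECISION) {
      throw new IllegalArgumentException("value exceeds precision " + PRECISION);
    }
    byte[] unscaled = scaled.unscaledValue().toByteArray(); // minimal-length two's complement
    if (unscaled.length > SIZE) {
      throw new IllegalArgumentException("value does not fit in " + SIZE + " bytes");
    }
    byte[] fixed = new byte[SIZE];
    // Sign-extend: pad with 0xFF for negative values, 0x00 otherwise.
    Arrays.fill(fixed, (byte) (scaled.signum() < 0 ? 0xFF : 0x00));
    System.arraycopy(unscaled, 0, fixed, SIZE - unscaled.length, unscaled.length);
    return fixed;
  }

  // Decodes the fixed bytes back into a decimal with the schema's scale.
  public static BigDecimal decode(byte[] fixed) {
    return new BigDecimal(new BigInteger(fixed), SCALE);
  }

  public static void main(String[] args) {
    byte[] bytes = encode(new BigDecimal("1.23"));
    System.out.println(bytes.length + " bytes, decoded: " + decode(bytes));
  }
}
```

Unlike the JSON rendering, the decoded value keeps its scale (1.23 comes back as 1.2300), since the scale is carried by the schema and not by the data.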