JIRA link: https://issues.apache.org/jira/browse/HUDI-101

Issue link: https://github.com/apache/incubator-hudi/issues/516#issue-386048519

When using spark-shell to save data with hoodie, launched like this:

```
./spark-shell --master yarn --jars /home/hdfs/software/spark/hoodie/hoodie-spark-bundle-0.4.8-SNAPSHOT.jar --conf spark.sql.hive.convertMetastoreParquet=false --packages com.databricks:spark-avro_2.11:4.0.0
```

and then writing with:

```
inputDF.write.format("com.uber.hoodie")
  .option("hoodie.insert.shuffle.parallelism", "1") // any hoodie client config can be passed like this
  .option("hoodie.upsert.shuffle.parallelism", "1") // full list in HoodieWriteConfig & its package
  .option(DataSourceWriteOptions.STORAGE_TYPE_OPT_KEY, HoodieTableType.COPY_ON_WRITE.name())
  .option(DataSourceWriteOptions.OPERATION_OPT_KEY, DataSourceWriteOptions.UPSERT_OPERATION_OPT_VAL) // insert
  .option(DataSourceWriteOptions.RECORDKEY_FIELD_OPT_KEY, "_row_key")
  .option(DataSourceWriteOptions.PARTITIONPATH_FIELD_OPT_KEY, "partition")
  .option(DataSourceWriteOptions.PRECOMBINE_FIELD_OPT_KEY, "extend_deal_date")
  .option(HoodieWriteConfig.TABLE_NAME, "c_upload_code")
  .mode(SaveMode.Overwrite)
  .save("/tmp/test/hoodie")
```

it also reports the error `Invalid signature file digest for Manifest main attributes`. We need to scan all affected dependencies.
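
A rough sketch of how such a scan could be done from a spark-shell session follows. It assumes the error comes from signature files (`META-INF/*.SF`, `*.DSA`, `*.RSA`, `*.EC`) that were carried over from signed dependencies into a bundled jar; that is the usual cause of this message, but it has not been confirmed for this report. `isSignatureEntry` and `signatureEntries` are hypothetical helper names, and the jar path is simply the bundle from the command above.

```scala
import java.io.File
import java.util.zip.ZipFile
import scala.collection.JavaConverters._

// True for signature-related files placed directly under META-INF/.
// (Assumption: these leftover files are what triggers the digest error.)
def isSignatureEntry(name: String): Boolean = {
  val upper = name.toUpperCase
  val prefix = "META-INF/"
  upper.startsWith(prefix) &&
    !upper.substring(prefix.length).contains("/") &&
    Seq(".SF", ".DSA", ".RSA", ".EC").exists(ext => upper.endsWith(ext))
}

// Lists the signature-related entries found inside one jar.
def signatureEntries(jar: File): Seq[String] = {
  val zip = new ZipFile(jar)
  try zip.entries().asScala.map(_.getName).filter(isSignatureEntry).toList
  finally zip.close()
}

// Jars to inspect; extend this list with any other jar on the classpath.
val candidates = Seq(
  new File("/home/hdfs/software/spark/hoodie/hoodie-spark-bundle-0.4.8-SNAPSHOT.jar")
)

// Print every jar that still carries signature files; missing paths are skipped.
candidates.filter(_.isFile).foreach { jar =>
  val hits = signatureEntries(jar)
  if (hits.nonEmpty) println(s"${jar.getName}: ${hits.mkString(", ")}")
}
```

Any jar printed by this loop would be a candidate for excluding those `META-INF` entries when the bundle is built.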