Commit Graph

  • 3b2da9f138 [HUDI-2631] In CompactFunction, set up the write schema each time with the latest schema (#4000) yuzhaojing 2022-03-02 11:18:17 +08:00
  • 3cfb52c413 [MINOR] fix get builtin function issue from Hudi catalog (#4917) stayrascal 2022-03-02 11:16:19 +08:00
  • 3fdc9332e5 [HUDI-3516] Implement record iterator for HoodieDataBlock (#4909) Bo Cui 2022-03-02 10:19:36 +08:00
  • a81a6326d5 [HUDI-3441] Add support for "marker delete" in hudi-cli (#4922) ForwardXu 2022-03-01 16:03:53 +08:00
  • f7088a957c [HUDI-3497] Adding Datatable validator tool (#4902) Sivabalan Narayanan 2022-02-28 22:46:32 -05:00
  • 257052a94d [HUDI-3465] Add validation of column stats and bloom filters in HoodieMetadataTableValidator (#4878) Y Ethan Guo 2022-02-28 18:49:30 -08:00
  • 44b8ab6048 [HUDI-3418] Save timeout option for remote RemoteFileSystemView (#4809) yuzhaojing 2022-03-01 04:16:40 +08:00
  • 18dc89cf79 [HUDI-3450] Avoid passing empty string spark master to hudi cli (#4844) wenningd 2022-02-28 08:37:24 -08:00
  • 05e395ae5f [HUDI-3341] Fix log file reader for S3 with hadoop-aws 2.7.x (#4897) Y Ethan Guo 2022-02-28 08:14:35 -08:00
  • 8f1e4f5b3e [HUDI-3528] Fix String convert issue and overwrite putAll method in TypedProperties.java (#4920) stayrascal 2022-02-28 23:45:47 +08:00
  • 4a59876c8b [HUDI-2917] rollback insert data appended to log file when using Hbase Index (#4840) Sivabalan Narayanan 2022-02-28 08:13:17 -05:00
  • 193215201c [MINOR] Change MINI_BATCH_SIZE to 2048 (#4862) Bo Cui 2022-02-28 10:45:28 +08:00
  • d5444ff7ff [HUDI-3018] Adding validation to dataframe scheme to ensure reserved field does not have diff data type (#4852) Sivabalan Narayanan 2022-02-27 11:59:23 -05:00
  • 2f99e8458a [HUDI-3521] Fixing kakfa key and value serializer value type from class to string (#4919) Sivabalan Narayanan 2022-02-27 11:13:13 -05:00
  • c77b2591d0 [HUDI-2439] Remove SparkBoundedInMemoryExecutor (#4860) Raymond Xu 2022-02-26 05:02:12 -08:00
  • 1379300b5b [HUDI-3483] Adding insert override nodes to integ test suite and few clean ups (#4895) Sivabalan Narayanan 2022-02-26 08:00:15 -05:00
  • 6a5cfb45b9 [MINOR] Fix table type in input format test (#4912) Sagar Sumit 2022-02-26 00:21:53 +05:30
  • 92cdc5987a [HUDI-3515] Making rdd unpersist optional at the end of writes (#4898) 苏承祥 2022-02-26 00:30:10 +08:00
  • b50f4b491c [HUDI-3042] Refactor clustering executors (#4847) Raymond Xu 2022-02-25 05:39:43 -08:00
  • 742810070b [HUDI-3421]Pending clustering may break AbstractTableFileSystemView#getxxBaseFile() (#4810) YueZhang 2022-02-25 19:16:27 +08:00
  • a4ee7463ae [HUDI-3474] Add more document to Pipelines for the usage of this tool to build a write pipeline (#4906) Danny Chan 2022-02-25 19:08:51 +08:00
  • 45d1216e91 [HUDI-3401] fix NPE caused by incorrect beforeKeyGenClassName validation (#4774) todd5167 2022-02-25 12:31:29 +08:00
  • 3694485609 [HUDI-3429] Support clustering scheduleAndExecute for hudi-cli and add clustering-cli Tests (#4817) YueZhang 2022-02-25 12:28:38 +08:00
  • aa1810d737 [HUDI-3493] Not table to get execution plan (#4894) ForwardXu 2022-02-25 09:04:44 +08:00
  • 85e8a5c4de [HUDI-1296] Support Metadata Table in Spark Datasource (#4789) Alexey Kudinkin 2022-02-24 13:23:13 -08:00
  • 521338b4d9 [HUDI-3161] Add Call Produce Command for Spark SQL (#4535) ForwardXu 2022-02-24 23:45:37 +08:00
  • 943b99775b [HUDI-3488] The flink small file list should exclude file slices with pending compaction (#4893) yanenze 2022-02-24 14:45:03 +08:00
  • 62605be413 [HUDI-3480][HUDI-3481] Enchancements to integ test suite (#4884) Sivabalan Narayanan 2022-02-23 15:56:35 -05:00
  • 2a93b8efb2 [HUDI-3489] Unify config to avoid duplicate code (#4883) leesf 2022-02-23 21:14:30 +08:00
  • 4e8accc179 [HUDI-3486] Fix wrong field order for constructing HoodieMetadataColumnStats (#4875) Y Ethan Guo 2022-02-22 20:57:02 -08:00
  • dabae80423 [HUDI-3420] Remove duplicates type in HoodieClusteringGroup.avsc (#4808) yuzhaojing 2022-02-23 10:49:47 +08:00
  • 01cbddef78 Add hive-standalone-metastore dependency to hudi-flink-bundle module (#4870) 从大数据到人工智能 2022-02-23 09:16:21 +08:00
  • 9678c3fbcf [MINOR] Fixing checkpoint management in S3IncrSource (#4871) Sivabalan Narayanan 2022-02-22 09:15:16 -05:00
  • b87e95d621 [HUDI-3476] Remove the shade pattern for parquet for flink bundle jar (#4869) Danny Chan 2022-02-22 19:21:57 +08:00
  • 4affdd0c8f [HUDI-3461] The archived timeline for flink streaming reader should not be reused (#4861) Danny Chan 2022-02-22 15:54:29 +08:00
  • 4d1f74ebea [HUDI-3464] Fix wrong exception thrown from HiveSchemaProvider (#4865) wangxianghu 2022-02-22 10:20:20 +04:00
  • 14dbbdf4c7 [HUDI-2189] Adding delete partitions support to DeltaStreamer (#4787) Sivabalan Narayanan 2022-02-22 00:01:30 -05:00
  • 7e1ea06eb9 [MINOR] Fix typos and improve docs in HoodieMetadataConfig (#4867) Y Ethan Guo 2022-02-21 19:36:20 -08:00
  • 0dee8edc97 [HUDI-2925] Fix duplicate cleaning of same files when unfinished clean operations are present using a config. (#4212) Prashant Wason 2022-02-21 18:53:03 -08:00
  • 0c950181aa [HUDI-3423] upgrade spark to 3.2.1 (#4815) Yann Byron 2022-02-22 08:52:21 +08:00
  • 801fdab55c [HUDI-3042] Abstract Spark update Strategy to make code more clean and remove duplicates (#4845) RexAn 2022-02-21 22:53:09 +08:00
  • bf16bc122a [HUDI-349]: Added new cleaning policy based on number of hours (#3646) Pratyaksh Sharma 2022-02-21 19:34:42 +05:30
  • d36fe24c9e [HUDI-3455] Fixing checkpoint management in hoodie incr source (#4850) Sivabalan Narayanan 2022-02-21 08:19:57 -05:00
  • 17cb5cb433 [HUDI-3432] Fixing restore with metadata enabled (#4849) Sivabalan Narayanan 2022-02-21 07:55:30 -05:00
  • 76b6ad6491 [HUDI-2732][RFC-38] Spark Datasource V2 Integration (#3964) leesf 2022-02-21 20:14:07 +08:00
  • 359fbfde79 [HUDI-2648] Retry FileSystem action instead of failed directly. (#3887) YueZhang 2022-02-21 04:31:31 +08:00
  • 0938f55a2b [HUDI-3458] Fix BulkInsertPartitioner generic type (#4854) Raymond Xu 2022-02-20 10:51:58 -08:00
  • 66ac1446dd [MINOR] Moving spark scheduling configs out of DataSourceOptions (#4843) Sivabalan Narayanan 2022-02-20 13:49:18 -05:00
  • 83279971a1 [HUDI-3446] Supports batch reader in BootstrapOperator#loadRecords (#4837) Bo Cui 2022-02-19 21:21:48 +08:00
  • f15125c0cd [HUDI-3389] fix ColumnarArrayData ClassCastException issue (#4842) stayrascal 2022-02-19 10:56:41 +08:00
  • 5009138d04 [HUDI-3438] Avoid getSmallFiles if hoodie.parquet.small.file.limit is 0 (#4823) RexAn 2022-02-18 21:57:04 +08:00
  • fba5822ee3 [HUDI-3430] Fix Deltastreamer to properly shut down the services upon failure (#4824) Y Ethan Guo 2022-02-18 05:44:56 -08:00
  • de8161ae96 HoodieSortedMergeHandle#close write data disorder (#4841) luokey 2022-02-18 17:31:38 +08:00
  • ed106f671e [HUDI-2809] Introduce a checksum mechanism for validating hoodie.properties (#4712) Sagar Sumit 2022-02-18 10:17:06 +05:30
  • 2844a77b43 [HUDI-3439] Remove the hive shade pattern for flink bundle jar (#4833) Danny Chan 2022-02-17 22:42:39 +08:00
  • 433c2573ef [HUDI-3442]Duplicate code calls for 'FlinkOptions.flatOptions' (#4832) zhangxiang17 2022-02-17 11:04:09 +08:00
  • ba0afe1426 [HUDI-3426] Sync datasource clustering config (#4828) Sagar Sumit 2022-02-17 05:32:49 +05:30
  • aaddaf524a [HUDI-3280] Cleaning up Hive-related hierarchies after refactoring (#4743) Alexey Kudinkin 2022-02-16 15:36:37 -08:00
  • 3363c66468 [HUDI-3394] Check isWriteLockedByCurrentThread before unlock for InProcessLockProvider (#4819) YueZhang 2022-02-16 14:41:25 +08:00
  • 9a05940a74 [HUDI-3366] Remove hardcoded logic of disabling metadata table in tests (#4792) Y Ethan Guo 2022-02-15 13:41:47 -08:00
  • 538ec44fa8 [HUDI-2931] Add config to disable table services (#4777) Raymond Xu 2022-02-15 06:49:53 -08:00
  • fe02c64fea fix build & ci (#4822) Yann Byron 2022-02-15 19:40:40 +08:00
  • cb6ca7f0d1 [HUDI-3204] fix problem that spark on TimestampKeyGenerator has no re… (#4714) Yann Byron 2022-02-15 12:38:38 +08:00
  • 27bd7b538e [HUDI-1576] Make archiving an async service (#4795) Raymond Xu 2022-02-14 18:15:06 -08:00
  • 3b401d839c [HUDI-3200] deprecate hoodie.file.index.enable and unify to use BaseFileOnlyViewRelation to handle (#4798) Yann Byron 2022-02-15 09:38:01 +08:00
  • 0a97a9893a [HUDI-3398] Fix TableSchemaResolver for all file formats and metadata table (#4782) YueZhang 2022-02-15 08:02:47 +08:00
  • e639d99387 [HUDI-1657] Fix the build on aarch64, Fedora 33 (#4617) Yuqi Gu 2022-02-15 07:10:18 +08:00
  • bcfd8efe66 [MINOR] Prevent async service from starting twice (#4801) Raymond Xu 2022-02-14 11:06:31 -08:00
  • 0db1e978c6 [HUDI-3254] Introduce HoodieCatalog to manage tables for Spark Datasource V2 (#4611) leesf 2022-02-14 22:26:58 +08:00
  • 5ca4480a38 [HUDI-3417] Switch AbstractTableFileSystemView#filterBaseFileAfterPendingCompaction log level to debug (#4805) yuzhaojing 2022-02-14 16:18:34 +08:00
  • 94806d5cf7 [HUDI-3272] If mode==ignore && tableExists, do not execute write logic and sync hive (#4632) 董可伦 2022-02-14 11:52:00 +08:00
  • 93ee09fee8 [HUDI-3412] TypedProperties no need to create new set when check key exist or not (#4791) RexAn 2022-02-14 11:33:29 +08:00
  • 76e2faa28d [HUDI-3370] The files recorded in the commit may not match the actual ones for MOR Compaction (#4753) YueZhang 2022-02-14 11:12:52 +08:00
  • 55777fec05 [HUDI-2413] fix Sql source's checkpoint issue (#3648) 冯健 2022-02-14 10:37:48 +08:00
  • 6aba00e84f [MINOR] Fix typos in Spark client related classes (#4781) Y Ethan Guo 2022-02-13 06:41:58 -08:00
  • ce9762d588 [MINOR] unused import (#4799) wangxianghu 2022-02-12 13:11:37 +04:00
  • 9518f78610 [HUDI-3413]fix jackson parse error when empty message from JsonKafkaSource Using HoodieDeltaStreamer (#4794) zhangxiang17 2022-02-12 15:37:29 +08:00
  • 89ed6f062e [HUDI-3362] Fix restore to rollback pending clustering operations followed by other rolling back other commits (#4772) satishkotha 2022-02-11 11:12:45 -08:00
  • b431246710 [HUDI-3338] Custom relation instead of HadoopFsRelation (#4709) Yann Byron 2022-02-12 02:48:44 +08:00
  • 10474e0962 [HUDI-3402] Set TIMESTAMP_MICROS as the default value for hoodie.parquet.outputtimestamptype (#4749) Yann Byron 2022-02-12 01:23:55 +08:00
  • ba4e732ba7 [HUDI-2987] Update all deprecated calls to new apis in HoodieRecordPayload (#4681) Sivabalan Narayanan 2022-02-10 19:19:33 -05:00
  • 2fe7a3a41f [HUDI-2610] pass the spark version when sync the table created by spark (#4758) Yann Byron 2022-02-10 23:35:28 +08:00
  • 1c778590d1 [HUDI-3395] Allow pass rollbackUsingMarkers to Hudi CLI rollback command (#4557) wenningd 2022-02-10 06:41:22 -08:00
  • d971974063 [HUDI-3333] fix that getNestedFieldVal breaks with Spark 3.2 (#4783) Yann Byron 2022-02-10 22:12:16 +08:00
  • e7ec3a82dc [HUDI-2432] Adding restore.requested instant and restore plan for restore action (#4605) Sivabalan Narayanan 2022-02-10 08:06:23 -05:00
  • 0ababcfaa7 [HUDI-1847] Adding inline scheduling support for spark datasource path for compaction and clustering (#4420) Sivabalan Narayanan 2022-02-10 08:04:55 -05:00
  • b3b44236fe [HUDI-3389] Bump flink version to 1.14.3 (#4776) Danny Chan 2022-02-10 11:32:01 +08:00
  • 464027ec37 [HUDI-3239] Convert BaseHoodieTableFileIndex to Java (#4669) Alexey Kudinkin 2022-02-09 15:42:08 -08:00
  • 973087f385 [HUDI-3276] Rebased Parquet-based FileInputFormat impls to inherit from MapredParquetInputFormat (#4667) Alexey Kudinkin 2022-02-08 12:21:45 -08:00
  • 60831d6906 [HUDI-3361] Fixing missing begin checkpoint in HoodieIncremental pull (#4755) Sivabalan Narayanan 2022-02-08 12:03:07 -05:00
  • 6a32cfe020 [HUDI-3091] Making SIMPLE index as the default index type (#4659) Sivabalan Narayanan 2022-02-08 04:32:18 -05:00
  • ab73047958 Adding support for custom scheduler configs with streaming sink (#4762) Sivabalan Narayanan 2022-02-08 04:14:10 -05:00
  • 1636876e8a [HUDI-3320] Hoodie metadata table validator (#4721) YueZhang 2022-02-08 16:29:44 +08:00
  • 0ab1a8ec80 [HUDI-3312] Fixing spark yaml and adding hive validation to integ test suite (#4731) Sivabalan Narayanan 2022-02-08 00:40:36 -05:00
  • 8ab6f17149 [HUDI-3373] Add zero value metrics for empty data source and PROMETHEUS_PUSHGATEWAY reporter (#4760) Vinish Reddy 2022-02-08 01:47:46 +05:30
  • 3bd8fc1c3e [HUDI-3058] Simplify Precommit file system view (#4570) satishkotha 2022-02-07 12:16:50 -08:00
  • 3f263b82ce [HUDI-3206] Unify Hive's MOR implementations to avoid duplication (#4559) Alexey Kudinkin 2022-02-07 11:06:28 -08:00
  • 773b317983 [HUDI-2941] Show _hoodie_operation in spark sql results (#4649) ForwardXu 2022-02-07 22:28:13 +08:00
  • 24f738fe68 [HUDI-3360] Adding retries to deltastreamer for source errors (#4744) Sivabalan Narayanan 2022-02-07 08:10:06 -05:00
  • 538db185ca [HUDI-2491] Expose HMS mode metastore uri config option for spark writer (#3962) ehui 2022-02-07 20:43:51 +08:00