Commit Graph

  • 672974c412 [HUDI-3823] Fix hudi-hive-sync-bundle to include HBase dependencies and shading (#5257) Y Ethan Guo 2022-04-07 17:30:33 -07:00
  • ef06e4a526 [HUDI-3810] Fixing lazy read for metadata log record readers (#5241) Sivabalan Narayanan 2022-04-07 15:40:51 -07:00
  • cd2c346df6 [HUDI-3637] Exclude uncommitted log files from metadata table validation (#5234) Y Ethan Guo 2022-04-07 13:03:03 -07:00
  • b3c834a242 [HUDI-3571] Spark datasource continuous ingestion tool (#5156) Sivabalan Narayanan 2022-04-07 11:13:46 -07:00
  • 6a8396420c [HUDI-3643] Fix hive count exception when the table is empty and the path depth is less than 3 (#5051) 董可伦 2022-04-07 19:21:03 +08:00
  • 9d744bb35c [HUDI-3805] Delete existing corrupted requested rollback plan during rollback (#5245) Y Ethan Guo 2022-04-07 03:02:34 -07:00
  • 531381faff [HUDI-3096] fixed the bug that the cow table(contains decimalType) write by flink cannot be read by spark. (#4421) xiarixiaoyao 2022-04-07 17:21:25 +08:00
  • e33149be9a [HUDI-3808] Flink bulk_insert timestamp(3) can not be read by Spark (#5236) Danny Chan 2022-04-07 15:17:39 +08:00
  • d43b4cd95e [HUDI-3739] Fix handling of the isNotNull predicate in Data Skipping (#5224) Alexey Kudinkin 2022-04-06 12:17:36 -07:00
  • b2f09a1fee [HUDI-3340] Fix deploy_staging_jars command (#5243) Raymond Xu 2022-04-06 12:14:23 -07:00
  • 939b3d1b07 [HUDI-3726] Switching from non-partitioned to partitioned key gen does not throw any exception (#5205) rkkalluri 2022-04-06 12:35:32 -05:00
  • ca273274b0 [HUDI-3340] Fix deploy_staging_jars for different profiles (#5240) Raymond Xu 2022-04-06 09:42:11 -07:00
  • 9e87d164b3 [HUDI-3760] Adding capability to fetch Metadata Records by prefix (#5208) Alexey Kudinkin 2022-04-06 09:11:08 -07:00
  • 7612549bcc [MINOR] Fixing build failure when using flink-1.13 (#5214) BruceLin 2022-04-06 16:07:20 +08:00
  • 8683fb1d49 [HUDI-3800] Fixed preserve commit metadata for compaction for untouched records (#5232) Sivabalan Narayanan 2022-04-06 00:56:53 -07:00
  • e96f08f355 Moving to 0.12.0-SNAPSHOT on master branch. Raymond Xu 2022-04-06 15:24:10 +08:00
  • 8baeb816d5 [HUDI-3723] Fixed stack overflows in Record Iterators (#5235) Alexey Kudinkin 2022-04-05 20:12:13 -07:00
  • 898be6174a [HUDI-3782] Fixing table config when any of the index is disabled (#5222) Sagar Sumit 2022-04-06 08:36:52 +05:30
  • 92ca426ab7 [HUDI-2319] dbt example models to demonstrate hudi dbt integration (#5220) Vinoth Govindarajan 2022-04-05 08:58:13 -07:00
  • 3195f51562 [HUDI-3748] write and select hudi table when enable hoodie.datasource.write.drop.partition.columns (#5201) Yann Byron 2022-04-05 16:31:41 +08:00
  • 325b3d610a [HUDI-3795] Fix hudi-examples checkstyle and maven enforcer error (#5221) ForwardXu 2022-04-05 16:10:11 +08:00
  • 3449e86989 [HUDI-3780] improve drop partitions (#5178) ForwardXu 2022-04-05 11:52:33 +08:00
  • b28f0d6ceb [HUDI-3290] Different file formats for the partition metadata file. (#5179) Prashant Wason 2022-04-04 08:08:20 -07:00
  • 8add740d22 [HUDI-3534] [RFC-34] Added the implementation details for the BigQuery integration (#4503) Vinoth Govindarajan 2022-04-03 03:53:25 -07:00
  • c34eb07598 [MINOR] Reuse deleteMetadataTable for disabling metadata table (#5217) Y Ethan Guo 2022-04-03 03:42:14 -07:00
  • 84064a9b08 [HUDI-3772] Fixing auto adjustment of lock configs for deltastreamer (#5207) Sivabalan Narayanan 2022-04-02 23:44:10 -07:00
  • cc3737be50 [HUDI-3664] Fixing Column Stats Index composition (#5181) Alexey Kudinkin 2022-04-02 17:15:52 -07:00
  • 74eb09be9b [HUDI-3776] Fix BloomIndex incorrectly using ColStats to lookup records locations (#5213) Sagar Sumit 2022-04-03 03:52:57 +05:30
  • 20964df770 [HUDI-3357] MVP implementation of BigQuerySyncTool (#5125) Vinoth Govindarajan 2022-04-02 13:18:06 -07:00
  • c19f505b5a [HUDI-3784] Improve docs and logs of HoodieMetadataTableValidator (#5216) Y Ethan Guo 2022-04-02 13:16:17 -07:00
  • eef3f9c74a [HUDI-3771] flink supports sync table information to aws glue (#5202) todd5167 2022-04-02 21:16:10 +08:00
  • 020786a5f9 [HUDI-3451] Delete metadata table when the write client disables MDT (#5186) YueZhang 2022-04-02 19:01:06 +08:00
  • b1e7e1f14e [HUDI-3708] Fix failure with HoodieMetadataRecord due to schema compatibility check (#5204) Y Ethan Guo 2022-04-01 20:17:02 -07:00
  • fb45fc9cb9 [HUDI-3773] Fix parallelism used for metadata table bloom filter index (#5209) Y Ethan Guo 2022-04-01 20:14:07 -07:00
  • 444ff496a4 [RFC-33] [HUDI-2429][Stacked on HUDI-2560] Support full Schema evolution for Spark (#4910) xiarixiaoyao 2022-04-02 04:20:24 +08:00
  • 9275b8fc7e [HUDI-3468][RFC-49] Support sync with DataHub (#5022) Raymond Xu 2022-04-01 12:27:01 -07:00
  • dfdd2de99c [HUDI-3225] [RFC-45] for async metadata indexing (#4640) Sagar Sumit 2022-04-02 00:19:23 +05:30
  • 7dfb168003 [HUDI-3763] Fixing hadoop conf class loading for inline reading (#5194) Sivabalan Narayanan 2022-04-01 08:27:40 -07:00
  • 23b31225df [HUDI-3769] Optimize the logs of HoodieMergeHandle and BufferedConnectWriter (#5200) 董可伦 2022-04-01 21:17:49 +08:00
  • 6df14f15a3 [HUDI-2752] The MOR DELETE block breaks the event time sequence of CDC (#4880) Danny Chan 2022-04-01 20:46:51 +08:00
  • 98b4e9796e [HUDI-3406] Rollback incorrectly relying on FS listing instead of Com… (#4957) ForwardXu 2022-04-01 10:01:41 +08:00
  • a048e940fd [HUDI-3743] Support DELETE_PARTITION for metadata table (#5169) Sagar Sumit 2022-04-01 06:59:17 +05:30
  • 28dafa774e [HUDI-2488][HUDI-3175] Implement async metadata indexing (#4693) Sagar Sumit 2022-04-01 01:33:12 +05:30
  • 1da196c1e8 [HUDI-2777] Improve HoodieSparkSqlWriter write performance (#5187) liuhe0702 2022-04-01 03:48:47 +08:00
  • 51a701cef1 [HUDI-3020] Utility to create manifest file (#5153) codejoyan 2022-03-31 19:52:03 +05:30
  • 7889c7852f [HUDI-3729][SPARK] fixed the per regression by enable vectorizeReader for parquet file (#5168) xiarixiaoyao 2022-03-31 20:09:26 +08:00
  • 73a21092f8 [HUDI-3732] Fixing rollback validation (#5157) Sivabalan Narayanan 2022-03-31 04:55:24 -07:00
  • 80011df995 [HUDI-3135] Make delete partitions lazy to be executed by the cleaner (#4489) ForwardXu 2022-03-31 15:35:39 +08:00
  • 3cdb590e15 [HUDI-3733] Adding HoodieFailedWritesCleaningPolicy for restore with hudi-cli (#5158) Sivabalan Narayanan 2022-03-31 00:30:49 -07:00
  • ce45f7f129 [HUDI-3692] MetadataFileSystemView includes compaction in timeline (#5110) Yuwei XIAO 2022-03-31 14:24:59 +08:00
  • 4569734d60 [HUDI-3713] Guarding archival for multi-writer (#5138) Sivabalan Narayanan 2022-03-30 22:44:31 -07:00
  • f6ff95f97c [MINOR][DOCS] Update hudi-utilities-slim-bundle docs (#5184) Y Ethan Guo 2022-03-30 21:48:54 -07:00
  • 2dbb273d26 [HUDI-3721] Delete MDT if necessary when trigger rollback to savepoint (#5173) YueZhang 2022-03-31 11:26:37 +08:00
  • 2c4554fada [HUDI-3750] Fix NPE when build HoodieFileIndex (#5134) KnightChess 2022-03-31 10:19:05 +08:00
  • d80c80699f [MINOR] Fixing flakiness in TestHoodieSparkMergeOnReadTableRollback.testRollbackWithDeltaAndCompactionCommit (#5183) Sivabalan Narayanan 2022-03-30 19:07:22 -07:00
  • 4fb1a590b1 [HUDI-3700] Add hudi-utilities-slim-bundle excluding hudi-spark-datasource modules (#5176) Y Ethan Guo 2022-03-30 18:08:35 -07:00
  • 9830005e9b [HUDI-3681] Provision additional hudi-spark-bundle with different versions (#5171) Y Ethan Guo 2022-03-30 17:35:56 -07:00
  • 2d73c8ae86 [HUDI-3355] Issue with out of order commits in the timeline when ingestion writers using SparkAllowUpdateStrategy (#4962) xiarixiaoyao 2022-03-31 06:54:25 +08:00
  • 9ff6a48f60 [HUDI-3736] Fix null pointer when key not specified (#5167) Nicolas Paris 2022-03-31 00:11:26 +02:00
  • 31d4a16deb [HUDI-3536] Add hudi-datahub-sync implementation (#5155) Raymond Xu 2022-03-30 14:38:02 -07:00
  • 17d11f4839 [MINOR] Repeated execution of update status (#5089) Bo Cui 2022-03-31 05:30:06 +08:00
  • 2b60641d17 [HUDI-3635] Fix HoodieMetadataTableValidator around comparison of partition path listing (#5100) YueZhang 2022-03-31 05:23:37 +08:00
  • eae8488536 [HUDI-3647] HoodieMetadataTableValidator: check MDT was initialized at first (#5152) YueZhang 2022-03-31 05:18:08 +08:00
  • 8b796e9686 [HUDI-3653] Cleaning up bespoke Column Stats Index implementation (#5062) Alexey Kudinkin 2022-03-30 10:01:43 -07:00
  • 04478a45d9 [MINOR] Fix dates as per UTC in TestDataSkippingUtils (#5166) Sagar Sumit 2022-03-30 20:03:14 +05:30
  • b9fbada2f2 [minor] Follow 3178, fix the flink metadata table compaction (#5175) Danny Chan 2022-03-30 20:45:29 +08:00
  • 7fa363923c [HUDI-3745] Support for spark datasource options in S3EventsHoodieIncrSource (#5170) harshal 2022-03-30 11:04:49 +05:30
  • 4fed8dd319 [HUDI-3485] Adding scheduler pool configs for async clustering (#5043) Sivabalan Narayanan 2022-03-29 18:27:45 -07:00
  • 5c1b482a1b [HUDI-3741] Fix flink bucket index bulk insert generates too many small files (#5164) Danny Chan 2022-03-30 08:18:36 +08:00
  • 941c254c33 [HUDI-2520] Fix CTAS statment issue when sync to hive (#5145) ForwardXu 2022-03-30 03:25:31 +08:00
  • e5a2baeed0 [HUDI-3549] Removing dependency on "spark-avro" (#4955) Alexey Kudinkin 2022-03-29 11:44:47 -07:00
  • 0802510ca9 [HUDI-2520] Fix drop partition issue when sync to hive (#5147) ForwardXu 2022-03-30 02:28:19 +08:00
  • fcb003ec76 [HUDI-3731] Fixing Column Stats Index record Merging sequence missing columnName (#5159) Alexey Kudinkin 2022-03-29 08:39:56 -07:00
  • 1b2fb71afc [MINOR] Move Experiemental to javadoc (#5161) Raymond Xu 2022-03-28 21:07:59 -07:00
  • 7c7ecb11d5 [HUDI-3736] Fix default dynamodblock url default value (#4967) Nicolas Paris 2022-03-29 05:31:46 +02:00
  • 8f8a8158e2 [HUDI-2520] Fix drop table issue when sync to Hive (#5143) leesf 2022-03-29 10:34:12 +08:00
  • 3bf9c5ffe8 [HUDI-3728] Set the sort operator parallelism for flink bucket bulk insert (#5154) Danny Chan 2022-03-29 09:52:35 +08:00
  • 72e0b52b18 [HUDI-3722] Fix truncate hudi table's error (#5140) ForwardXu 2022-03-29 09:44:18 +08:00
  • d074089c62 [HUDI-2566] Adding multi-writer test support to integ test (#5065) Sivabalan Narayanan 2022-03-28 14:05:00 -07:00
  • 6ccbae4d2a [HUDI-2757] Implement Hudi AWS Glue sync (#5076) Raymond Xu 2022-03-28 11:54:59 -07:00
  • 4ed84b216d [HUDI-3720] Fix the logic of reattempting pending rollback (#5148) Y Ethan Guo 2022-03-28 11:54:31 -07:00
  • 2e2d08cb72 [HUDI-3539] Flink bucket index bucketID bootstrap optimization. (#5093) Shawy Geng 2022-03-28 19:50:36 +08:00
  • 1d0f4ccfe0 [HUDI-3538] Support Compaction Command Based on Call Procedure Command for Spark SQL (#4945) huberylee 2022-03-28 14:11:35 +08:00
  • d31cde284c [MINOR] Fix call command parser use spark3.2 (#5144) ForwardXu 2022-03-28 11:13:44 +08:00
  • f2a93ead3b [HUDI-3724] Fixing closure of ParquetReader (#5141) Sivabalan Narayanan 2022-03-27 18:36:15 -07:00
  • 9da2dd416e [HUDI-3719] High performance costs of AvroSerizlizer in DataSource wr… (#5137) xiarixiaoyao 2022-03-28 02:01:43 +08:00
  • 85c4a6cfc1 [MINOR] Relaxing cleaner and archival configs (#5142) Sivabalan Narayanan 2022-03-27 09:26:24 -07:00
  • 484b3407e0 [HUDI-3604] Adjust the order of timeline changes in rollbacks (#5114) Y Ethan Guo 2022-03-26 22:37:44 -07:00
  • 4d940bbf8a [HUDI-3716] OOM occurred when use bulk_insert cow table with flink BUCKET index (#5135) Danny Chan 2022-03-27 09:13:58 +08:00
  • 189d5297b8 [HUDI-3709] Fixing ParquetWriter impls not respecting Parquet Max File Size limit (#5129) Alexey Kudinkin 2022-03-26 14:51:36 -07:00
  • 57b4f39c31 [HUDI-3612] Clustering strategy should create new TypedProperties when modifying it (#5027) RexAn 2022-03-26 18:46:03 +08:00
  • 0c09a973fb [HUDI-3435] Do not throw exception when instant to rollback does not exist in metadata table active timeline (#4821) Danny Chan 2022-03-26 11:42:54 +08:00
  • 51034fecf1 [HUDI-3396] Refactoring MergeOnReadRDD to avoid duplication, fetch only projected columns (#4888) Alexey Kudinkin 2022-03-25 09:32:03 -07:00
  • 12cc8e715b [MINOR] fix QuickstartUtils move (#5133) ForwardXu 2022-03-25 22:34:35 +08:00
  • e5c3f9089b [HUDI-3563] Make quickstart examples covered by CI tests (#5082) ForwardXu 2022-03-25 16:37:17 +08:00
  • f20c9867d7 [HUDI-3711] Fix typo in MaxwellJsonKafkaSourcePostProcessor.Config#PRECOMBINE_FIELD_TYPE_PROP (#5096) wangxianghu 2022-03-25 11:02:54 +04:00
  • 8b38ddedc2 [HUDI-3594] Supporting Composite Expressions over Data Table Columns in Data Skipping flow (#4996) Alexey Kudinkin 2022-03-24 22:27:15 -07:00
  • 8896864d7b [HUDI-3678] Fix record rewrite of create handle when 'preserveMetadata' is true (#5088) Danny Chan 2022-03-25 11:48:50 +08:00
  • 2fd9a4de5c [HUDI-3580] Claim RFC number 48 for LogCompaction action RFC (#5128) Surya Prasanna 2022-03-24 20:26:04 -07:00
  • 483ee843e6 [HUDI-3703] Reset taskID in restoreWriteMetadata (#5122) Zhaojing Yu 2022-03-25 10:18:28 +08:00