Commit Graph

  • de206acbae [HUDI-3369] New ScheduleAndExecute mode for HoodieCompactor and hudi-cli (#4750) YueZhang 2022-02-07 17:31:34 +08:00
  • 0880a8a5e4 [HUDI-3344] Standard format for HoodieDataSourceExample.scala (#4717) Qian.Sun 2022-02-07 11:27:44 +08:00
  • b8601a9f58 [HUDI-2656] Generalize HoodieIndex for flexible record data type (#3893) Y Ethan Guo 2022-02-03 20:24:04 -08:00
  • 69dfcda116 [HUDI-3191] Removing duplicating file-listing process w/in Hive's MOR FileInputFormats (#4556) Alexey Kudinkin 2022-02-03 14:01:41 -08:00
  • 5927bdd1c0 [HUDI-1295] Metadata Index - Bloom filter and Column stats index to speed up index lookups (#4352) Manoj Govindassamy 2022-02-03 04:42:48 -08:00
  • d681824982 [HUDI-3337] Fixing Parquet Column Range metadata extraction (#4705) Alexey Kudinkin 2022-02-02 17:58:05 -08:00
  • 819e8018ff [HUDI-3322][HUDI-3343] Fixing Metadata Table Records Duplication Issues (#4716) Alexey Kudinkin 2022-02-02 13:10:51 -08:00
  • a68e1dc2db [HUDI-431] Adding support for Parquet in MOR LogBlocks (#4333) Alexey Kudinkin 2022-02-02 11:35:05 -08:00
  • caef3d5c58 [HUDI-3330] Remove fixture test tables for multi writer tests (#4704) Raymond Xu 2022-02-02 04:20:10 -08:00
  • 72f7348830 [HUDI-2589] RFC-37: Metadata table based bloom index (#3989) Manoj Govindassamy 2022-02-01 15:38:20 -08:00
  • 16138db4f2 [HUDI-3368] Revert "[HUDI-3306] Upgrade rocksdb version (#4663)" (#4733) Sivabalan Narayanan 2022-02-01 14:18:38 -05:00
  • 4e61e5c9ea [HUDI-3293] Fixing default value for clustering small file config to 300MB (#4662) Sivabalan Narayanan 2022-02-01 08:22:37 -05:00
  • f140c58d9e [HUDI-3346] Fixing non existant marker dir handling in TwoToOnedowngrade (#4726) Sivabalan Narayanan 2022-02-01 08:21:55 -05:00
  • 7ce0f4522b [HUDI-2711] Fallback to fulltable scan for IncrementalRelation if underlying files have been cleared or moved by cleaner (#3946) jsbali 2022-02-01 09:33:18 +05:30
  • 4b388c104e [HUDI-3292] Enabling lazy read by default for log blocks during compaction (#4661) Sivabalan Narayanan 2022-01-31 22:36:17 -05:00
  • d3cfe07436 [HUDI-3318] [RFC-46] Optimize Record Payload handling (#4697) Alexey Kudinkin 2022-01-31 18:33:35 -08:00
  • ecbad9526a [HUDI-3253] preferred to use the table's own location (#4608) Yann Byron 2022-01-29 16:39:42 +08:00
  • ed7aa138e8 [MINOR] Added log to debug checkpoint resumption when set to 0 (#4650) Harsha Teja Kanna 2022-01-28 22:08:25 -06:00
  • c0e8b03d93 [HUDI-1977] Fix Hudi CLI tempview query issue (#4626) peanut-chenzhong 2022-01-29 10:39:08 +08:00
  • e78b2f1b55 [HUDI-2943] Complete pending clustering before deltastreamer sync (#4572) Sagar Sumit 2022-01-29 07:58:04 +05:30
  • 2b52a56981 [HUDI-2688][RFC-40] A new Hudi connector for Trino (#3957) Sagar Sumit 2022-01-28 19:13:11 +05:30
  • 0bd38f26ca [HUDI-2596] Make class names consistent in hudi-client (#4680) Raymond Xu 2022-01-27 17:05:08 -08:00
  • 4a9f826382 [HUDI-3215] Solve UT for Spark 3.2 (#4565) Yann Byron 2022-01-27 06:48:26 +08:00
  • 3f21e5f14c [MINOR] Fixing serializability of SerializableHoodieRollbackRequest (#4688) Sivabalan Narayanan 2022-01-26 16:45:35 -05:00
  • f87c47352a [HUDI-2763] Metadata table records - support for key deduplication based on hardcoded key field (#4449) Manoj Govindassamy 2022-01-26 10:34:04 -08:00
  • dd4ce1bdfd [HUDI-3328] Updating doap file for release 0.10.1 (#4689) Sivabalan Narayanan 2022-01-26 08:45:57 -05:00
  • 9363804b1d [MINOR] Fixing serializability with ListingBasedRollbackRequest (#4655) Sivabalan Narayanan 2022-01-25 19:35:37 -05:00
  • 78e6ab0e67 [HUDI-3217] Claim the number for RFC-46 (#4687) Alexey Kudinkin 2022-01-25 14:58:34 -08:00
  • 920f45926a [HUDI-1822] Rewriting rfc-27 for data skipping index (#4280) Sivabalan Narayanan 2022-01-25 00:27:59 -05:00
  • bf409e8423 [MINOR] Standardize HoodieSqlCommon.g4 file (#4582) xuzifu666 2022-01-25 10:09:08 +08:00
  • 26c3f797b0 [HUDI-3237] gracefully fail to change column data type (#4677) Yann Byron 2022-01-25 08:33:36 +08:00
  • bc7882cbe9 [HUDI-2872][HUDI-2646] Refactoring layout optimization (clustering) flow to support linear ordering (#4606) Alexey Kudinkin 2022-01-24 13:53:54 -08:00
  • 6f10107998 [HUDI-3306] Upgrade rocksdb version (#4663) Satyam Raj 2022-01-25 01:23:20 +05:30
  • 1f7b6b2154 [HUDI-2417] Add support allowDuplicateInserts in HoodieJavaClient (#3644) 董可伦 2022-01-25 03:26:27 +08:00
  • 87db4ded42 [MINOR] Add default value as null for S3 Incremental source properties (#4674) Vinish Reddy 2022-01-25 00:54:43 +05:30
  • 7bd389fb47 [MINOR] typo fix in BaseTableMetadata wrt spurious deletes handling (#4673) YueZhang 2022-01-24 20:09:54 +08:00
  • e00a9042e9 [HUDI-3072] Fixing conflict resolution in transaction management code path for auto commit code path (#4588) Sivabalan Narayanan 2022-01-24 05:43:28 -05:00
  • cfde45b548 [HUDI-3282] Fix delete exception for Spark SQL when sync Hive (#4644) 董可伦 2022-01-24 03:32:57 +08:00
  • f7a77961e3 [HUDI-1850][HUDI-3234] Fixing read of a empty table but with failed write (#2903) Sivabalan Narayanan 2022-01-23 14:23:21 -05:00
  • e72553accf [HUDI-3262] Fixing utilities and integ test suite bundle to include hudi spark datasource (#4670) Sivabalan Narayanan 2022-01-23 08:46:37 -05:00
  • 56cd8ffae0 [HUDI-2837] Add support for using database name in incremental query (#4083) 董可伦 2022-01-23 14:11:27 +08:00
  • 64b1426005 [minor] Fix hive-exec scope of flink bundle jar (#4664) Danny Chan 2022-01-23 10:28:41 +08:00
  • 4b9085057a [HUDI-3268] Fix NPE while reading table with Spark datasource (#4630) Y Ethan Guo 2022-01-21 05:46:07 -08:00
  • 8547f11752 [HUDI-3271] Code optimization and clean up unused code in HoodieSparkSqlWriter (#4631) 董可伦 2022-01-21 07:49:04 +08:00
  • 79bf6ab00b [HUDI-3281][Performance]Tuning performance of getAllPartitionPaths API in FileSystemBackedTableMetadata (#4643) YueZhang 2022-01-21 07:47:02 +08:00
  • 2071e3bfda [HUDI-3250] Upgrade Presto docker image (#4646) Sagar Sumit 2022-01-20 23:00:25 +05:30
  • a66004a340 [HUDI-3285] Drop unused method SparkBootstrapCommitActionExecutor#handleMetadataBootstrap (#4653) wangxianghu 2022-01-20 20:04:36 +04:00
  • 14d08bb64c [MINOR] Fix typo in the doc of BULK_INSERT_SORT_MODE (#4652) wangxianghu 2022-01-20 15:34:56 +04:00
  • b7a79aa943 [HUDI-3283] Bootstrap support overwrite existing table (#4647) wangxianghu 2022-01-20 14:42:52 +04:00
  • 31b57a256f [HUDI-3236] use fields'comments persisted in catalog to fill in schema (#4587) Yann Byron 2022-01-20 13:44:35 +08:00
  • a08a2b7306 [MINOR] Add instructions to build and upload Docker Demo images (#4612) Y Ethan Guo 2022-01-19 20:25:28 -08:00
  • db93ad2f4b [HUDI-3277] Filter non-parquet files in bootstrap procedure (#4639) wangxianghu 2022-01-19 21:13:51 +04:00
  • 7647562dad [HUDI-2833][Design] Merge small archive files instead of expanding indefinitely. (#4078) YueZhang 2022-01-19 14:42:35 +08:00
  • 4bea758738 [HUDI-3191] Rebasing Hive's FileInputFormat onto AbstractHoodieTableFileIndex (#4531) Alexey Kudinkin 2022-01-18 14:54:51 -08:00
  • caeea946fb [HUDI-3245] Convert uppercase letters to lowercase in storage configs (#4602) Thinking Chen 2022-01-19 03:51:09 +08:00
  • a09c231911 [HUDI-2903] get table schema from the last commit with data written (#4180) Yann Byron 2022-01-18 23:50:30 +08:00
  • 45f054ffde [HUDI-3263] Do not nullify members in HoodieTableFileSystemView#resetViewState to avoid NPE (#4625) Danny Chan 2022-01-18 17:46:40 +08:00
  • 3b56320bd8 [HUDI-3261] Read rt table by hive cli throw NoSuchMethodError (#4624) EchoLee5 2022-01-18 16:58:08 +08:00
  • 3d93e857cc [MINOR] Minor improvement in JsonkafkaSource (#4620) wangxianghu 2022-01-18 11:13:05 +04:00
  • f18447406d [HUDI-1558] Struct Stream Source Support Spark3 (#4586) RexAn 2022-01-18 11:08:33 +08:00
  • 20e7983866 [HUDI-3252] Avoid creating empty requestedReplaceCommit in the startCommit method (#4515) 董可伦 2022-01-18 06:28:18 +08:00
  • d36533735f [HUDI-3194] fix MOR snapshot query during compaction (#4540) Yuwei XIAO 2022-01-18 06:24:24 +08:00
  • 36a9f63e45 [HUDI-3257] Excluding clustering instants from pending rollback info (#4616) Danny Chan 2022-01-17 18:18:45 +08:00
  • 75caa7d3d8 [HUDI-3179] Extracted common AbstractHoodieTableFileIndex to be shared across engines (#4520) Alexey Kudinkin 2022-01-16 22:46:20 -08:00
  • ed92c217ed [MINOR] Delete unused parameter in TablePathUtils (#4595) xiaotianzhang01 2022-01-17 14:24:43 +08:00
  • d2dda55794 [HUDI-2968] add UT for update/delete on non-pk condition (#4568) Yann Byron 2022-01-17 04:02:12 +08:00
  • 28b3b6ad8f [MINOR] Remove org.apache.directory.api.util.Strings import (#4601) 0x574C 2022-01-16 16:58:18 +08:00
  • 822230d9ea [MINOR] Optimize variable names and logs (#4581) 董可伦 2022-01-16 16:09:22 +08:00
  • 5e0171a5ee [HUDI-3198] Improve Spark SQL create table from existing hudi table (#4584) Yann Byron 2022-01-15 02:15:29 +08:00
  • 53f75f84b8 [HUDI-2785] Add Trino setup in Docker Demo (#4300) Y Ethan Guo 2022-01-14 08:38:55 -08:00
  • 7d163ee3de [MINOR] Fix local flaky test in TestFSUtils (#4596) Y Ethan Guo 2022-01-13 22:48:57 -08:00
  • 5ce45c440b [HUDI-3172] Refactor hudi existing modules to make more code reuse in V2 Implementation (#4514) leesf 2022-01-14 13:42:35 +08:00
  • 195dac90fa [MINOR] Disable flaky tests to unlock CI (#4592) Sagar Sumit 2022-01-14 09:13:27 +05:30
  • 209f91cb33 [HUDI-3010] Unbundle parquet-avro and shade other dependencies in prsto bundle (#4551) Sagar Sumit 2022-01-13 09:30:24 +05:30
  • 397795c7d0 [HUDI-3007] Fix issues in HoodieRepairTool (#4564) Y Ethan Guo 2022-01-12 09:03:27 -08:00
  • 12e95771ee [HUDI-3235] Fix ClassNotFoundException due to log4j-core dependency (#4574) Sagar Sumit 2022-01-12 22:23:43 +05:30
  • 8a40d95506 [HUDI-3225] Claim RFC-45 for async metadata indexing (#4569) Sagar Sumit 2022-01-12 22:23:01 +05:30
  • 2969fb3835 [HUDI-3233] Make metadata commit synchronous for flink batch todd5167 2022-01-12 13:34:09 +08:00
  • 9fe28e56b4 [HUDI-3045] New clustering regex match config to choose partitions when building clustering plan (#4346) YueZhang 2022-01-12 15:23:55 +08:00
  • 017ddbbfac [MINOR] Fix typos (#4567) 董可伦 2022-01-12 15:17:10 +08:00
  • 4b0111974f [HUDI-3184] hudi-flink support timestamp-micros (#4548) Town 2022-01-11 20:53:51 -06:00
  • a392e9ba46 [HUDI-485] Corrected the check for incremental sql (#2768) Pratyaksh Sharma 2022-01-12 08:22:07 +05:30
  • 6cdcd89afa [HUDI-3094] Unify Hive's InputFormat implementations to avoid duplication (#4417) Alexey Kudinkin 2022-01-11 15:02:13 -08:00
  • 4b2fd37fb4 [MINOR] Remove unused static var in HoodieAvroWriteSupport (#4543) xuzifu666 2022-01-12 03:53:45 +08:00
  • c9bc626299 [HUDI-3211] Claim RFC number for RFC for Hudi Connector for Presto (#4562) Todd Gao 2022-01-11 16:38:27 +08:00
  • f74cd57320 [HUDI-3195] Fix spark 3 pom (#4554) Raymond Xu 2022-01-10 19:11:22 -08:00
  • 67ad4992e1 Removing extraneous warn logs in ClusteringUtils (#4553) Sivabalan Narayanan 2022-01-10 21:50:14 -05:00
  • f1e3762a94 [HUDI-2950] Addressing performance traps in Bulk Insert/Layout Optimization (#4234) Alexey Kudinkin 2022-01-10 18:23:22 -08:00
  • c8df9b09d7 [HUDI-3148] Create pushgateway client based on port (#4497) t0il3ts0ap 2022-01-11 04:39:47 +05:30
  • f230eca9b5 [MINOR] Fix port number in setupKafka.sh (#4546) Y Ethan Guo 2022-01-10 13:07:52 -08:00
  • 7a8b94c82d [HUDI-3180] Include files from completed commits while bootstrapping metadata table (#4519) Sivabalan Narayanan 2022-01-10 15:33:15 -05:00
  • bc95571caa [HUDI-2735] Allow empty commits in Kafka Connect Sink for Hudi (#4544) Y Ethan Guo 2022-01-10 12:31:25 -08:00
  • 251d4eb3b6 [HUDI-3030] InProcessLockPovider as default when any async servcies enabled with no lock provider override (#4406) Manoj Govindassamy 2022-01-09 19:10:24 -08:00
  • 56f93f4ebd Removing rollbacks instants from timeline for restore operation (#4518) Sivabalan Narayanan 2022-01-09 21:14:28 -05:00
  • e9a7f49f55 [HUDI-3112] Fix KafkaConnect cannot sync to Hive Problem (#4458) Thinking Chen 2022-01-10 07:31:57 +08:00
  • 604d9885f1 [HUDI-3009] making some fixes to S3 incremental source (#4517) Sivabalan Narayanan 2022-01-09 12:46:52 -05:00
  • 977d3c6dad [HUDI-3157] Remove aws jars from hudi bundles (#4542) RexAn 2022-01-09 18:23:46 +08:00
  • cf362fb2d5 [MINOR] Fix some code style issues based on check-style plugin (#4532) YueZhang 2022-01-09 17:14:56 +08:00
  • 36790709f7 [HUDI-3125] spark-sql write timestamp directly (#4471) Yann Byron 2022-01-09 15:43:25 +08:00
  • 0d8ca8da4e [HUDI-3104] Kafka-connect support of hadoop config environments and properties (#4451) Thinking Chen 2022-01-09 15:10:17 +08:00