Commit Graph

  • eaa4c4f2e2 [HUDI-1180] Upgrade HBase to 2.4.9 (#5004) Y Ethan Guo 2022-03-24 19:04:53 -07:00
  • 5e86cdd1e9 [HUDI-3701] Flink bulk_insert support bucket hash index (#5118) Danny Chan 2022-03-25 09:01:42 +08:00
  • 608d4bf32d [HUDI-3638] Make ZookeeperBasedLockProvider serializable (#5112) Y Ethan Guo 2022-03-24 17:59:47 -07:00
  • 9b3dd2e0b7 [HUDI-3624] Check all instants before starting a commit in metadata table (#5098) Y Ethan Guo 2022-03-24 17:13:58 -07:00
  • 4ddd094ba2 [HUDI-3689] Disable flaky tests in TestHoodieDeltaStreamer (#5127) Y Ethan Guo 2022-03-24 16:42:44 -07:00
  • ff136658a0 [HUDI-3689] Fix delta streamer tests (#5124) Raymond Xu 2022-03-24 14:19:53 -07:00
  • 44ab3b73ed [HUDI-3706] Downgrade maven surefire and failsafe version (#5123) Y Ethan Guo 2022-03-24 09:31:46 -07:00
  • 686da41696 [HUDI-3689] Fix UT failures in TestHoodieDeltaStreamer (#5120) Raymond Xu 2022-03-24 09:10:33 -07:00
  • b14706502b [HUDI-3689] Remove Azure CI cache (#5121) Raymond Xu 2022-03-24 05:39:11 -07:00
  • ccc3728002 [HUDI-3684] Fixing NPE in ParquetUtils (#5102) Alexey Kudinkin 2022-03-24 05:07:38 -07:00
  • fe2c3989e3 [HUDI-3689] Fix glob path and hive sync in deltastreamer tests (#5117) Sagar Sumit 2022-03-24 15:48:35 +05:30
  • a1c42fcc07 [minor] Checks the data block type for archived timeline (#5106) Danny Chan 2022-03-24 14:10:43 +08:00
  • 52f0498330 Fixing non partitioned all files record in MDT (#5108) Sivabalan Narayanan 2022-03-23 19:26:39 -07:00
  • f96ba7abf0 [HUDI-3642] Handle NPE due to empty requested replacecommit metadata (#5090) Sagar Sumit 2022-03-24 00:43:02 +05:30
  • 5f570ea151 [HUDI-2883] Refactor hive sync tool / config to use reflection and standardize configs (#4175) Rajesh Mahindra 2022-03-21 19:56:31 -07:00
  • 9b6e138af2 [HUDI-3640] Set SimpleKeyGenerator as default in 2to3 table upgrade for Spark engine (#5075) Y Ethan Guo 2022-03-21 17:35:06 -07:00
  • ca0931d332 [HUDI-1436]: Provide an option to trigger clean every nth commit (#4385) Pratyaksh Sharma 2022-03-22 05:36:30 +05:30
  • 26e5d2e6fc [HUDI-3559] Flink bucket index with COW table throws NoSuchElementException wxp4532 2022-03-11 14:07:52 +08:00
  • a118d56b07 [MINOR] Fixing sparkUpdateNode for record generation (#5079) Sivabalan Narayanan 2022-03-20 21:56:30 -07:00
  • 799c78e688 [HUDI-3665] Support flink multiple versions (#5072) Danny Chan 2022-03-21 10:34:50 +08:00
  • 15d1c18625 [MINOR] Remove flaky assert in TestInLineFileSystem (#5069) Y Ethan Guo 2022-03-20 15:58:30 -07:00
  • 1b6e201160 [HUDI-3663] Fixing Column Stats index to properly handle first Data Table commit (#5070) Alexey Kudinkin 2022-03-19 21:54:13 -07:00
  • 099c2c099a [HUDI-3457] Refactored Spark DataSource Relations to avoid code duplication (#4877) Alexey Kudinkin 2022-03-18 22:32:16 -07:00
  • 316e38c71e [HUDI-3659] Reducing the validation frequency with integ tests (#5067) Sivabalan Narayanan 2022-03-18 09:45:33 -07:00
  • 2551c26183 [HUDI-3656] Adding medium sized dataset for clustering and minor fixes to integ tests (#5063) Sivabalan Narayanan 2022-03-18 09:44:56 -07:00
  • 6fe4d6e2f6 [HUDI-3598] Row Data to Hoodie Record Operator parallelism needs to always be consistent with input operator (#5049) JerryYue-M 2022-03-18 10:47:29 +08:00
  • 9ece77561a [MINOR] HoodieFileScanRDD could print null path (#5056) RexAn 2022-03-18 03:53:45 +08:00
  • 7446ff95a7 [HUDI-2439] Replace RDD with HoodieData in HoodieSparkTable and commit executors (#4856) Raymond Xu 2022-03-17 19:17:56 +08:00
  • bf191f8d46 [HUDI-3645] Fix NPE caused by multiple threads accessing non-thread-safe HashMap (#5028) 冯健 2022-03-17 16:50:28 +08:00
  • 5ba2d9ab2f [HUDI-3494] Consider triggering condition of MOR compaction during archival (#4974) Y Ethan Guo 2022-03-16 22:28:11 -07:00
  • 95e6e53810 [HUDI-3404] Automatically adjust write configs based on metadata table and write concurrency mode (#4975) Y Ethan Guo 2022-03-16 22:25:04 -07:00
  • 8ca9a54db0 [Hudi-3376] Add an option to skip under deletion files for HoodieMetadataTableValidator (#4994) YueZhang 2022-03-17 09:31:00 +08:00
  • 91849c3d66 [HUDI-3607] Support backend switch in HoodieFlinkStreamer (#5032) that's cool 2022-03-16 14:07:31 +08:00
  • 296a0e6bcf [HUDI-3588] Remove hudi-common and hudi-hadoop-mr jars in Presto Docker image (#4997) Y Ethan Guo 2022-03-15 18:49:30 -07:00
  • 55dca969f9 [HUDI-3589] flink sync hive metadata supports table properties and serde properties (#4995) todd5167 2022-03-16 03:56:37 +08:00
  • d514570e90 [HUDI-3633] Allow non-string values to be set in TypedProperties (#5045) Sagar Sumit 2022-03-16 00:03:22 +05:30
  • 5e8ff8d793 [HUDI-3514] Rebase Data Skipping flow to rely on MT Column Stats index (#4948) Alexey Kudinkin 2022-03-15 10:38:36 -07:00
  • 9bdda2a312 [HUDI-3619] Fix HoodieOperation fromValue using wrong constant value (#5033) l-shen 2022-03-15 20:34:31 +08:00
  • 6ed7106e59 [HUDI-3606] Add org.objenesis:objenesis to hudi-timeline-server-bundle pom (#5017) Thinking Chen 2022-03-15 19:06:50 +08:00
  • 3b59b76952 [HUDI-3547] Introduce MaxwellSourcePostProcessor to extract data from Maxwell json string (#4987) wangxianghu 2022-03-15 15:06:30 +04:00
  • d40adfa2d7 [HUDI-3620] Adding spark3.2.0 profile (#5038) Sivabalan Narayanan 2022-03-14 16:14:00 -07:00
  • 30cf39301e [HUDI-3623] Removing hive sync node from non hive yamls (#5040) Sivabalan Narayanan 2022-03-14 15:39:26 -07:00
  • 22c3ce73db [HUDI-3621] Fixing NullPointerException in DeltaStreamer (#5039) Sivabalan Narayanan 2022-03-14 15:34:17 -07:00
  • 003c6ee73e [MINODR] Remove repeated kafka-clients dependencies (#5034) wangxianghu 2022-03-14 18:24:06 +04:00
  • 4b75cb6f23 fix NPE when run schdule using spark-sql if the commits time < hoodie.compact.inline.max.delta.commits (#4976) peanut-chenzhong 2022-03-14 16:40:38 +08:00
  • 465d553df8 [HUDI-3600] Tweak the default cleaning strategy to be more streaming friendly for flink (#5010) Danny Chan 2022-03-14 14:22:07 +08:00
  • 1ba8220617 [HUDI-3613] Adding/fixing yamls for metadata (#5029) Sivabalan Narayanan 2022-03-13 18:11:37 -07:00
  • 6c8224cae6 [HUDI-3501] Support savepoints command based on Call Produce Command (#5025) ForwardXu 2022-03-13 20:58:21 +08:00
  • e60acc1258 [HUDI-3583] Fix MarkerBasedRollbackStrategy NoSuchElementException (#4984) liujinhui 2022-03-13 15:00:50 +08:00
  • eee96e9af3 [HUDI-3593] Restore TypedProperties and flush checksum in table config (#5013) Sagar Sumit 2022-03-13 07:58:55 +05:30
  • e7bb0413af [HUDI-3556] Re-use rollback instant for rolling back of clustering and compaction if rollback failed mid-way (#4971) Sivabalan Narayanan 2022-03-11 15:40:13 -08:00
  • e8918b6c2c [HUDI-3569] Introduce ChainedJsonKafkaSourePostProcessor to support setting multi processors at once (#4969) wangxianghu 2022-03-12 02:49:30 +04:00
  • 93277b2bcd [HUDI-3592] Fix NPE of DefaultHoodieRecordPayload if Property is empty (#4999) RexAn 2022-03-12 06:45:40 +08:00
  • 5d59bf67ae [HUDI-3513] Make sure Column Stats does not fail in case it fails to load previous Index Table state (#5015) Alexey Kudinkin 2022-03-11 14:39:22 -08:00
  • 56cb49485d [HUDI-3567] Refactor HoodieCommonUtils to make code more reasonable (#4982) huberylee 2022-03-12 05:23:19 +08:00
  • b00180342e [HUDI-3575] Use HoodieTestDataGenerator#TRIP_SCHEMA as example schema in TestSchemaPostProcessor (#5019) wangxianghu 2022-03-11 15:03:42 +04:00
  • faed6996ee [HUDI-3566] Add thread factory in BoundedInMemoryExecutor (#4926) 苏承祥 2022-03-11 18:58:49 +08:00
  • 18cdad9206 [HUDI-2999] [RFC-42] RFC for consistent hashing index (#4326) Yuwei XIAO 2022-03-11 14:41:01 +08:00
  • 83cff3afee [HUDI-3522] Introduce DropColumnSchemaPostProcessor to support drop columns from schema (#4972) wangxianghu 2022-03-11 09:30:37 +04:00
  • 9dc6df5dca [HUDI-3595] Fixing NULL schema provider for empty batch (#5002) Sivabalan Narayanan 2022-03-10 19:52:55 -08:00
  • fa5e75068e [HUDI-3586] Add Trino Queries in integration tests (#4988) Y Ethan Guo 2022-03-10 18:17:32 -08:00
  • 4e09545be4 [HUDI-3602][DOCS] Update docker README to build multi-arch images using buildx (#5011) Sagar Sumit 2022-03-10 16:08:27 +05:30
  • ec24407191 [HUDI-3581] Reorganize some clazz for hudi flink (#4983) Danny Chan 2022-03-10 15:55:15 +08:00
  • 034addaef5 [HUDI-3396] Make sure BaseFileOnlyViewRelation only reads projected columns (#4818) Alexey Kudinkin 2022-03-09 18:45:25 -08:00
  • ca0b8fccee [MINOR] Add IT CI Test timeout option (#5003) ForwardXu 2022-03-10 10:04:36 +08:00
  • 8859b48b2a [HUDI-3383] Sync column comments while syncing a hive table (#4960) MrSleeping123 2022-03-10 09:44:39 +08:00
  • 548000b0d6 [HUDI-3568] Introduce ChainedSchemaPostProcessor to support setting multi processors at once (#4968) wangxianghu 2022-03-09 11:16:22 +04:00
  • 4324e874ae [HUDI-3587] Making SupportsUpgradeDowngrade serializable (#4991) Sivabalan Narayanan 2022-03-08 21:04:42 -08:00
  • 08fd80c913 [HUDI-3221] Support querying a table as of a savepoint (#4720) ForwardXu 2022-03-09 02:02:34 +08:00
  • 575bc63468 [HUDI-3356][HUDI-3203] HoodieData for metadata index records; BloomFilter construction from index based on the type param (#4848) Sagar Sumit 2022-03-08 21:09:04 +05:30
  • ed26c5265c [HUDI-3584] Skip integ test modules by default (#4986) Raymond Xu 2022-03-08 06:32:04 -08:00
  • 25385805aa [HUDI-3574] Improve maven module configs for different spark profiles (#4970) ForwardXu 2022-03-08 17:01:05 +08:00
  • fe53bd2dea [HUDI-2677] Add DFS based message queue for flink writer[part3] (#4961) Danny Chan 2022-03-08 15:43:21 +08:00
  • b6bdb46f7f [MINOR][HUDI-3460]Fix HoodieDataSourceITCase Bo 2022-03-06 10:55:04 +08:00
  • 34bc752853 [HUDI-3573] flink cleanFuntion execute clean on initialization (#4936) todd5167 2022-03-08 11:53:54 +08:00
  • 29040762fa [HUDI-3576] Configuring timeline refreshes based on latest commit (#4973) Sivabalan Narayanan 2022-03-07 17:01:49 -05:00
  • 53826d69e4 [HUDI-2747] support set --sparkMaster for MDT cli (#4964) YueZhang 2022-03-08 05:57:03 +08:00
  • a66fd40692 [HUDI-3365] Make sure Metadata Table records are updated appropriately on HDFS (#4739) Alexey Kudinkin 2022-03-07 12:38:27 -08:00
  • f0bcee3c01 [HUDI-3561] Avoid including whole MultipleSparkJobExecutionStrategy object into the closure for Spark to serialize (#4954) Alexey Kudinkin 2022-03-07 10:42:03 -08:00
  • 3539578ccb [HUDI-3213] Making commit preserve metadata to true for compaction (#4811) Sivabalan Narayanan 2022-03-07 07:32:05 -05:00
  • 6f57bbfac4 [HUDI-3069] Improve HoodieMergedLogRecordScanner avoid putting unnecessary hoodie records (#4932) 苏承祥 2022-03-07 14:35:55 +08:00
  • c9ffdc493e [HUDI-3525] Introduce JsonkafkaSourceProcessor to support data preprocess before it is transformed to DataSet (#4930) wangxianghu 2022-03-07 00:41:01 +04:00
  • 4b471772aa [HUDI-3520] Introduce DeleteSupportSchemaPostProcessor to support adding _hoodie_is_deleted column to schema (#4921) wangxianghu 2022-03-07 00:37:09 +04:00
  • 051ad0b033 [HUDI-3130] Fixing Hive getSchema for RT tables addressing different partitions having different schemas (#4468) Aditya Tiwari 2022-03-06 07:51:35 +05:30
  • 6a46130037 [HUDI-2761] Fixing timeline server for repeated refreshes (#4812) Sivabalan Narayanan 2022-03-04 21:04:16 -05:00
  • 0986d5a01d [HUDI-3460] Add reader merge memory option for flink (#4911) Bo Cui 2022-03-04 19:29:29 +08:00
  • b4362fac45 [HUDI-3348] Add UT to verify HoodieRealtimeFileSplit serde (#4951) Raymond Xu 2022-03-03 23:19:16 -08:00
  • f449807630 [MINOR] fix UTC timezone config (#4950) Yuwei XIAO 2022-03-04 15:09:39 +08:00
  • 6faed3d90a [HUDI-3161][RFC-47] Add Call Produce Command for Spark SQL (#4607) ForwardXu 2022-03-04 12:02:46 +08:00
  • 62f534d002 [HUDI-3445] Support Clustering Command Based on Call Procedure Command for Spark SQL (#4901) shibei 2022-03-04 09:33:16 +08:00
  • be9a264885 [HUDI-3548] Fix if user specify key "hoodie.datasource.clustering.async.enable" directly, async clustering not work (#4905) RexAn 2022-03-04 08:14:07 +08:00
  • a4ba0fff07 [HUDI-3552] Strength the NetworkUtils#getHostname by checking network interfaces first (#4942) Danny Chan 2022-03-03 21:11:08 +08:00
  • 876a891979 [HUDI-3544] Fixing "populate meta fields" update to metadata table (#4941) Sivabalan Narayanan 2022-03-03 06:32:25 -05:00
  • 51ee5005a6 [HUDI-2973] RFC-27: Data skipping index to improve query performance (#4728) Manoj Govindassamy 2022-03-03 02:26:22 -08:00
  • 907e60c252 [HUDI-3264]: made schema registry urls configurable with MTDS (#4779) Pratyaksh Sharma 2022-03-03 02:00:41 +05:30
  • 527bd34b1c [MINOR] RFC-38 markdown content error (#4933) liujinhui 2022-03-02 23:40:28 +08:00
  • f8945eca08 [MINOR] Adding more test props to integ tests (#4935) Sivabalan Narayanan 2022-03-02 08:10:43 -05:00
  • 1d57bd17c2 [minor] Cosmetic changes following HUDI-3315 (#4934) Danny Chan 2022-03-02 17:44:52 +08:00
  • 10d866f083 [HUDI-3315] RFC-35 Part-1 Support bucket index in Flink writer (#4679) Gary Li 2022-03-02 15:14:44 +08:00
  • 85f47b53df [HUDI-3469] Refactor HoodieTestDataGenerator to provide for reproducible Builds (#4866) Alexey Kudinkin 2022-03-01 22:15:26 -08:00