Commit Graph

  • d9581682a2 Add option to control use hsync or not 0.13.0 v-zhangjc9 2024-04-24 15:33:03 +08:00
  • 181df2240a Fix bug for schedule compaction manually v-zhangjc9 2023-10-30 15:27:58 +08:00
  • 2188b8ed8a Use hoodie table path to be uid avoid that the same name cannot be start in one job v-zhangjc9 2022-07-01 09:48:06 +08:00
  • 6be03ca56a Down the reader mem check v-zhangjc9 2022-06-30 19:13:51 +08:00
  • 215a794fd3 Add victoria metrics reporter v-zhangjc9 2022-06-30 17:50:53 +08:00
  • eb4b741c38 If there are multiple files under the same partition path and file ID, sort them according to the modification time of the files to avoid reading the files that failed to write before. jcxiaozf 2022-05-17 16:05:45 +08:00
  • 5c4908f006 Add closed handler to HoodieFlinkCompactor v-zhangjc9 2022-05-15 15:45:29 +08:00
  • 0ac43017cb Fix NPE when offline compaction could not find schema from data file v-zhangjc9 2022-04-13 09:51:10 +08:00
  • 32f7e323dc Change version to private v-zhangjc9 2022-04-11 15:35:04 +08:00
  • 8462d79ead Change hadoop version to 3.1.2 v-zhangjc9 2022-04-11 15:10:58 +08:00
  • 46ce96096d Add private repo v-zhangjc9 2022-04-11 15:08:53 +08:00
  • 765dd2eae6 [HUDI-4221] Optimzing getAllPartitionPaths (#6234) Sivabalan Narayanan 2022-07-29 03:49:56 -04:00
  • ce4330d62b [HUDI-4499] Tweak default retry times for flink metadata table lock (#6238) Danny Chan 2022-07-29 15:01:29 +08:00
  • c39e88dcf0 [HUDI-4495] Fix handling of S3 paths incompatible with java URI standards (#6237) Udit Mehrotra 2022-07-28 20:04:14 -07:00
  • cfd0c1ee34 [HUDI-4081][HUDI-4472] Addressing Spark SQL vs Spark DS performance gap (#6213) Alexey Kudinkin 2022-07-28 15:36:03 -07:00
  • 70b5cf6dab [MINOR] Minor changes around Spark 3.3 support (#6231) Shawn Chang 2022-07-28 09:32:34 -07:00
  • ea1fbc71ec [HUDI-4494] keep the fields' order when data is written out of order (#6233) Yann Byron 2022-07-28 22:15:01 +08:00
  • 07eedd3ef6 [HUDI-4484] Add default lock config options for flink metadata table (#6222) Danny Chan 2022-07-28 20:57:13 +08:00
  • 0a5ce000bf [HUDI-4490] Make AWSDmsAvroPayload class backwards compatible (#6229) Rahil C 2022-07-27 19:55:06 -07:00
  • 51599af281 [HUDI-4126] Disable file splits for Bootstrap real time queries (via InputFormat) (#6219) Rahil C 2022-07-27 14:58:29 -07:00
  • cdaec5a8da [HUDI-4186] Support Hudi with Spark 3.3.0 (#5943) Shawn Chang 2022-07-27 14:47:49 -07:00
  • 924c30c7ea [HUDI-4469] Flip reuse flag to true in HoodieBackedTableMetadata to improve file listing (#6214) Y Ethan Guo 2022-07-27 14:04:59 -07:00
  • 717f159bfd [HUDI-3730] Keep metasync configs backward compatible (#6221) Shiyan Xu 2022-07-27 05:30:44 -05:00
  • e5faf2cc84 [HUDI-4210] Create custom hbase index to solve data skew issue on hbase regions (#5797) 冯健 2022-07-26 18:09:17 +08:00
  • 1ea1e659c2 [HUDI-4474] Infer metasync configs (#6217) Shiyan Xu 2022-07-26 04:58:31 -05:00
  • 74d7b4d751 [HUDI-4471] Relocate AWSDmsAvroPayload class to hudi-common Dongwook Kwon 2021-03-10 17:42:09 -08:00
  • e7c8df7e8b [HUDI-4250][HUDI-4202] Optimize performance of Column Stats Index reading in Data Skipping (#5746) Alexey Kudinkin 2022-07-25 15:36:12 -07:00
  • 6e7ac45735 [HUDI-3884] Support archival beyond savepoint commits (#5837) Sagar Sumit 2022-07-26 00:12:29 +05:30
  • eee6a02f77 [HUDI-4456] Clean up test resources (#6203) Shiyan Xu 2022-07-25 10:13:06 -05:00
  • 71c2c3102b [HUDI-4455] Improve test classes for TestHiveSyncTool (#6202) Shiyan Xu 2022-07-25 08:35:34 -05:00
  • 1fda9ee9bb [HUDI-4071] Match ROLLBACK_USING_MARKERS_ENABLE in sql as datasource (#6206) superche 2022-07-25 18:40:23 +08:00
  • b513232449 [HUDI-4458] Add a converter cache for flink ColumnStatsIndices (#6205) Danny Chan 2022-07-25 17:49:01 +08:00
  • f6e7227ed5 [MINOR] Only log stdout output for non-zero exit from commands in IT (#6199) Y Ethan Guo 2022-07-24 22:08:33 -07:00
  • 76a28daeb0 [HUDI-4456] Close FileSystem in SparkClientFunctionalTestHarness (#6201) Tim Brown 2022-07-24 21:42:15 -07:00
  • 2a08a65f71 [MINOR] Fix typos in Spark client related classes (#6204) Vander 2022-07-25 12:41:42 +08:00
  • 1a910fd473 [HUDI-3510] Add sync validate procedure (#6200) simonsssu 2022-07-25 09:28:46 +08:00
  • a54c963543 [HUDI-4348] fix merge into sql data quality in concurrent scene (#6020) KnightChess 2022-07-24 21:29:47 +08:00
  • 1a5a9f7f03 [HUDI-4439] Fix Amazon CloudWatch reporter for metadata enabled tables (#6164) Rahil C 2022-07-23 21:08:21 -07:00
  • ba11082282 [HUDI-4450] Revert the checkpoint abort notification (#6181) Danny Chan 2022-07-24 08:44:22 +08:00
  • a0ffd05b77 [HUDI-4448] Remove the latest commit refresh for timeline server (#6179) Danny Chan 2022-07-24 07:10:53 +08:00
  • 2d745057ea [HUDI-4420] Fixing table schema delineation on partition/data schema for Spark relations (#5708) Alexey Kudinkin 2022-07-23 14:59:16 -07:00
  • da28e38fe3 [HUDI-4071] Make NONE sort mode as default for bulk insert (#6195) Sagar Sumit 2022-07-24 01:07:04 +05:30
  • f1f0109ab8 [HUDI-4440] Treat boostrapped table as non-partitioned in HudiFileIndex if partition column is missing from schema (#6163) Rahil C 2022-07-23 11:44:40 -07:00
  • f0e843249c [MINOR] Bump CI timeout to 150m (#6198) Shiyan Xu 2022-07-23 10:07:51 -05:00
  • 859157ec01 [MINOR] Fix Call Procedure code style (#6186) superche 2022-07-23 17:18:38 +08:00
  • a5348cc685 [HUDI-4436] Invalidate cached table in Spark after write (#6159) Rahil C 2022-07-22 22:47:47 -07:00
  • 340c3dbbe1 [HUDI-4437] Fix test conflicts by clearing file system cache (#6123) 冯健 2022-07-23 08:58:04 +08:00
  • af10a97e7a [HUDI-4435] Fix Avro field not found issue introduced by Avro 1.10 (#6155) Rahil C 2022-07-22 17:26:16 -07:00
  • d5c7c79d87 Revert "[HUDI-4324] Remove use_jdbc config from hudi sync (#6072)" (#6160) Shiyan Xu 2022-07-22 19:18:45 -05:00
  • a36762a862 [HUDI-4303] Use Hive sentinel value as partition default to avoid type caste issues (#5954) Sagar Sumit 2022-07-23 05:44:36 +05:30
  • 39f2a06c85 [HUDI-3979] Optimize out mandatory columns when no merging is performed (#5430) Alexey Kudinkin 2022-07-22 15:32:44 -07:00
  • 6b84384022 Revert "[MINOR] Fix CI issue with TestHiveSyncTool (#6110)" (#6192) Shiyan Xu 2022-07-22 14:20:39 -05:00
  • 716dd3512b [MINOR] Disable Flink compactor IT test (#6189) Sagar Sumit 2022-07-22 22:46:55 +05:30
  • eea4a692c0 [HUDI-4039] Make sure all builtin KeyGenerators properly implement Spark specific APIs (#5523) Alexey Kudinkin 2022-07-22 08:35:07 -07:00
  • d5c904e10e [MINOR] Fix CI issue with TestHiveSyncTool (#6110) Shiyan Xu 2022-07-22 10:30:00 -05:00
  • 41653fc708 [MINOR] Fallback to default for hive-style partitioning, url-encoding configs (#6175) Alexey Kudinkin 2022-07-22 06:25:58 -07:00
  • 51b5783161 [HUDI-4404] Fix insert into dynamic partition write misalignment (#6124) ForwardXu 2022-07-22 09:40:52 +08:00
  • 8e0b47e360 [MINOR] Fix result missing information issue in commits_compare Procedure (#6165) superche 2022-07-22 07:25:22 +08:00
  • 36e656aa77 [HUDI-4247] Upgrading protocol buffers version for presto bundle (#5852) Sivabalan Narayanan 2022-07-21 18:58:40 -04:00
  • 2e0dd29714 [HUDI-4204] Fixing NPE with row writer path and with OCC (#5850) Sivabalan Narayanan 2022-07-21 18:57:34 -04:00
  • 50cdb867c7 [HUDI-4400] Fix missing bloom filters in metadata table in non-partitioned table (#6113) Y Ethan Guo 2022-07-21 11:38:25 -07:00
  • f52b93fd10 Merge pull request #6154 from rahil-c/rahil-c/disable-emrSpark-properties wenningd 2022-07-21 11:35:52 -07:00
  • 2bf7920bd9 [MINOR] Add logger for HoodieCopyOnWriteTableInputFormat (#6161) Rahil C 2022-07-21 09:57:18 -07:00
  • a33bdd32e3 [HUDI-3993] Replacing UDF in Bulk Insert w/ RDD transformation (#5470) Alexey Kudinkin 2022-07-21 06:20:47 -07:00
  • c7fe3fd01d [HUDI-3764] Allow loading external configs while querying Hudi tables with Spark (#4915) wenningd 2022-07-21 02:42:17 -07:00
  • de37774e12 [HUDI-3896] Porting Nested Schema Pruning optimization for Hudi's custom Relations (#5428) Alexey Kudinkin 2022-07-21 02:36:06 -07:00
  • 2394c62973 [HUDI-4146][RFC-55] Update config changes proposal (#6162) Shiyan Xu 2022-07-21 02:25:02 -05:00
  • 348519f3cd [HUDI-4427] Add a computed column IT test (#6150) Danny Chan 2022-07-21 09:38:26 +08:00
  • 473be87aa5 Disable EmrFS file metadata caching and EMR Spark's data prefetcher feature Rahil Chertara 2022-07-20 17:04:00 -07:00
  • 2b828ccb98 [HUDI-4401] Skip HBase version check (#6114) Y Ethan Guo 2022-07-20 14:09:45 -07:00
  • e3675fe9b0 [HUDI-4372] Enable matadata table by default for flink (#6066) Danny Chan 2022-07-20 16:10:19 +08:00
  • 6c3578069e [HUDI-4416] Default database path for hoodie hive catalog (#6136) Danny Chan 2022-07-19 15:38:47 +08:00
  • 382d19e85b [HUDI-4065] Add FileBasedLockProvider (#6071) 冯健 2022-07-19 07:52:47 +08:00
  • 1959b843b7 [HUDI-4409] Improve LockManager wait logic when catch exception (#6122) liujinhui 2022-07-18 22:45:52 +08:00
  • 9282611bae [HUDI-4098] Support HMS for flink HudiCatalog (#6082) Bo Cui 2022-07-18 11:46:23 +08:00
  • 3964c476e0 Fix file group count issue with metadata partitions (#5892) Sivabalan Narayanan 2022-07-17 18:49:29 -07:00
  • ded197800a [HUDI-4170] Make user can use hoodie.datasource.read.paths to read necessary files (#5722) RexAn 2022-07-17 16:11:45 +08:00
  • 4bda6afe0b [HUDI-4249] Fixing in-memory HoodieData implementation to operate lazily (#5855) Alexey Kudinkin 2022-07-16 16:26:48 -07:00
  • 80368a049d [HUDI-3503] Add call procedure for CleanCommand (#6065) simonsssu 2022-07-16 22:33:26 +08:00
  • 6aec9d754f [HUDI-4408] Reuse old rollover file as base file for flink merge handle (#6120) Danny Chan 2022-07-16 20:46:23 +08:00
  • 0faa562b6f [HUDI-4403] Fix the end input metadata for bounded source (#6116) Danny Chan 2022-07-16 12:02:17 +08:00
  • 726e8e3590 [MINOR] Disable TestHiveSyncGlobalCommitTool (#6119) Shiyan Xu 2022-07-15 12:23:21 -05:00
  • b781b31045 [HUDI-4397] Flink Inline Cluster and Compact plan distribute strategy changed from rebalance to hash to avoid potential multiple threads accessing the same file (#6106) JerryYue-M 2022-07-15 12:21:50 +08:00
  • 4898ea52f7 [HUDI-4399][RFC-57] Claim RFC 57 for DeltaStreamer proto support (#6112) Tim Brown 2022-07-14 18:11:45 -07:00
  • 05606708fa [HUDI-4393] Add marker file for target file when flink merge handle rolls over (#6103) Danny Chan 2022-07-14 16:00:08 +08:00
  • aaccc63ad5 [RFC-51] [HUDI-3478] Hudi to support Change-Data-Capture (#5436) Yann Byron 2022-07-14 15:36:26 +08:00
  • e70a427956 [HUDI-4391] Incremental read from archived commits for flink (#6096) Danny Chan 2022-07-14 15:19:26 +08:00
  • ee956b8951 [HUDI-4379] Bump Flink versions to 1.14.5 and 1.15.1 (#6080) Luning (Lucas) Wang 2022-07-12 15:03:24 +08:00
  • 994c561488 [HUDI-4298] When reading the mor table with QUERY_TYPE_SNAPSHOT,Unabl… (#5937) HunterXHunter 2022-07-12 14:49:44 +08:00
  • a270eeeef9 [MINOR] Update RFCs status (#6078) Sagar Sumit 2022-07-11 13:04:25 +05:30
  • 51244eba82 [HUDI-4323] Make database table names optional in sync tool (#6073) Shiyan Xu 2022-07-10 23:33:31 -05:00
  • 63f95ab801 [HUDI-3730][RFC-55] Improve hudi-sync classes design and simplify configs (#5695) 冯健 2022-07-10 14:12:34 +08:00
  • 046044c83d [HUDI-4324] Remove use_jdbc config from hudi sync (#6072) Shiyan Xu 2022-07-10 00:46:09 -05:00
  • 10aec07fd2 [MINOR] Bump xalan from 2.7.1 to 2.7.2 (#6062) dependabot[bot] 2022-07-09 20:02:36 +05:30
  • 126b88b48d [HUDI-2150] Rename/Restructure configs for better modularity (#6061) liujinhui 2022-07-09 22:30:48 +08:00
  • 6566fc6625 [HUDI-3500] Add call procedure for RepairsCommand (#6053) superche 2022-07-09 09:29:14 +08:00
  • b686c07407 [HUDI-4276] Reconcile schema-inject null values for missing fields and add new fields (#6017) xiarixiaoyao 2022-07-09 03:08:38 +08:00
  • fc8d96246a [HUDI-4335] Bug fixes in AWSGlueCatalogSyncClient post schema evolution. (#5995) Kumud Kumar Srivatsava Tirupati 2022-07-08 20:17:49 +05:30
  • f20acb8dc3 [HUDI-4367] Support copyToTable on call (#6054) 苏承祥 2022-07-08 15:08:11 +08:00
  • a998586396 [minor] following 4152, refactor the clazz about plan selection strategy (#6060) Danny Chan 2022-07-08 09:56:10 +08:00