lanyuanxiaoyao/hudi - hudi - Gitea: Git with a cup of tea

Author	SHA1	Message	Date
Alexey Kudinkin	c05a4e7b6f	[HUDI-3934] Fix `Spark32HoodieParquetFileFormat` not being compatible w/ Spark 3.2.0 (#5378 ) - Because Spark 3.2.1 is not backward-compatible with 3.2.0, all of these incompatibilities have to be handled in Spark32HoodieParquetFileFormat. This PR addresses that. Co-authored-by: Raymond Xu <2701446+xushiyan@users.noreply.github.com>	2022-04-21 21:00:38 -04:00
Y Ethan Guo	c4bc2deea0	[HUDI-3936] Fix projection for a nested field as pre-combined key (#5379 ) This PR fixes the projection logic around a nested field which is used as the pre-combined key field. The fix is to only check and append the root level field for projection, i.e., "a", for a nested field "a.b.c" in the mandatory columns. - Changes the logic to check and append the root level field for a required nested field in the mandatory columns in HoodieBaseRelation.appendMandatoryColumns	2022-04-21 20:17:57 -04:00
xiarixiaoyao	037f89ee7c	[HUDI-3921] Fixed schema evolution cannot work with HUDI-3855 (#5376 ) - When column names are renamed (schema evolution enabled), the renamed columns weren't handled correctly while copying records from the old data file with HoodieMergeHandle.	2022-04-21 18:27:54 -04:00
Sagar Sumit	de5fa1fe03	[HUDI-3940] Fix retry count increment in lock manager (#5387 )	2022-04-21 16:52:05 -04:00
Raymond Xu	4e1ac467da	[MINOR] Increase azure CI timeout to 120m (#5384 )	2022-04-21 04:35:44 -07:00
Alexey Kudinkin	4b296f79cc	[HUDI-3935] Adding config to fallback to enabled Partition Values extraction from Partition path (#5377 )	2022-04-21 01:36:19 -07:00
Sivabalan Narayanan	a9506aa545	[HUDI-3938] Fix default value for num retries to acquire lock (#5380 )	2022-04-21 01:08:43 -07:00
Alexey Kudinkin	f7544e23ac	[HUDI-3204] Fixing partition-values being derived from partition-path instead of source columns (#5364 ) - Scaffolded `Spark24HoodieParquetFileFormat` extending `ParquetFileFormat` and overriding the behavior of adding partition columns to every row - Amended `SparkAdapter`s `createHoodieParquetFileFormat` API to be able to configure whether to append partition values or not - Fallback to append partition values in cases when the source columns are not persisted in data-file - Fixing HoodieBaseRelation incorrectly handling mandatory columns	2022-04-20 19:30:27 +08:00
吴祥平	408663c42b	[HUDI-3912] Fix lose data when rollback in flink async compact (#5357 ) * Stop adding events when there is a failed compaction event Co-authored-by: wxp <wxp4532@outlook.com>	2022-04-20 19:23:39 +08:00
Zhaojing Yu	6a3ce928b1	[HUDI-3904] Claim RFC number for Improve timeline server (#5354 )	2022-04-19 23:31:21 -07:00
Danny Chan	7a9e411e9d	[HUDI-3917] Flink write task hangs if last checkpoint has no data input (#5360 )	2022-04-20 12:48:24 +08:00
Y Ethan Guo	28fdddfee0	[HUDI-3920] Fix partition path construction in metadata table validator (#5365 )	2022-04-19 19:40:09 -04:00
Y Ethan Guo	6f3fe880d2	[HUDI-3905] Add S3 related setup in Kafka Connect quick start (#5356 )	2022-04-19 15:08:28 -07:00
Alexey Kudinkin	81bf771e56	[HUDI-3902] Fallback to `HadoopFsRelation` in cases non-involving Schema Evolution (#5352 ) Co-authored-by: Raymond Xu <2701446+xushiyan@users.noreply.github.com>	2022-04-19 10:40:20 -07:00
Raymond Xu	9af7b09aec	[HUDI-3894] Fix gcp bundle to include HBase dependencies and shading (#5349 )	2022-04-18 21:47:10 -07:00
Sagar Sumit	4f44e6aeb5	[HUDI-3899] Drop index to delete pending index instants from timeline if applicable (#5342 ) Co-authored-by: sivabalan <n.siva.b@gmail.com>	2022-04-18 22:28:46 -04:00
Y Ethan Guo	52d878c52b	[HUDI-3903] Fix NoClassDefFoundError with Kafka Connect bundle (#5353 )	2022-04-18 21:17:53 -04:00
Y Ethan Guo	ef6c5611dc	[HUDI-3894] Fix datahub to include HBase dependencies and shading (#5338 ) Co-authored-by: Raymond Xu <2701446+xushiyan@users.noreply.github.com>	2022-04-18 16:20:50 -07:00
Alexey Kudinkin	7ecb47cd21	[HUDI-3895] Fixing file-partitioning seq for base-file only views to make sure we bucket the files efficiently (#5337 )	2022-04-18 16:06:52 -04:00
Sagar Sumit	1718bcab84	[HUDI-3707] Fix target schema handling in HoodieSparkUtils while creating RDD (#5347 )	2022-04-18 13:34:04 -04:00
Sivabalan Narayanan	b00d03fd62	[HUDI-3886] Adding default null for some of the fields in col stats in MDT schema (#5329 )	2022-04-18 10:37:03 -04:00
Sivabalan Narayanan	05dfc39c29	Fixing async clustering job test in TestHoodieDeltaStreamer (#5317 )	2022-04-18 17:38:33 +05:30
董可伦	b8e465fdfc	[MINOR] Fix typos in log4j-surefire.properties (#5212 )	2022-04-15 13:33:37 -07:00
董可伦	99dd1cb6e6	[HUDI-3835] Add UT for delete in java client (#5270 )	2022-04-15 15:03:48 -04:00
Sivabalan Narayanan	e8ab915aff	[MINOR] Removing invalid code to close parquet reader iterator (#5182 )	2022-04-15 14:50:07 -04:00
Sivabalan Narayanan	57612c5c32	[HUDI-3848] Fixing restore with cleaned up commits (#5288 )	2022-04-15 14:47:53 -04:00
Raymond Xu	9e8664f4d2	[HOTFIX] add missing license (#5322 ) (#5324 )	2022-04-14 12:35:20 -07:00
Raymond Xu	d6a64f765e	Revert "[HUDI-3652] Make ObjectSizeCalculator threadlocal to reduce memory footprint (#5060 )" (#5323 ) This reverts commit `f0ab4a6e9e`.	2022-04-14 12:28:27 -07:00
sekaiga	f0ab4a6e9e	[HUDI-3652] Make ObjectSizeCalculator threadlocal to reduce memory footprint (#5060 ) Co-authored-by: zhouhuidong <zhouhuidong@bilibili.co>	2022-04-14 03:08:14 -07:00
ForwardXu	6621f3cdbb	[HUDI-3845] Fix delete mor table's partition with urlencode's error (#5282 )	2022-04-14 01:49:00 -07:00
ForwardXu	44b3630b5d	[HUDI-3826] Make truncate partition use delete_partition operation (#5272 ) Make truncate partition and drop partition behave as drop partition with purge, which deletes all records via Hudi DELETE_PARTITION; the partition is removed from the metastore	2022-04-14 00:53:05 -07:00
Sivabalan Narayanan	a081c2b9b5	[HUDI-3876] Fixing fetching partitions in GlueSyncClient (#5318 )	2022-04-13 21:03:05 -07:00
Y Ethan Guo	571cbe4c11	[MINOR] Code cleanup in test utils (#5312 )	2022-04-13 17:37:07 -04:00
Y Ethan Guo	bab691692e	[HUDI-3686] Fix inline and async table service check in HoodieWriteConfig (#5307 )	2022-04-13 17:33:26 -04:00
Y Ethan Guo	c7f41f9018	[HUDI-3869] Improve error handling of loading Hudi conf (#5311 )	2022-04-13 17:25:31 -04:00
Danny Chan	6f9b02decb	[HUDI-3870] Add timeout rollback for flink online compaction (#5314 )	2022-04-13 20:05:48 +08:00
Danny Chan	0281725c6b	[MINOR] Inline the partition path logic into the builder (#5310 )	2022-04-13 16:54:39 +05:30
Danny Chan	43de2b4702	[HUDI-3868] Disable the sort input for flink streaming append mode (#5309 )	2022-04-13 14:21:08 +08:00
Alexey Kudinkin	434e782b7d	[HUDI-3867] Disable Data Skipping by default (#5306 )	2022-04-13 11:21:12 +05:30
Alexey Kudinkin	7b78dff45f	[HUDI-3855] Fixing `FILENAME_METADATA_FIELD` not being correctly updated in `HoodieMergeHandle` (#5296 ) Fixing FILENAME_METADATA_FIELD not being correctly updated in HoodieMergeHandle, in cases when old-record is carried over from existing file as is. - Revisited HoodieFileWriter API to accept HoodieKey instead of HoodieRecord - Fixed FILENAME_METADATA_FIELD not being overridden in cases when simply old record is carried over - Exposing standard JVM's debugger ports in Docker setup	2022-04-12 20:42:15 -04:00
Raymond Xu	2e6e302efe	[HUDI-3859] Fix spark profiles and utilities-slim dep (#5297 )	2022-04-12 15:33:08 -07:00
Vinoth Govindarajan	2d46d5287e	[HUDI-3838] Moved the getPartitionColumns logic to driver. (#5303 )	2022-04-12 18:03:00 -04:00
satishm	25dce94ba2	[MINOR] Integ Test Reducing partitions for log running multi partition yaml (#5300 )	2022-04-12 12:15:17 -04:00
Raymond Xu	84783b9779	[HUDI-3843] Make flink profiles build with scala-2.11 (#5279 )	2022-04-12 08:33:48 -07:00
Vinoth Govindarajan	d16740976e	[HUDI-3838] Implemented drop partition column feature for delta streamer code path (#5294 ) * [HUDI-3838] Implemented drop partition column feature for delta streamer code path * Ensure drop partition table config is updated in hoodie.props Co-authored-by: Sagar Sumit <sagarsumit09@gmail.com>	2022-04-12 18:10:30 +05:30
Alexey Kudinkin	101b82a679	[HUDI-3839] Fixing incorrect selection of MT partitions to be updated (#5274 ) * Fixing incorrect selection of MT partitions to be updated * Ensure that metadata partitions table config is inherited correctly Co-authored-by: Sagar Sumit <sagarsumit09@gmail.com>	2022-04-12 13:37:52 +05:30
Sivabalan Narayanan	f91e9e63e1	[HUDI-3799] Fixing not deleting empty instants w/o archiving (#5261 )	2022-04-11 21:02:43 -07:00
Sagar Sumit	3d8fc78c66	[HUDI-3844] Update props in indexer based on table config (#5293 )	2022-04-11 18:16:06 -04:00
Alexey Kudinkin	458fdd5611	[HUDI-3841] Fixing Column Stats in the presence of Schema Evolution (#5275 ) Currently, Data Skipping does not correctly handle the case when column stats are not aligned and, for example, some of the (column, file) combinations are missing from the CSI. This can occur in different scenarios (schema evolution, CSI config changes), and has to be handled properly when composing the CSI projection for Data Skipping. This PR addresses that. - Added appropriate aligning for the transposed CSI projection	2022-04-11 15:45:53 -04:00
Sivabalan Narayanan	52ea1e4964	[MINOR] fixing timeline server for integ tests (#5289 )	2022-04-11 10:14:51 -04:00

2793 Commits