hehuiyuan
bda3db078e
Support general parameter 'sink.parallelism' for flink-hudi ( #5405 )
...
Co-authored-by: hehuiyuan1 <hehuiyuan@jd.com>
2022-04-24 19:09:39 +08:00
miomiocat
5e5c177e4b
[HUDI-3923] Fix cast exception while reading boolean type of partitioned field ( #5373 )
2022-04-23 20:12:54 +08:00
Y Ethan Guo
8633bd6e06
[HUDI-3948] Fix presto bundle missing HBase classes ( #5398 )
2022-04-23 01:33:55 -07:00
Raymond Xu
505ee672ac
[HUDI-3950] add parquet-avro to gcp-bundle ( #5399 )
2022-04-23 11:59:49 +08:00
Sivabalan Narayanan
7523542c1d
[HUDI-3947] Fixing Hive conf usage in HoodieSparkSqlWriter ( #5401 )
2022-04-22 22:20:05 -04:00
Y Ethan Guo
20781a5fa6
[DOCS] Add commit activity, Twitter badges, and Hudi logo in README ( #5336 )
2022-04-22 16:51:07 +08:00
Alexey Kudinkin
c05a4e7b6f
[HUDI-3934] Fix Spark32HoodieParquetFileFormat not being compatible w/ Spark 3.2.0 ( #5378 )
...
- Spark 3.2.1 is not backward-compatible with 3.2.0, so all of these incompatibilities have to be handled in Spark32HoodieParquetFileFormat. This PR addresses that.
Co-authored-by: Raymond Xu <2701446+xushiyan@users.noreply.github.com>
2022-04-21 21:00:38 -04:00
Y Ethan Guo
c4bc2deea0
[HUDI-3936] Fix projection for a nested field as pre-combined key ( #5379 )
...
This PR fixes the projection logic around a nested field which is used as the pre-combined key field. The fix is to only check and append the root level field for projection, i.e., "a", for a nested field "a.b.c" in the mandatory columns.
- Changes the logic to check and append the root level field for a required nested field in the mandatory columns in HoodieBaseRelation.appendMandatoryColumns
2022-04-21 20:17:57 -04:00
xiarixiaoyao
037f89ee7c
[HUDI-3921] Fix schema evolution not working with HUDI-3855 ( #5376 )
...
- When column names are renamed (schema evolution enabled), the renamed columns were not handled correctly while copying records from the old data file with HoodieMergeHandle.
2022-04-21 18:27:54 -04:00
Sagar Sumit
de5fa1fe03
[HUDI-3940] Fix retry count increment in lock manager ( #5387 )
2022-04-21 16:52:05 -04:00
Raymond Xu
4e1ac467da
[MINOR] Increase azure CI timeout to 120m ( #5384 )
2022-04-21 04:35:44 -07:00
Alexey Kudinkin
4b296f79cc
[HUDI-3935] Adding config to fall back to enabling partition-values extraction from the partition path ( #5377 )
2022-04-21 01:36:19 -07:00
Sivabalan Narayanan
a9506aa545
[HUDI-3938] Fix default value for num retries to acquire lock ( #5380 )
2022-04-21 01:08:43 -07:00
Alexey Kudinkin
f7544e23ac
[HUDI-3204] Fixing partition-values being derived from partition-path instead of source columns ( #5364 )
...
- Scaffolded `Spark24HoodieParquetFileFormat` extending `ParquetFileFormat` and overriding the behavior of adding partition columns to every row
- Amended `SparkAdapter`'s `createHoodieParquetFileFormat` API to be able to configure whether to append partition values or not
- Fall back to appending partition values in cases when the source columns are not persisted in the data file
- Fixing HoodieBaseRelation incorrectly handling mandatory columns
2022-04-20 19:30:27 +08:00
吴祥平
408663c42b
[HUDI-3912] Fix data loss on rollback in Flink async compaction ( #5357 )
...
* Stop adding events when there is a failed compaction event
Co-authored-by: wxp <wxp4532@outlook.com>
2022-04-20 19:23:39 +08:00
Zhaojing Yu
6a3ce928b1
[HUDI-3904] Claim RFC number for Improve timeline server ( #5354 )
2022-04-19 23:31:21 -07:00
Danny Chan
7a9e411e9d
[HUDI-3917] Flink write task hangs if last checkpoint has no data input ( #5360 )
2022-04-20 12:48:24 +08:00
Y Ethan Guo
28fdddfee0
[HUDI-3920] Fix partition path construction in metadata table validator ( #5365 )
2022-04-19 19:40:09 -04:00
Y Ethan Guo
6f3fe880d2
[HUDI-3905] Add S3 related setup in Kafka Connect quick start ( #5356 )
2022-04-19 15:08:28 -07:00
Alexey Kudinkin
81bf771e56
[HUDI-3902] Fall back to HadoopFsRelation in cases not involving schema evolution ( #5352 )
...
Co-authored-by: Raymond Xu <2701446+xushiyan@users.noreply.github.com>
2022-04-19 10:40:20 -07:00
Raymond Xu
9af7b09aec
[HUDI-3894] Fix gcp bundle to include HBase dependencies and shading ( #5349 )
2022-04-18 21:47:10 -07:00
Sagar Sumit
4f44e6aeb5
[HUDI-3899] Drop index to delete pending index instants from timeline if applicable ( #5342 )
...
Co-authored-by: sivabalan <n.siva.b@gmail.com>
2022-04-18 22:28:46 -04:00
Y Ethan Guo
52d878c52b
[HUDI-3903] Fix NoClassDefFoundError with Kafka Connect bundle ( #5353 )
2022-04-18 21:17:53 -04:00
Y Ethan Guo
ef6c5611dc
[HUDI-3894] Fix datahub to include HBase dependencies and shading ( #5338 )
...
Co-authored-by: Raymond Xu <2701446+xushiyan@users.noreply.github.com>
2022-04-18 16:20:50 -07:00
Alexey Kudinkin
7ecb47cd21
[HUDI-3895] Fixing file-partitioning seq for base-file only views to make sure we bucket the files efficiently ( #5337 )
2022-04-18 16:06:52 -04:00
Sagar Sumit
1718bcab84
[HUDI-3707] Fix target schema handling in HoodieSparkUtils while creating RDD ( #5347 )
2022-04-18 13:34:04 -04:00
Sivabalan Narayanan
b00d03fd62
[HUDI-3886] Adding default null for some of the fields in col stats in MDT schema ( #5329 )
2022-04-18 10:37:03 -04:00
Sivabalan Narayanan
05dfc39c29
Fixing async clustering job test in TestHoodieDeltaStreamer ( #5317 )
2022-04-18 17:38:33 +05:30
董可伦
b8e465fdfc
[MINOR] Fix typos in log4j-surefire.properties ( #5212 )
2022-04-15 13:33:37 -07:00
董可伦
99dd1cb6e6
[HUDI-3835] Add UT for delete in java client ( #5270 )
2022-04-15 15:03:48 -04:00
Sivabalan Narayanan
e8ab915aff
[MINOR] Removing invalid code to close parquet reader iterator ( #5182 )
2022-04-15 14:50:07 -04:00
Sivabalan Narayanan
57612c5c32
[HUDI-3848] Fixing restore with cleaned up commits ( #5288 )
2022-04-15 14:47:53 -04:00
Raymond Xu
9e8664f4d2
[HOTFIX] add missing license ( #5322 ) ( #5324 )
2022-04-14 12:35:20 -07:00
Raymond Xu
d6a64f765e
Revert "[HUDI-3652] Make ObjectSizeCalculator threadlocal to reduce memory footprint ( #5060 )" ( #5323 )
...
This reverts commit f0ab4a6e9e.
2022-04-14 12:28:27 -07:00
sekaiga
f0ab4a6e9e
[HUDI-3652] Make ObjectSizeCalculator threadlocal to reduce memory footprint ( #5060 )
...
Co-authored-by: zhouhuidong <zhouhuidong@bilibili.co>
2022-04-14 03:08:14 -07:00
ForwardXu
6621f3cdbb
[HUDI-3845] Fix URL-encoding error when deleting a MOR table's partition ( #5282 )
2022-04-14 01:49:00 -07:00
ForwardXu
44b3630b5d
[HUDI-3826] Make truncate partition use delete_partition operation ( #5272 )
...
Make truncate partition and drop partition behave as drop partition with purge, which deletes all records via Hudi DELETE_PARTITION; the partition is also removed from the metastore.
2022-04-14 00:53:05 -07:00
Sivabalan Narayanan
a081c2b9b5
[HUDI-3876] Fixing fetching partitions in GlueSyncClient ( #5318 )
2022-04-13 21:03:05 -07:00
Y Ethan Guo
571cbe4c11
[MINOR] Code cleanup in test utils ( #5312 )
2022-04-13 17:37:07 -04:00
Y Ethan Guo
bab691692e
[HUDI-3686] Fix inline and async table service check in HoodieWriteConfig ( #5307 )
2022-04-13 17:33:26 -04:00
Y Ethan Guo
c7f41f9018
[HUDI-3869] Improve error handling of loading Hudi conf ( #5311 )
2022-04-13 17:25:31 -04:00
Danny Chan
6f9b02decb
[HUDI-3870] Add timeout rollback for flink online compaction ( #5314 )
2022-04-13 20:05:48 +08:00
Danny Chan
0281725c6b
[MINOR] Inline the partition path logic into the builder ( #5310 )
2022-04-13 16:54:39 +05:30
Danny Chan
43de2b4702
[HUDI-3868] Disable the sort input for flink streaming append mode ( #5309 )
2022-04-13 14:21:08 +08:00
Alexey Kudinkin
434e782b7d
[HUDI-3867] Disable Data Skipping by default ( #5306 )
2022-04-13 11:21:12 +05:30
Alexey Kudinkin
7b78dff45f
[HUDI-3855] Fixing FILENAME_METADATA_FIELD not being correctly updated in HoodieMergeHandle ( #5296 )
...
Fixing FILENAME_METADATA_FIELD not being correctly updated in HoodieMergeHandle, in cases when old-record is carried over from existing file as is.
- Revisited HoodieFileWriter API to accept HoodieKey instead of HoodieRecord
- Fixed FILENAME_METADATA_FIELD not being overridden in cases when simply old record is carried over
- Exposing standard JVM's debugger ports in Docker setup
2022-04-12 20:42:15 -04:00
Raymond Xu
2e6e302efe
[HUDI-3859] Fix spark profiles and utilities-slim dep ( #5297 )
2022-04-12 15:33:08 -07:00
Vinoth Govindarajan
2d46d5287e
[HUDI-3838] Moved the getPartitionColumns logic to driver. ( #5303 )
2022-04-12 18:03:00 -04:00
satishm
25dce94ba2
[MINOR] Integ Test: Reducing partitions for long running multi partition yaml ( #5300 )
2022-04-12 12:15:17 -04:00
Raymond Xu
84783b9779
[HUDI-3843] Make flink profiles build with scala-2.11 ( #5279 )
2022-04-12 08:33:48 -07:00