v-zhangjc9
32f7e323dc
Change version to private
2024-05-24 15:16:38 +08:00
Shiyan Xu
717f159bfd
[HUDI-3730] Keep metasync configs backward compatible ( #6221 )
2022-07-27 16:00:44 +05:30
Shiyan Xu
eee6a02f77
[HUDI-4456] Clean up test resources ( #6203 )
2022-07-25 10:13:06 -05:00
Shiyan Xu
71c2c3102b
[HUDI-4455] Improve test classes for TestHiveSyncTool ( #6202 )
Improve HiveTestService, HiveTestUtil, and related classes.
2022-07-25 19:05:34 +05:30
冯健
340c3dbbe1
[HUDI-4437] Fix test conflicts by clearing file system cache ( #6123 )
Co-authored-by: jian.feng <fengjian428@gmial.com >
Co-authored-by: jian.feng <jian.feng@shopee.com >
Co-authored-by: Raymond Xu <2701446+xushiyan@users.noreply.github.com >
2022-07-22 17:58:04 -07:00
Shiyan Xu
d5c7c79d87
Revert "[HUDI-4324] Remove use_jdbc config from hudi sync ( #6072 )" ( #6160 )
This reverts commit 046044c83d.
2022-07-22 17:18:45 -07:00
Shiyan Xu
6b84384022
Revert "[MINOR] Fix CI issue with TestHiveSyncTool ( #6110 )" ( #6192 )
This reverts commit d5c904e10e.
2022-07-22 12:20:39 -07:00
Shiyan Xu
d5c904e10e
[MINOR] Fix CI issue with TestHiveSyncTool ( #6110 )
2022-07-22 10:30:00 -05:00
Shiyan Xu
726e8e3590
[MINOR] Disable TestHiveSyncGlobalCommitTool ( #6119 )
2022-07-15 10:23:21 -07:00
Shiyan Xu
046044c83d
[HUDI-4324] Remove use_jdbc config from hudi sync ( #6072 )
* [HUDI-4324] Remove use_jdbc config from hudi sync
* Users should use HIVE_SYNC_MODE instead
2022-07-10 11:16:09 +05:30
Shiyan Xu
c0e1587966
[HUDI-3730] Improve meta sync class design and hierarchies ( #5854 )
* [HUDI-3730] Improve meta sync class design and hierarchies (#5754)
* Implements class design proposed in RFC-55
Co-authored-by: jian.feng <fengjian428@gmial.com >
Co-authored-by: jian.feng <jian.feng@shopee.com >
2022-07-03 14:47:25 +05:30
bschell
fd7d25ab63
[HUDI-1176] Upgrade hudi to log4j2 ( #5366 )
* Move to log4j2
cr: https://code.amazon.com/reviews/CR-71010705
* Upgrade unit tests to log4j2
* update exclusion
Co-authored-by: Brandon Scheller <bschelle@amazon.com >
2022-06-28 12:54:23 -07:00
Raymond Xu
1349b596a1
[HUDI-4198] Fix hive config for AWSGlueClientFactory ( #5768 )
* HiveConf needs to load fs conf to allow instantiation via AWSGlueClientFactory
* Resolve metastore uri config before loading fs conf
* Skip hiveql due to CI issue
Co-authored-by: Sagar Sumit <sagarsumit09@gmail.com >
2022-06-07 20:21:31 +05:30
Heap
47b764ec33
[HUDI-4134] Fix Method naming consistency issues in FSUtils ( #5655 )
2022-05-23 15:28:48 -07:00
felixYyu
716e995a38
[MINOR] Removing redundant semicolons and line breaks ( #5662 )
2022-05-23 15:26:36 -07:00
huberylee
85b146d3d5
[HUDI-3985] Refactor DLASyncTool to support read hoodie table as spark datasource table ( #5532 )
2022-05-20 22:25:32 +08:00
董可伦
b8e465fdfc
[MINOR] Fix typos in log4j-surefire.properties ( #5212 )
2022-04-15 13:33:37 -07:00
Raymond Xu
e96f08f355
Moving to 0.12.0-SNAPSHOT on master branch.
2022-04-06 15:24:10 +08:00
ForwardXu
3449e86989
[HUDI-3780] improve drop partitions ( #5178 )
2022-04-05 11:52:33 +08:00
todd5167
eef3f9c74a
[HUDI-3771] Flink supports syncing table information to AWS Glue ( #5202 )
2022-04-02 21:16:10 +08:00
ForwardXu
0802510ca9
[HUDI-2520] Fix drop partition issue when sync to hive ( #5147 )
2022-03-29 11:28:19 -07:00
Raymond Xu
6ccbae4d2a
[HUDI-2757] Implement Hudi AWS Glue sync ( #5076 )
2022-03-28 14:54:59 -04:00
Raymond Xu
686da41696
[HUDI-3689] Fix UT failures in TestHoodieDeltaStreamer ( #5120 )
2022-03-24 09:10:33 -07:00
Rajesh Mahindra
5f570ea151
[HUDI-2883] Refactor hive sync tool / config to use reflection and standardize configs ( #4175 )
- Refactor hive sync tool / config to use reflection and standardize configs
Co-authored-by: sivabalan <n.siva.b@gmail.com >
Co-authored-by: Rajesh Mahindra <rmahindra@Rajeshs-MacBook-Pro.local >
Co-authored-by: Raymond Xu <2701446+xushiyan@users.noreply.github.com >
2022-03-21 22:56:31 -04:00
MrSleeping123
8859b48b2a
[HUDI-3383] Sync column comments while syncing a hive table ( #4960 )
Desc: Add a hive sync config (hoodie.datasource.hive_sync.sync_comment). This config defaults to false.
When column comments are added to the source avro schema while syncing the data source to hudi, and sync_comment is true, the column comments are synced to the hive table.
2022-03-10 09:44:39 +08:00
Yann Byron
2fe7a3a41f
[HUDI-2610] pass the spark version when sync the table created by spark ( #4758 )
* [HUDI-2610] pass the spark version when sync the table created by spark
* [MINOR] sync spark version in DataSourceUtils#buildHiveSyncConfig
2022-02-10 21:05:28 +05:30
ehui
538db185ca
[HUDI-2491] Expose HMS mode metastore uri config option for spark writer ( #3962 )
2022-02-07 18:13:51 +05:30
Alexey Kudinkin
a68e1dc2db
[HUDI-431] Adding support for Parquet in MOR LogBlocks ( #4333 )
- Adding support for Parquet in MOR tables Log blocks
Co-authored-by: Sivabalan Narayanan <n.siva.b@gmail.com >
2022-02-02 14:35:05 -05:00
董可伦
822230d9ea
[MINOR] Optimize variable names and logs ( #4581 )
2022-01-16 16:09:22 +08:00
Sagar Sumit
12e95771ee
[HUDI-3235] Fix ClassNotFoundException due to log4j-core dependency ( #4574 )
- Move log4j-core to top level pom
2022-01-12 11:53:43 -05:00
董可伦
017ddbbfac
[MINOR] Fix typos ( #4567 )
2022-01-11 23:17:10 -08:00
Pratyaksh Sharma
a392e9ba46
[HUDI-485] Corrected the check for incremental sql ( #2768 )
* [HUDI-485]: corrected the check for incremental sql
* [HUDI-485]: added tests
* code review comments addressed
* [HUDI-485]: added happy flow test case
2022-01-12 08:22:07 +05:30
YueZhang
cf362fb2d5
[MINOR] Fix some code style issues based on check-style plugin ( #4532 )
Co-authored-by: yuezhang <yuezhang@freewheel.tv >
2022-01-09 01:14:56 -08:00
董可伦
4f6cdd73a3
[HUDI-3192] Spark metastore schema evolution broken ( #4533 )
2022-01-08 10:48:37 +08:00
董可伦
b1df60672b
[MINOR] fix typos in DDLExecutor ( #4534 )
2022-01-07 07:59:55 -05:00
Danny Chan
0e297c0c4c
[HUDI-3171] Sync empty table to hive metastore ( #4511 )
2022-01-05 16:41:33 +08:00
YueZhang
1e2d2c437d
[HUDI-3138] Fix broken UT test for TestHiveSyncTool.testDropPartitions ( #4493 )
Co-authored-by: yuezhang <yuezhang@freewheel.tv >
2022-01-02 22:43:30 -05:00
YueZhang
ef9923fc55
[HUDI-3107] Fix HiveSyncTool drop partitions using JDBC or hivesql or hms ( #4453 )
* constructDropPartitions when drop partitions using jdbc
* done
* done
* code style
* code review
Co-authored-by: yuezhang <yuezhang@freewheel.tv >
2021-12-31 15:56:33 +08:00
Shawy Geng
a4e622ac61
[HUDI-1951] Add bucket hash index, compatible with the hive bucket ( #3173 )
* [HUDI-2154] Add index key field to HoodieKey
* [HUDI-2157] Add the bucket index and its read/write implementation of Spark engine.
* revert HUDI-2154 add index key field to HoodieKey
* fix all comments and introduce a new tricky way to get index key at runtime
support double insert for bucket index
* revert spark read optimizer based on bucket index
* add the storage layout
* index tag, hash function and add ut
* fix ut
* address partial comments
* Code review feedback
* add layout config and docs
* fix ut
* rename hoodie.layout and rebase master
Co-authored-by: Vinoth Chandar <vinoth@apache.org >
2021-12-30 12:38:26 -08:00
Udit Mehrotra
9412281cb1
[HUDI-2983] Remove Log4j2 transitive dependencies ( #4281 )
2021-12-28 07:15:05 -08:00
ForwardXu
32505d5adb
[HUDI-3106] Fix HiveSyncTool not sync schema ( #4452 )
2021-12-27 22:11:14 -08:00
ForwardXu
dd96129191
[HUDI-2990] Sync to HMS when deleting partitions ( #4291 )
2021-12-13 20:40:06 +08:00
fengli
568181a3e7
[HUDI-2934] Optimize RequestHandler code style
close apache/hudi#4215
2021-12-04 15:30:52 +08:00
yuzhao.cyz
a1d0ff4209
Moving to 0.11.0-SNAPSHOT on master branch.
2021-11-27 17:22:10 +08:00
Nate Radtke
887787e8b9
[HUDI-1932] Update Hive sync timestamp when change detected ( #3053 )
* Update Hive sync timestamp when change detected
Only update the last commit timestamp on the Hive table when the table schema
has changed or a partition is created/updated.
When using AWS Glue Data Catalog as the metastore for Hive this will ensure
that table versions are substantive (including schema and/or partition
changes). Prior to this change when a Hive sync is performed without schema
or partition changes the table in the Glue Data Catalog would have a new
version published with the only change being the timestamp property.
https://issues.apache.org/jira/browse/HUDI-1932
* add conditional sync flag
* fix testSyncWithoutDiffs
* fix HiveSyncConfig
Co-authored-by: Raymond Xu <2701446+xushiyan@users.noreply.github.com >
2021-11-21 12:11:05 +05:30
xiarixiaoyao
acc40625f5
[HUDI-2676] Hudi should synchronize owner information to hudi _rt/_ro table. ( #3911 )
2021-11-03 20:36:01 +08:00
Yann Byron
1f17467f73
[HUDI-1869] Upgrading Spark3 To 3.1 ( #3844 )
Co-authored-by: pengzhiwei <pengzhiwei2015@icloud.com >
2021-11-02 18:25:12 -07:00
vinoyang
b1c4acf0ae
[HUDI-2614] Remove duplicated hadoop-hdfs with tests classifier that exists in bundles ( #3864 )
2021-10-26 22:36:10 +08:00
vinoyang
220bf6a7e6
[HUDI-2600] Remove duplicated hadoop-common with tests classifier that exists in bundles ( #3847 )
2021-10-25 13:45:28 +08:00
董可伦
48a3906ccc
[MINOR] Fix typo, 'paritition' corrected to 'partition' ( #3764 )
2021-10-11 14:07:34 -04:00