lanyuanxiaoyao/hudi - hudi

Author	SHA1	Message	Date
t0il3ts0ap	c8df9b09d7	[HUDI-3148] Create pushgateway client based on port (#4497 ) Co-authored-by: anoop narang <anoop.narang@navi.com> Co-authored-by: sivabalan narayanan <n.siva.b@gmail.com>	2022-01-10 18:09:47 -05:00
Y Ethan Guo	f230eca9b5	[MINOR] Fix port number in setupKafka.sh (#4546 )	2022-01-10 16:07:52 -05:00
Sivabalan Narayanan	7a8b94c82d	[HUDI-3180] Include files from completed commits while bootstrapping metadata table (#4519 )	2022-01-10 15:33:15 -05:00
Y Ethan Guo	bc95571caa	[HUDI-2735] Allow empty commits in Kafka Connect Sink for Hudi (#4544 )	2022-01-10 15:31:25 -05:00
Manoj Govindassamy	251d4eb3b6	[HUDI-3030] InProcessLockProvider as default when any async services enabled with no lock provider override (#4406 ) * [HUDI-3030] InProcessLockProvider as default when any async services enabled with no lock provider override - Making InProcessLockProvider the default lock provider when any async services are enabled and when no lock provider is explicitly set. - This is the workaround for metadata table updates racing with async table service operations * [HUDI-3030] InProcessLockProvider as default when any async services enabled with no lock provider override - Renaming isAnyTableServicesInline/Async() to areAnyTableServicesInline/Async() * [HUDI-3030] InProcessLockProvider as default when any async services enabled with no lock provider override - Additionally checking for write config properties when verifying the lock provider override. Updated the unit test for this case.	2022-01-10 08:40:24 +05:30
Sivabalan Narayanan	56f93f4ebd	Removing rollbacks instants from timeline for restore operation (#4518 )	2022-01-10 07:44:28 +05:30
Thinking Chen	e9a7f49f55	[HUDI-3112] Fix KafkaConnect cannot sync to Hive Problem (#4458 )	2022-01-09 15:31:57 -08:00
Sivabalan Narayanan	604d9885f1	[HUDI-3009] making some fixes to S3 incremental source (#4517 )	2022-01-09 12:46:52 -05:00
RexAn	977d3c6dad	[HUDI-3157] Remove aws jars from hudi bundles (#4542 ) Co-authored-by: Hui An <hui.an@shopee.com>	2022-01-09 02:23:46 -08:00
YueZhang	cf362fb2d5	[MINOR] Fix some code style issues based on check-style plugin (#4532 ) Co-authored-by: yuezhang <yuezhang@freewheel.tv>	2022-01-09 01:14:56 -08:00
Yann Byron	36790709f7	[HUDI-3125] spark-sql write timestamp directly (#4471 )	2022-01-08 23:43:25 -08:00
Thinking Chen	0d8ca8da4e	[HUDI-3104] Kafka-connect support of hadoop config environments and properties (#4451 )	2022-01-08 23:10:17 -08:00
Sivabalan Narayanan	98ec215079	[HUDI-3178] Fixing metadata table compaction so as to not include uncommitted data (#4530 ) - There is a chance that the actual write eventually failed in data table but commit was successful in Metadata table, and if compaction was triggered in MDT, compaction could have included the uncommitted data. But once compacted, it may never be ignored while reading from metadata table. So, this patch fixes the bug. Metadata table compaction is triggered before applying the commit to metadata table to circumvent this issue.	2022-01-08 10:34:47 -05:00
Sagar Sumit	46bb00e4df	[HUDI-3139] Shade htrace and parquet-avro in presto bundle (#4495 ) Filter out unnecessary classes	2022-01-08 10:29:36 -05:00
Sagar Sumit	827549949c	[HUDI-2909] Handle logical type in TimestampBasedKeyGenerator (#4203 ) * [HUDI-2909] Handle logical type in TimestampBasedKeyGenerator The timestamp-based key generator was returning different values for the row writer and non-row-writer paths. This patch fixes it and is guarded by a config flag (`hoodie.datasource.write.keygenerator.consistent.logical.timestamp.enabled`)	2022-01-08 10:22:44 -05:00
Yann Byron	03a83ffeb5	[HUDI-3195] optimize spark3 pom and modify build command (#4538 )	2022-01-07 23:21:39 -08:00
董可伦	4f6cdd73a3	[HUDI-3192] Spark metastore schema evolution broken (#4533 )	2022-01-08 10:48:37 +08:00
Sagar Sumit	518488c633	[HUDI-3185] HoodieConfig#getBoolean should return false when default not set (#4536 ) Remove unnecessary config	2022-01-07 16:20:11 -05:00
Sivabalan Narayanan	2e561defe9	[HUDI-2947] Fixing checkpoint fetch in deltastreamer (#4485 ) * Fixing checkpoint fetch in deltastreamer * Addressing comments	2022-01-07 22:08:58 +05:30
董可伦	b1df60672b	[MINOR] fix typos in DDLExecutor (#4534 )	2022-01-07 07:59:55 -05:00
Y Ethan Guo	76a72641f1	[HUDI-3188] Update quick start guide for Kafka Connect Sink for Hudi (#4527 )	2022-01-07 07:56:08 -05:00
Raymond Xu	2467c137e4	[HUDI-3100] Add config for hive conditional sync (#4440 )	2022-01-06 23:26:35 -08:00
YueZhang	b2b23f5d3a	[HUDI-3183] Wrong result of HoodieArchivedTimeline loadInstants with TimeRangeFilter (#4521 ) Co-authored-by: yuezhang <yuezhang@freewheel.tv>	2022-01-06 21:16:29 -05:00
Thinking Chen	d7afc58d0c	[HUDI-3118] Add default HUDI_DIR in setupKafka.sh (#4460 )	2022-01-06 15:46:51 -08:00
xuzifu666	f0c2912d35	[MINOR] Remove unused methods in HoodieColumnProjectionUtils (#4408 )	2022-01-06 15:36:13 -08:00
Sivabalan Narayanan	8718c30324	[HUDI-3165] Enabling InProcessLockProvider for all multi-writer tests instead of FileSystemBasedLockProviderTestClass (#4427 )	2022-01-06 13:04:10 -05:00
Sivabalan Narayanan	2954027b92	[HUDI-52] Enabling savepoint and restore for MOR table (#4507 ) * Enabling restore for MOR table * Fixing savepoint for compaction commits in MOR	2022-01-06 21:26:08 +05:30
Sivabalan Narayanan	b6891d253f	[HUDI-44] Adding support to preserve commit metadata for compaction (#4428 )	2022-01-06 20:27:37 +05:30
hehexiaoduantui	50fa5a6aa7	Update HiveIncrementalPuller to configure filesystem (#4431 ) * Update HiveIncrementalPuller.java fix get FileSystem bug * Update HiveIncrementalPuller.java fix error * Update HiveIncrementalPuller.java fix error	2022-01-06 13:19:30 +05:30
fengli	205e48f53f	[HUDI-3132] Minor fixes for HoodieCatalog close apache/hudi#4486	2022-01-06 11:17:23 +08:00
Vinish Reddy	eee715b3ff	[HUDI-3168] Fixing null schema with empty commit in incremental relation (#4513 )	2022-01-05 11:43:10 -05:00
Sagar Sumit	75133f9942	[HUDI-3170] Do not preserve filename when preserveCommitMetadata enabled (#4512 )	2022-01-05 08:09:58 -05:00
Danny Chan	0e297c0c4c	[HUDI-3171] Sync empty table to hive metastore (#4511 )	2022-01-05 16:41:33 +08:00
Sivabalan Narayanan	a66212d204	[HUDI-2966] Closing LogRecordScanner in compactor (#4478 ) * Closing LogRecordScanner in compactor * Addressing comments	2022-01-05 10:57:18 +08:00
Nicolas Paris	37b15ff458	[HUDI-3147] Add endpoint_url to dynamodb lock provider (#4500 ) Co-authored-by: Nicolas Paris <nicolas.paris@adevinta.com>	2022-01-04 16:42:28 -05:00
Manoj Govindassamy	bf4e3d63e7	[HUDI-3141] Metadata merged log record reader - avoiding NullPointerException when getting records by keys (#4505 ) - HoodieMetadataMergedLogRecordReader#getRecordsByKeys() and its parent class methods are not thread safe. When multiple queries come in for getting log records by keys, they all operate on the same log record reader instance provided by HoodieBackedTableMetadata#openReadersIfNeeded() and they trip over each other as they clear/put/get the same class member records. - The fix is to streamline the mutation of the class member records. Making HoodieMetadataMergedLogRecordReader#getRecordsByKeys() a synchronized method to avoid concurrent log record readers getting into NPE.	2022-01-04 16:41:33 -05:00
Sagar Sumit	aaf5727495	[HUDI-2774] Handle duplicate instants when fetching pending clustering plans (#4118 )	2022-01-04 16:32:05 -05:00
Sivabalan Narayanan	7329d229d5	Adding tests to validate different key generators (#4473 )	2022-01-04 10:48:04 +05:30
leesf	29ab6fb9ad	[HUDI-3140] Fix bulk_insert failure on Spark 3.2.0 (#4498 )	2022-01-04 09:59:59 +08:00
harshal	2b2ae34cb9	[HUDI-2558] Fixing Clustering w/ sort columns with null values fails (#4404 )	2022-01-03 12:19:43 +05:30
Raymond Xu	0273f2e65d	[MINOR] Update README.md (#4492 ) Update Spark 3 build instructions	2022-01-02 20:34:37 -08:00
YueZhang	1e2d2c437d	[HUDI-3138] Fix broken UT test for TestHiveSyncTool.testDropPartitions (#4493 ) Co-authored-by: yuezhang <yuezhang@freewheel.tv>	2022-01-02 22:43:30 -05:00
Yann Byron	fe9406dd33	[HUDI-3131] fix ctas error in spark3.1.1 (#4476 )	2022-01-02 03:06:55 -08:00
Yann Byron	1622b52c9c	[HUDI-3136] Fix merge/insert/show partitions error on Spark3.2 (#4490 )	2022-01-02 02:42:10 -08:00
leesf	188d0338c4	[HUDI-3134] Fix insert error after adding columns on Spark 3.2.0 (#4488 )	2022-01-01 17:38:14 -08:00
Aimiyoo	bfa169d808	[HUDI-3040] Fix HoodieSparkBootstrapExample error info for usage (#4341 )	2021-12-31 23:38:38 -08:00
YueZhang	ef9923fc55	[HUDI-3107] Fix HiveSyncTool drop partitions using JDBC or hivesql or hms (#4453 ) * constructDropPartitions when drop partitions using jdbc * done * done * code style * code review Co-authored-by: yuezhang <yuezhang@freewheel.tv>	2021-12-31 15:56:33 +08:00
Yuwei XIAO	2444f40a4b	[HUDI-3095] abstract partition filter logic to enable code reuse (#4454 ) * [HUDI-3095] abstract partition filter logic to enable code reuse * [HUDI-3095] address reviews	2021-12-31 11:07:52 +05:30
yuzhaojing	e88b5fd450	[HUDI-3120] Cache compactionPlan in buffer (#4463 ) Co-authored-by: yuzhaojing <yuzhaojing@bytedance.com>	2021-12-31 13:12:32 +08:00
Shawy Geng	a4e622ac61	[HUDI-1951] Add bucket hash index, compatible with the hive bucket (#3173 ) * [HUDI-2154] Add index key field to HoodieKey * [HUDI-2157] Add the bucket index and its read/write implementation of Spark engine. * revert HUDI-2154 add index key field to HoodieKey * fix all comments and introduce a new tricky way to get index key at runtime support double insert for bucket index * revert spark read optimizer based on bucket index * add the storage layout * index tag, hash function and add ut * fix ut * address partial comments * Code review feedback * add layout config and docs * fix ut * rename hoodie.layout and rebase master Co-authored-by: Vinoth Chandar <vinoth@apache.org>	2021-12-30 12:38:26 -08:00

2339 Commits