Commit Graph

1923 Commits

Author SHA1 Message Date
董可伦
3a150ee181 [HUDI-2447] Extract common business logic & Fix typo (#3683) 2021-09-17 19:45:22 +08:00
liujinhui
61d0096088 [HUDI-2434] Make periodSeconds of GraphiteReporter configurable (#3667) 2021-09-17 19:39:55 +08:00
董可伦
8a652171cf [MINOR] Fix typo,'compatiblity' corrected to 'compatibility' (#3675) 2021-09-17 09:43:23 +08:00
vinoth chandar
57d5da68aa [HUDI-2330][HUDI-2335] Adding support for merge-on-read tables (#3679)
- Inserts go into logs, hashed by Kafka and Hudi partitions
- Fixed issues with the setupKafka script
- Bumped up the default commit interval to 300 seconds
- Minor renaming
2021-09-16 15:24:34 -07:00
Sivabalan Narayanan
b8dad628e5 [HUDI-2422] Adding rollback plan and rollback requested instant (#3651)
- This patch introduces a rollback plan and a rollback.requested instant. Rollback is now done in two phases: planning and action. In the planning phase, we prepare the rollback plan and serialize it to rollback.requested. In the action phase, we fetch details from the plan and delete the files as the plan specifies. This ensures that the final rollback commit metadata contains all files that were rolled back, even if the rollback failed midway and was retried.
2021-09-16 11:16:06 -04:00
Sarah Witt
4deaa30c8d [HUDI-2404] Add metrics-jmx to spark and flink bundles (#3632) 2021-09-16 09:53:16 -04:00
liujinhui
2791fb9a96 [HUDI-2423] Separate some config logic from HoodieMetricsConfig into HoodieMetricsGraphiteConfig HoodieMetricsJmxConfig (#3652) 2021-09-16 15:08:10 +08:00
zhangyue19921010
2d5ac55195 [HUDI-2355][Bug]Archive service executed after cleaner finished. (#3545)
Co-authored-by: yuezhang <yuezhang@freewheel.tv>
2021-09-15 19:00:04 -04:00
Y Ethan Guo
916f12b7dd [HUDI-2433] Refactor rollback actions in hudi-client module (#3664) 2021-09-15 18:52:43 -04:00
liujinhui
86a7351c39 [MINOR] Delete Redundant code (#3661) 2021-09-15 14:46:11 +08:00
liujinhui
76554aa31a [MINOR] Add document for DataSourceReadOptions (#3653) 2021-09-15 14:33:43 +08:00
Danny Chan
627f20f9c5 [HUDI-2430] Make decimal compatible with hudi for flink writer (#3658) 2021-09-15 12:04:46 +08:00
Vinoth Chandar
d90fd1f68c [MINOR] Update Kafka connect sink readme 2021-09-14 10:36:37 -07:00
rmahindra123
9735f4b8ef [HUDI-2428] Fix protocol and other issues after stress testing Hudi Kafka Connect (#3656)
* Fixes based on tests and some improvements
* Fix the issues after running stress tests
* Fixing checkstyle issues and updating README

Co-authored-by: Rajesh Mahindra <rmahindra@Rajeshs-MacBook-Pro.local>
Co-authored-by: Vinoth Chandar <vinoth@apache.org>
2021-09-14 07:14:58 -07:00
Y Ethan Guo
5d60491f5b [HUDI-2388] Add DAG nodes for Spark SQL in integration test suite (#3583)
- Fixed validation in integ test suite for both deltastreamer and write client path. 

Co-authored-by: Sivabalan Narayanan <n.siva.b@gmail.com>
2021-09-13 11:53:13 -04:00
liujinhui
35a04c43a5 [HUDI-2425] TestHoodieMultiTableDeltaStreamer CI failed due to exception (#3654) 2021-09-13 06:57:04 -07:00
Danny Chan
89651c9408 [HUDI-2421] Catch the throwable when scheduling the cleaning task for flink writer (#3650) 2021-09-13 20:43:44 +08:00
liujinhui
9f3c4a2a7f [HUDI-2410] Fix getDefaultBootstrapIndexClass logical error (#3633) 2021-09-13 16:10:17 +08:00
K.I. (Dennis) Jung
c79017cb74 [HUDI-2397] Add --enable-sync parameter (#3608)
* add meta-sync config

* update test

* keep enableMetaSync same with enableHiveSync

* Switch check logic to use `enableMetaSync`
2021-09-13 12:04:49 +05:30
Danny Chan
280f66e0f8 [MINOR] Fix the default parallelism of write task (#3649) 2021-09-13 11:41:49 +08:00
Ankush Kanungo
4f991ee352 [HUDI-2398] Collect event time for inserts in DefaultHoodieRecordPayload (#3602) 2021-09-11 20:27:40 -07:00
Danny Chan
9d5c3e5cb9 [HUDI-2415] Add more info log for flink streaming reader (#3642) 2021-09-12 10:00:17 +08:00
董可伦
6228b17a3d [MINOR] Fix typo, 'requried' corrected to 'required' (#3643) 2021-09-11 15:46:24 +08:00
董可伦
dbcf60f370 [MINOR] fix typo (#3640) 2021-09-11 15:45:49 +08:00
Danny Chan
b30c5bdaef [HUDI-2412] Add timestamp based partitioning for flink writer (#3638) 2021-09-11 13:17:16 +08:00
zhangyue19921010
06240417e9 [HUDI-2354] Fix TimelineServer error because of replacecommit archive (#3536)
* bug fixed

* done

* done

* travis fix

* code reviewed

* code review

* done

* code reviewed

Co-authored-by: yuezhang <yuezhang@freewheel.tv>
2021-09-10 21:26:04 -07:00
rmahindra123
e528dd798a [HUDI-2394] Implement Kafka Sink Protocol for Hudi for Ingesting Immutable Data (#3592)
- Fixing packaging, naming of classes
 - Use of log4j over slf4j for uniformity
- More follow-on fixes
 - Added a version to control/coordinator events.
 - Eliminated the config added to write config
 - Fixed fetching of checkpoints based on table type
 - Clean up of naming, code placement

Co-authored-by: Rajesh Mahindra <rmahindra@Rajeshs-MacBook-Pro.local>
Co-authored-by: Vinoth Chandar <vinoth@apache.org>
2021-09-10 18:20:26 -07:00
Sagar Sumit
bd1d2d4952 [MINOR] Add avro schema evolution test with (non)nullable column and with(out) default value (#3639) 2021-09-10 22:03:35 +08:00
Sagar Sumit
cf15431852 [HUDI-2393] Add yamls for large scale testing (#3594) 2021-09-10 09:02:01 -04:00
wangxianghu
44b9bc145e [HUDI-2411] Remove unnecessary method overriden and note (#3636) 2021-09-10 18:58:34 +08:00
SteNicholas
512ca42d14 [MINOR] Correct the comment for the parallelism of tasks in FlinkOptions (#3634) 2021-09-10 13:42:11 +08:00
Y Ethan Guo
56d08fbe70 [HUDI-2351] Extract common FS and IO utils for marker mechanism (#3529) 2021-09-09 14:45:28 -04:00
Raymond Xu
57c8113ee1 [HUDI-2408] Deprecate FunctionalTestHarness to avoid init DFS (#3628) 2021-09-09 11:29:04 -04:00
Wei
4abcb4f659 [MINOR] Remove unused variables (#3631) 2021-09-09 23:21:16 +08:00
liujinhui
3c4eb60913 Add the document to the PUSHGATEWAY configuration item (#3627) 2021-09-09 15:53:58 +08:00
Danny Chan
db2ab9a150 [HUDI-2403] Add metadata table listing for flink query source (#3618) 2021-09-08 14:52:39 +08:00
vinoth chandar
81acb4cafe [MINOR] Remove commenting from Github, JIRA bridge (#3620) 2021-09-07 21:54:58 -07:00
Danny Chan
cf3a2ead32 [HUDI-2401] Load archived instants for flink streaming reader (#3610) 2021-09-08 10:43:54 +08:00
vinoth chandar
ea59a7ff5f [HUDI-2080] Move to ubuntu-18.04 for Azure CI (#3409)
Update the Azure CI Ubuntu image from 16.04 to 18.04 because 16.04 will be removed soon

Fixed some consistently failing tests

* fix TestCOWDataSourceStorage TestMORDataSourceStorage
* reset mocks

Also update readme badge

Co-authored-by: Raymond Xu <2701446+xushiyan@users.noreply.github.com>
2021-09-07 09:44:30 -07:00
liujinhui
eb5e7eec0a MINOR_CHECKSTYLE (#3616)
Fix checkstyle
2021-09-07 18:19:39 +08:00
Raymond Xu
cf002b6918 [HUDI-2079] Make CLI command tests functional (#3601)
Make all tests in org.apache.hudi.cli.commands extend org.apache.hudi.cli.functional.CLIFunctionalTestHarness and tag as "functional".

This also resolves a blocker where DFS init consistently failed when moving to ubuntu 18.04
2021-09-06 15:53:53 -07:00
Sivabalan Narayanan
f218693f5d [MINOR] Fixing some functional tests by moving to right packages (#3596) 2021-09-06 00:07:55 -04:00
Raymond Xu
7592ddd776 [HUDI-2399] Rebalance CI jobs for shorter wait time (#3604) 2021-09-05 09:25:57 -07:00
Danny Chan
e9bf1c1186 [HUDI-2380] The default archive folder should be 'archived' (#3568) 2021-09-04 15:53:55 +08:00
Raymond Xu
073c318d9f [HUDI-1989] Disable HDFSParquetImporter related tests (#3597)
Also mark HDFSParquetImportCommand and HDFSParquetImporter as deprecated.
2021-09-03 23:08:11 -04:00
Raymond Xu
6bd3ca98d6 [HUDI-1989] Fix flakiness in TestHoodieMergeOnReadTable (#3574)
* [HUDI-1989] Refactor clustering tests for MoR table

* refactor assertion helper

* add CheckedFunction

* SparkClientFunctionalTestHarness.java

* put back original test case

* move testcases out from TestHoodieMergeOnReadTable.java

* add TestHoodieSparkMergeOnReadTableRollback.java

* use SparkClientFunctionalTestHarness

* add tag
2021-09-03 13:17:17 -07:00
Raymond Xu
11398e8480 [MINOR] Skip checkstyle and rat in Azure (#3593)
- make tests run through without being blocked by style issues
- let GitHub Actions tasks give quick feedback on build, style and other checks
2021-09-03 09:18:18 -07:00
Danny Chan
79b896f071 [HUDI-2392] Do not send partition delete record when changelog mode enabled (#3586) 2021-09-02 20:58:12 +08:00
yuzhaojing
7a1bd225ca [HUDI-2376] Add pipeline for Append mode (#3573)
Co-authored-by: 喻兆靖 <yuzhaojing@bilibili.com>
2021-09-02 16:32:40 +08:00
Shawy Geng
21fd6edfe7 [HUDI-2384] Change log file size config to long (#3577) 2021-09-02 11:14:09 +08:00