Commit Graph

1907 Commits

Author SHA1 Message Date
Danny Chan
89651c9408 [HUDI-2421] Catch the throwable when scheduling the cleaning task for flink writer (#3650) 2021-09-13 20:43:44 +08:00
liujinhui
9f3c4a2a7f [HUDI-2410] Fix getDefaultBootstrapIndexClass logical error (#3633) 2021-09-13 16:10:17 +08:00
K.I. (Dennis) Jung
c79017cb74 [HUDI-2397] Add --enable-sync parameter (#3608)
* add meta-sync config

* update test

* keep enableMetaSync same with enableHiveSync

* Switch check logic to use `enableMetaSync`
2021-09-13 12:04:49 +05:30
Danny Chan
280f66e0f8 [MINOR] Fix the default parallelism of write task (#3649) 2021-09-13 11:41:49 +08:00
Ankush Kanungo
4f991ee352 [HUDI-2398] Collect event time for inserts in DefaultHoodieRecordPayload (#3602) 2021-09-11 20:27:40 -07:00
Danny Chan
9d5c3e5cb9 [HUDI-2415] Add more info log for flink streaming reader (#3642) 2021-09-12 10:00:17 +08:00
董可伦
6228b17a3d [MINOR] Fix typo, 'requried' corrected to 'required' (#3643) 2021-09-11 15:46:24 +08:00
董可伦
dbcf60f370 [MINOR] fix typo (#3640) 2021-09-11 15:45:49 +08:00
Danny Chan
b30c5bdaef [HUDI-2412] Add timestamp based partitioning for flink writer (#3638) 2021-09-11 13:17:16 +08:00
zhangyue19921010
06240417e9 [HUDI-2354] Fix TimelineServer error because of replacecommit archive (#3536)
* bug fixed

* done

* done

* travis fix

* code reviewed

* code review

* done

* code reviewed

Co-authored-by: yuezhang <yuezhang@freewheel.tv>
2021-09-10 21:26:04 -07:00
rmahindra123
e528dd798a [HUDI-2394] Implement Kafka Sink Protocol for Hudi for Ingesting Immutable Data (#3592)
- Fixing packaging, naming of classes
 - Use of log4j over slf4j for uniformity
- More follow-on fixes
 - Added a version to control/coordinator events.
 - Eliminated the config added to write config
 - Fixed fetching of checkpoints based on table type
 - Clean up of naming, code placement

Co-authored-by: Rajesh Mahindra <rmahindra@Rajeshs-MacBook-Pro.local>
Co-authored-by: Vinoth Chandar <vinoth@apache.org>
2021-09-10 18:20:26 -07:00
Sagar Sumit
bd1d2d4952 [MINOR] Add avro schema evolution test with (non)nullable column and with(out) default value (#3639) 2021-09-10 22:03:35 +08:00
Sagar Sumit
cf15431852 [HUDI-2393] Add yamls for large scale testing (#3594) 2021-09-10 09:02:01 -04:00
wangxianghu
44b9bc145e [HUDI-2411] Remove unnecessary method overriden and note (#3636) 2021-09-10 18:58:34 +08:00
SteNicholas
512ca42d14 [MINOR] Correct the comment for the parallelism of tasks in FlinkOptions (#3634) 2021-09-10 13:42:11 +08:00
Y Ethan Guo
56d08fbe70 [HUDI-2351] Extract common FS and IO utils for marker mechanism (#3529) 2021-09-09 14:45:28 -04:00
Raymond Xu
57c8113ee1 [HUDI-2408] Deprecate FunctionalTestHarness to avoid init DFS (#3628) 2021-09-09 11:29:04 -04:00
Wei
4abcb4f659 [MINOR] Remove unused variables (#3631) 2021-09-09 23:21:16 +08:00
liujinhui
3c4eb60913 Add the document to the PUSHGATEWAY configuration item (#3627) 2021-09-09 15:53:58 +08:00
Danny Chan
db2ab9a150 [HUDI-2403] Add metadata table listing for flink query source (#3618) 2021-09-08 14:52:39 +08:00
vinoth chandar
81acb4cafe [MINOR] Remove commenting from Github, JIRA bridge (#3620) 2021-09-07 21:54:58 -07:00
Danny Chan
cf3a2ead32 [HUDI-2401] Load archived instants for flink streaming reader (#3610) 2021-09-08 10:43:54 +08:00
vinoth chandar
ea59a7ff5f [HUDI-2080] Move to ubuntu-18.04 for Azure CI (#3409)
Update Azure CI ubuntu from 16.04 to 18.04 due to 16.04 will be removed soon

Fixed some consistently failed tests

* fix TestCOWDataSourceStorage TestMORDataSourceStorage
* reset mocks

Also update readme badge
Co-authored-by: Raymond Xu <2701446+xushiyan@users.noreply.github.com>
2021-09-07 09:44:30 -07:00
liujinhui
eb5e7eec0a MINOR_CHECKSTYLE (#3616)
Fix checkstyle
2021-09-07 18:19:39 +08:00
Raymond Xu
cf002b6918 [HUDI-2079] Make CLI command tests functional (#3601)
Make all tests in org.apache.hudi.cli.commands extend org.apache.hudi.cli.functional.CLIFunctionalTestHarness and tag as "functional".

This also resolves a blocker where DFS init consistently failed when moving to ubuntu 18.04
2021-09-06 15:53:53 -07:00
Sivabalan Narayanan
f218693f5d [MINOR] Fixing some functional tests by moving to right packages (#3596) 2021-09-06 00:07:55 -04:00
Raymond Xu
7592ddd776 [HUDI-2399] Rebalance CI jobs for shorter wait time (#3604) 2021-09-05 09:25:57 -07:00
Danny Chan
e9bf1c1186 [HUDI-2380] The default archive folder should be 'archived' (#3568) 2021-09-04 15:53:55 +08:00
Raymond Xu
073c318d9f [HUDI-1989] Disable HDFSParquetImporter related tests (#3597)
Also mark HDFSParquetImportCommand and HDFSParquetImporter as deprecated.
2021-09-03 23:08:11 -04:00
Raymond Xu
6bd3ca98d6 [HUDI-1989] Fix flakiness in TestHoodieMergeOnReadTable (#3574)
* [HUDI-1989] Refactor clustering tests for MoR table

* refactor assertion helper

* add CheckedFunction

* SparkClientFunctionalTestHarness.java

* put back original test case

* move testcases out from TestHoodieMergeOnReadTable.java

* add TestHoodieSparkMergeOnReadTableRollback.java

* use SparkClientFunctionalTestHarness

* add tag
2021-09-03 13:17:17 -07:00
Raymond Xu
11398e8480 [MINOR] Skip checkstyle and rat in Azure (#3593)
- make tests run through without being blocked by style issues
- let GitHub Actions tasks give quick feedback on build, style and other checks
2021-09-03 09:18:18 -07:00
Danny Chan
79b896f071 [HUDI-2392] Do not send partition delete record when changelog mode enabled (#3586) 2021-09-02 20:58:12 +08:00
yuzhaojing
7a1bd225ca [HUDI-2376] Add pipeline for Append mode (#3573)
Co-authored-by: 喻兆靖 <yuzhaojing@bilibili.com>
2021-09-02 16:32:40 +08:00
Shawy Geng
21fd6edfe7 [HUDI-2384] Change log file size config to long (#3577) 2021-09-02 11:14:09 +08:00
Raymond Xu
38c9b85aa8 [HUDI-2280] Use GitHub Actions to build different scala spark versions (#3556) 2021-09-01 08:51:00 -07:00
Danny Chan
f66e1ce9bf [HUDI-2379] Include the pending compaction file groups for flink (#3567)
streaming reader
2021-09-01 16:47:52 +08:00
rmahindra123
d59c8044f8 [HUDI-2378] Add configs for common and pre validate (#3564)
Co-authored-by: Rajesh Mahindra <rmahindra@Rajeshs-MacBook-Pro.local>
2021-08-30 23:28:35 -04:00
董可伦
bf5a52e51b [HUDI-2320] Add support ByteArrayDeserializer in AvroKafkaSource (#3502) 2021-08-30 10:01:15 +08:00
Danny Chan
57668d02a0 [HUDI-2371] Improvement flink streaming reader (#3552)
- Support reading empty table
- Fix filtering by partition path
- Support reading from earliest commit
2021-08-28 20:16:54 +08:00
wenningd
69cbcc9516 Merge pull request #3541 from rahil-c/rahil-c/HUDI-2359
[HUDI-2359] Add basic "hoodie_is_deleted" unit tests to TestDataSource classes
2021-08-27 16:28:51 -07:00
董可伦
562e28f079 [HUDI-2365]Optimizing overwriteField method with Objects.equals (#3542)
Optimizing overwriteField method with Objects.equals
2021-08-27 17:17:22 +08:00
mikewu
9850e90e2e [HUDI-2229] Refact HoodieFlinkStreamer to reuse the pipeline of HoodieTableSink (#3495)
Co-authored-by: mikewu <xingbo.wxb@alibaba-inc.com>
2021-08-27 10:14:04 +08:00
Satish M
55a80a817d [HUDI-2264] Refactor HoodieSparkSqlWriterSuite to add setup and teardown (#3544) 2021-08-26 10:01:48 -04:00
Danny Chan
0f39137ba8 [HUDI-2321] Use the caller classloader for ReflectionUtils (#3535)
Based on the discussion on stackoverflow:
https://stackoverflow.com/questions/1771679/difference-between-threads-context-class-loader-and-normal-classloader

The Thread.currentThread().getContextClassLoader() should never be used
because the context classloader is not immutable, user can overwrite it
when thread switches, it is also nullable.

The objection here: https://stackoverflow.com/a/36228195 says the
Thread.currentThread().getContextClassLoader() is a JDK design error
and the context classloader is never suggested to be used. The API that
needs classloader should ask the user to set up the right classloader.
2021-08-26 21:00:30 +08:00
yuzhaojing
73fdcf37df [HUDI-2368] Catch Throwable in BoundedInMemoryExecutor (#3546)
Co-authored-by: 喻兆靖 <yuzhaojing@bilibili.com>
2021-08-26 20:34:05 +08:00
pengzhiwei
cc5256a7d8 [HUDI-2357] MERGE INTO doesn't work for tables created using CTAS (#3534) 2021-08-26 16:54:41 +08:00
ayachi_nene
be57e42200 [HUDI-2366] fix too many logs (#3543) 2021-08-26 16:45:52 +08:00
Rahil Chertara
694300477f [HUDI-2359] Add basic "hoodie_is_deleted" unit tests to TestDataSource classes 2021-08-25 16:35:35 -07:00
Udit Mehrotra
486bc7dc3b [MINOR] Update DOAP with 0.9.0 Release (#3537) 2021-08-25 16:57:05 -04:00
Danny Chan
a60fab3a5c [HUDI-2352] The upgrade downgrade action of flink writer should be singleton (#3531) 2021-08-25 10:56:14 +08:00