Vinoth Chandar
d90fd1f68c
[MINOR] Update Kafka connect sink readme
2021-09-14 10:36:37 -07:00
rmahindra123
9735f4b8ef
[HUDI-2428] Fix protocol and other issues after stress testing Hudi Kafka Connect ( #3656 )
...
* Fixes based on tests and some improvements
* Fix the issues after running stress tests
* Fixing checkstyle issues and updating README
Co-authored-by: Rajesh Mahindra <rmahindra@Rajeshs-MacBook-Pro.local >
Co-authored-by: Vinoth Chandar <vinoth@apache.org >
2021-09-14 07:14:58 -07:00
Y Ethan Guo
5d60491f5b
[HUDI-2388] Add DAG nodes for Spark SQL in integration test suite ( #3583 )
...
- Fixed validation in integ test suite for both deltastreamer and write client path.
Co-authored-by: Sivabalan Narayanan <n.siva.b@gmail.com >
2021-09-13 11:53:13 -04:00
liujinhui
35a04c43a5
[HUDI-2425] TestHoodieMultiTableDeltaStreamer CI failed due to exception ( #3654 )
2021-09-13 06:57:04 -07:00
Danny Chan
89651c9408
[HUDI-2421] Catch the throwable when scheduling the cleaning task for flink writer ( #3650 )
2021-09-13 20:43:44 +08:00
liujinhui
9f3c4a2a7f
[HUDI-2410] Fix getDefaultBootstrapIndexClass logical error ( #3633 )
2021-09-13 16:10:17 +08:00
K.I. (Dennis) Jung
c79017cb74
[HUDI-2397] Add --enable-sync parameter ( #3608 )
...
* add meta-sync config
* update test
* keep enableMetaSync same with enableHiveSync
* Switch check logic to use `enableMetaSync`
2021-09-13 12:04:49 +05:30
Danny Chan
280f66e0f8
[MINOR] Fix the default parallelism of write task ( #3649 )
2021-09-13 11:41:49 +08:00
Ankush Kanungo
4f991ee352
[HUDI-2398] Collect event time for inserts in DefaultHoodieRecordPayload ( #3602 )
2021-09-11 20:27:40 -07:00
Danny Chan
9d5c3e5cb9
[HUDI-2415] Add more info log for flink streaming reader ( #3642 )
2021-09-12 10:00:17 +08:00
董可伦
6228b17a3d
[MINOR] Fix typo, 'requried' corrected to 'required' ( #3643 )
2021-09-11 15:46:24 +08:00
董可伦
dbcf60f370
[MINOR] fix typo ( #3640 )
2021-09-11 15:45:49 +08:00
Danny Chan
b30c5bdaef
[HUDI-2412] Add timestamp based partitioning for flink writer ( #3638 )
2021-09-11 13:17:16 +08:00
zhangyue19921010
06240417e9
[HUDI-2354] Fix TimelineServer error because of replacecommit archive ( #3536 )
...
* bug fixed
* done
* done
* travis fix
* code reviewed
* code review
* done
* code reviewed
Co-authored-by: yuezhang <yuezhang@freewheel.tv >
2021-09-10 21:26:04 -07:00
rmahindra123
e528dd798a
[HUDI-2394] Implement Kafka Sink Protocol for Hudi for Ingesting Immutable Data ( #3592 )
...
- Fixing packaging, naming of classes
- Use of log4j over slf4j for uniformity
- More follow-on fixes
- Added a version to control/coordinator events.
- Eliminated the config added to write config
- Fixed fetching of checkpoints based on table type
- Clean up of naming, code placement
Co-authored-by: Rajesh Mahindra <rmahindra@Rajeshs-MacBook-Pro.local >
Co-authored-by: Vinoth Chandar <vinoth@apache.org >
2021-09-10 18:20:26 -07:00
Sagar Sumit
bd1d2d4952
[MINOR] Add avro schema evolution test with (non)nullable column and with(out) default value ( #3639 )
2021-09-10 22:03:35 +08:00
Sagar Sumit
cf15431852
[HUDI-2393] Add yamls for large scale testing ( #3594 )
2021-09-10 09:02:01 -04:00
wangxianghu
44b9bc145e
[HUDI-2411] Remove unnecessary method overridden and note ( #3636 )
2021-09-10 18:58:34 +08:00
SteNicholas
512ca42d14
[MINOR] Correct the comment for the parallelism of tasks in FlinkOptions ( #3634 )
2021-09-10 13:42:11 +08:00
Y Ethan Guo
56d08fbe70
[HUDI-2351] Extract common FS and IO utils for marker mechanism ( #3529 )
2021-09-09 14:45:28 -04:00
Raymond Xu
57c8113ee1
[HUDI-2408] Deprecate FunctionalTestHarness to avoid init DFS ( #3628 )
2021-09-09 11:29:04 -04:00
Wei
4abcb4f659
[MINOR] Remove unused variables ( #3631 )
2021-09-09 23:21:16 +08:00
liujinhui
3c4eb60913
Add the document to the PUSHGATEWAY configuration item ( #3627 )
2021-09-09 15:53:58 +08:00
Danny Chan
db2ab9a150
[HUDI-2403] Add metadata table listing for flink query source ( #3618 )
2021-09-08 14:52:39 +08:00
vinoth chandar
81acb4cafe
[MINOR] Remove commenting from Github, JIRA bridge ( #3620 )
2021-09-07 21:54:58 -07:00
Danny Chan
cf3a2ead32
[HUDI-2401] Load archived instants for flink streaming reader ( #3610 )
2021-09-08 10:43:54 +08:00
vinoth chandar
ea59a7ff5f
[HUDI-2080] Move to ubuntu-18.04 for Azure CI ( #3409 )
...
Update the Azure CI Ubuntu image from 16.04 to 18.04 because 16.04 will be removed soon
Fixed some consistently failing tests
* fix TestCOWDataSourceStorage TestMORDataSourceStorage
* reset mocks
Also update readme badge
Co-authored-by: Raymond Xu <2701446+xushiyan@users.noreply.github.com >
2021-09-07 09:44:30 -07:00
liujinhui
eb5e7eec0a
MINOR_CHECKSTYLE ( #3616 )
...
Fix checkstyle
2021-09-07 18:19:39 +08:00
Raymond Xu
cf002b6918
[HUDI-2079] Make CLI command tests functional ( #3601 )
...
Make all tests in org.apache.hudi.cli.commands extend org.apache.hudi.cli.functional.CLIFunctionalTestHarness and tag as "functional".
This also resolves a blocker where DFS init consistently failed when moving to ubuntu 18.04
2021-09-06 15:53:53 -07:00
Sivabalan Narayanan
f218693f5d
[MINOR] Fixing some functional tests by moving to right packages ( #3596 )
2021-09-06 00:07:55 -04:00
Raymond Xu
7592ddd776
[HUDI-2399] Rebalance CI jobs for shorter wait time ( #3604 )
2021-09-05 09:25:57 -07:00
Danny Chan
e9bf1c1186
[HUDI-2380] The default archive folder should be 'archived' ( #3568 )
2021-09-04 15:53:55 +08:00
Raymond Xu
073c318d9f
[HUDI-1989] Disable HDFSParquetImporter related tests ( #3597 )
...
Also mark HDFSParquetImportCommand and HDFSParquetImporter as deprecated.
2021-09-03 23:08:11 -04:00
Raymond Xu
6bd3ca98d6
[HUDI-1989] Fix flakiness in TestHoodieMergeOnReadTable ( #3574 )
...
* [HUDI-1989] Refactor clustering tests for MoR table
* refactor assertion helper
* add CheckedFunction
* SparkClientFunctionalTestHarness.java
* put back original test case
* move testcases out from TestHoodieMergeOnReadTable.java
* add TestHoodieSparkMergeOnReadTableRollback.java
* use SparkClientFunctionalTestHarness
* add tag
2021-09-03 13:17:17 -07:00
Raymond Xu
11398e8480
[MINOR] Skip checkstyle and rat in Azure ( #3593 )
...
- make tests run through without being blocked by style issues
- let GitHub Actions tasks give quick feedback on build, style and other checks
2021-09-03 09:18:18 -07:00
Danny Chan
79b896f071
[HUDI-2392] Do not send partition delete record when changelog mode enabled ( #3586 )
2021-09-02 20:58:12 +08:00
yuzhaojing
7a1bd225ca
[HUDI-2376] Add pipeline for Append mode ( #3573 )
...
Co-authored-by: 喻兆靖 <yuzhaojing@bilibili.com >
2021-09-02 16:32:40 +08:00
Shawy Geng
21fd6edfe7
[HUDI-2384] Change log file size config to long ( #3577 )
2021-09-02 11:14:09 +08:00
Raymond Xu
38c9b85aa8
[HUDI-2280] Use GitHub Actions to build different scala spark versions ( #3556 )
2021-09-01 08:51:00 -07:00
Danny Chan
f66e1ce9bf
[HUDI-2379] Include the pending compaction file groups for flink ( #3567 )
...
streaming reader
2021-09-01 16:47:52 +08:00
rmahindra123
d59c8044f8
[HUDI-2378] Add configs for common and pre validate ( #3564 )
...
Co-authored-by: Rajesh Mahindra <rmahindra@Rajeshs-MacBook-Pro.local >
2021-08-30 23:28:35 -04:00
董可伦
bf5a52e51b
[HUDI-2320] Add support ByteArrayDeserializer in AvroKafkaSource ( #3502 )
2021-08-30 10:01:15 +08:00
Danny Chan
57668d02a0
[HUDI-2371] Improve flink streaming reader ( #3552 )
...
- Support reading empty table
- Fix filtering by partition path
- Support reading from earliest commit
2021-08-28 20:16:54 +08:00
wenningd
69cbcc9516
Merge pull request #3541 from rahil-c/rahil-c/HUDI-2359
...
[HUDI-2359] Add basic "hoodie_is_deleted" unit tests to TestDataSource classes
2021-08-27 16:28:51 -07:00
董可伦
562e28f079
[HUDI-2365] Optimizing overwriteField method with Objects.equals ( #3542 )
2021-08-27 17:17:22 +08:00
mikewu
9850e90e2e
[HUDI-2229] Refactor HoodieFlinkStreamer to reuse the pipeline of HoodieTableSink ( #3495 )
...
Co-authored-by: mikewu <xingbo.wxb@alibaba-inc.com >
2021-08-27 10:14:04 +08:00
Satish M
55a80a817d
[HUDI-2264] Refactor HoodieSparkSqlWriterSuite to add setup and teardown ( #3544 )
2021-08-26 10:01:48 -04:00
Danny Chan
0f39137ba8
[HUDI-2321] Use the caller classloader for ReflectionUtils ( #3535 )
...
Based on the discussion on Stack Overflow:
https://stackoverflow.com/questions/1771679/difference-between-threads-context-class-loader-and-normal-classloader
Thread.currentThread().getContextClassLoader() should never be used
because the context classloader is mutable: user code can overwrite it,
it can differ from one thread to another, and it can be null.
The answer at https://stackoverflow.com/a/36228195 argues that
Thread.currentThread().getContextClassLoader() is a JDK design error
and that relying on the context classloader is never recommended. An API
that needs a classloader should ask the caller to supply the right one.
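
The difference can be illustrated with a minimal sketch. It is not the
actual ReflectionUtils change; the class name CallerClassLoaderSketch and
the helper canLoad are hypothetical and exist only for this illustration.
The sketch sets the thread's context classloader to null and then tries to
resolve an application class through both loaders.

```java
public final class CallerClassLoaderSketch {

    private CallerClassLoaderSketch() {
    }

    // Hypothetical helper: reports whether `loader` can resolve `className`.
    // A null loader makes Class.forName fall back to the bootstrap loader,
    // which only knows JDK classes.
    static boolean canLoad(String className, ClassLoader loader) {
        try {
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String self = CallerClassLoaderSketch.class.getName();
        Thread current = Thread.currentThread();
        ClassLoader saved = current.getContextClassLoader();
        try {
            // Any code running on this thread may do this; the context
            // classloader is mutable and may legally be null.
            current.setContextClassLoader(null);

            // The loader that defined this class is fixed and can always
            // resolve the class it defined.
            ClassLoader callerLoader = CallerClassLoaderSketch.class.getClassLoader();
            System.out.println("caller loader resolves class:  " + canLoad(self, callerLoader));

            // The context loader is now null, so the lookup goes to the
            // bootstrap loader and an application class is not found.
            ClassLoader contextLoader = current.getContextClassLoader();
            System.out.println("context loader resolves class: " + canLoad(self, contextLoader));
        } finally {
            // Restore the original context classloader.
            current.setContextClassLoader(saved);
        }
    }
}
```

The loader that defined the calling class does not change after the class
is loaded, which is why the sketch prefers it over the per-thread setting.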
2021-08-26 21:00:30 +08:00
yuzhaojing
73fdcf37df
[HUDI-2368] Catch Throwable in BoundedInMemoryExecutor ( #3546 )
...
Co-authored-by: 喻兆靖 <yuzhaojing@bilibili.com >
2021-08-26 20:34:05 +08:00
pengzhiwei
cc5256a7d8
[HUDI-2357] MERGE INTO doesn't work for tables created using CTAS ( #3534 )
2021-08-26 16:54:41 +08:00