Raymond Xu
2ada2ef50f
[HUDI-902] Avoid exception when getSchemaProvider ( #1584 )
* When there is no new input data, don't throw an exception for a null SchemaProvider
* Return the newly added NullSchemaProvider instead
2020-05-15 21:33:02 -07:00
Alexander Filipchik
25e0b75b3d
[HUDI-723] Register avro schema if inferred from SQL transformation ( #1518 )
* Register avro schema if inferred from SQL transformation
* Always create HoodieWriteClient lazily. Handle setting schema-provider and avro-schemas correctly when using SQL transformer
Co-authored-by: Alex Filipchik <alex.filipchik@csscompany.com>
Co-authored-by: Balaji Varadarajan <varadarb@uber.com>
2020-05-15 12:44:03 -07:00
Alexander Filipchik
f094f42857
[HUDI-843] Add ability to specify time unit for TimestampBasedKeyGenerator ( #1541 )
Co-authored-by: Alex Filipchik <alex.filipchik@csscompany.com>
Co-authored-by: Vinoth Chandar <vinoth@apache.org>
2020-05-14 13:37:59 -07:00
hongdd
3a2fe13fcb
[HUDI-701] Add unit test for HDFSParquetImportCommand ( #1574 )
2020-05-14 19:15:49 +08:00
Raymond Xu
0d4848b68b
[HUDI-811] Restructure test packages ( #1607 )
* restructure hudi-spark tests
* restructure hudi-timeline-service tests
* restructure hudi-hadoop-mr hudi-utilities tests
* restructure hudi-hive-sync tests
2020-05-13 15:37:03 -07:00
liujinhui
5d37e66b7e
[MINOR] Fix HoodieNotSupportedException description in KafkaOffsetGen ( #1615 )
2020-05-11 23:14:43 +08:00
Raymond Xu
366bb10d8c
[HUDI-812] Migrate hudi common tests to JUnit 5 ( #1590 )
* [HUDI-812] Migrate hudi-common tests to JUnit 5
2020-05-06 19:15:20 +08:00
Raymond Xu
096f7f55b2
[HUDI-813] Migrate hudi-utilities tests to JUnit 5 ( #1589 )
2020-05-04 12:43:42 +08:00
Balaji Varadarajan
506447fd4f
[HUDI-850] Avoid unnecessary listings in incremental cleaning mode ( #1576 )
2020-05-01 21:37:21 -07:00
vinoth chandar
c4b71622b9
[MINOR] Reorder HoodieTimeline#compareTimestamp arguments for better readability ( #1575 )
- reads nicely as (instantTime1, GREATER_THAN_OR_EQUALS, instantTime2) etc
2020-04-30 09:19:39 -07:00
Raymond Xu
06dae30297
[HUDI-810] Migrate ClientTestHarness to JUnit 5 ( #1553 )
2020-04-28 23:38:16 +08:00
dengziming
19cc15c098
[MINOR]: Fix cli docs for DeltaStreamer ( #1547 )
2020-04-22 11:37:17 -07:00
Raymond Xu
6e15eebd81
[HUDI-809] Migrate CommonTestHarness to JUnit 5 ( #1530 )
2020-04-22 14:10:25 +08:00
Alexander Filipchik
2a56f82908
[HUDI-821] Fixing JCommander param parsing in deltastreamer ( #1525 )
Co-authored-by: Alex Filipchik <alex.filipchik@csscompany.com>
2020-04-21 20:12:34 -07:00
hongdd
84dd9047d3
[HUDI-789] Adjust logic of upsert in HDFSParquetImporter ( #1511 )
2020-04-21 14:21:30 +08:00
Alexander Filipchik
acb1ada2f7
[HUDI-799] Use appropriate FS when loading configs ( #1517 )
Co-authored-by: Alex Filipchik <alex.filipchik@csscompany.com>
2020-04-16 13:49:39 -07:00
Raymond Xu
acdc4a8d00
[HUDI-798] Migrate to Mockito Jupiter for JUnit 5 ( #1521 )
2020-04-16 16:07:32 +08:00
Iftach Schonbaum
9ca710cb02
[HUDI-777] Updated description for --target-table parameter ( #1519 )
2020-04-15 14:56:13 -07:00
Raymond Xu
d65efe659d
[HUDI-780] Migrate test cases to JUnit 5 ( #1504 )
2020-04-15 12:35:01 -07:00
Gary Li
14d4fea833
[HUDI-759] Integrate checkpoint provider with delta streamer ( #1486 )
2020-04-14 14:51:04 -07:00
Bhavani Sudha Saktheeswaran
8c7cef3e50
[HUDI-738] Add validation to DeltaStreamer to fail fast when filterDupes is enabled on UPSERT mode. ( #1505 )
Summary:
This fix ensures that '--filter-dupes' is disabled for the UPSERT operation and fails fast if it is not. Otherwise, all updates would be dropped silently and only new records would be taken in.
2020-04-10 08:58:55 -07:00
Pratyaksh Sharma
d610252d6b
[HUDI-288]: Add support for ingesting multiple kafka streams in a single DeltaStreamer deployment ( #1150 )
* [HUDI-288]: Add support for ingesting multiple kafka streams in a single DeltaStreamer deployment
2020-04-07 16:10:26 -07:00
YanJia-Gary-Li
575d87cf7d
[HUDI-644] Kafka Connect checkpoint provider ( #1453 )
2020-04-03 18:57:34 -07:00
Ramachandran Madtas Subramaniam
639ec20412
[HUDI-562] Enable testing at debug log level
This is to ensure that tests will execute all code paths, even the ones
written under DEBUG log levels. This will improve coverage as well as
ensure there are no surprises when DEBUG log level is enabled in
production.
2020-04-02 11:14:35 -07:00
Raymond Xu
5b53b0d85e
[HUDI-731] Add ChainedTransformer ( #1440 )
* [HUDI-731] Add ChainedTransformer
2020-04-01 23:21:31 +08:00
Shaofeng Shi
78b3194e82
[HUDI-751] Fix some coding issues reported by FindBugs ( #1470 )
2020-03-31 21:19:32 +08:00
wenningd
ce0a4c64d0
[HUDI-713] Fix conversion of Spark array of struct type to Avro schema ( #1406 )
Co-authored-by: Wenning Ding <wenningd@amazon.com>
2020-03-30 15:52:15 -07:00
Suneel Marthi
fa36082554
[HUDI-746] Reduce build warnings < 10 ( #1465 )
2020-03-30 11:46:52 +08:00
vinoth chandar
e057c27603
[HUDI-744] Restructure hudi-common and clean up files under util packages ( #1462 )
- Brings more order and cohesion to the classes in hudi-common
- Utils classes related to a particular concept (avro, timeline, ...) are placed near the package
- common.fs package now contains all the filesystem level classes including wrapper filesystem
- bloom.filter package renamed to just bloom
- config package contains classes that help store properties
- common.fs.inline package contains all the inline filesystem classes/impl
- common.table.timeline now consolidates all timeline related classes
- common.table.view consolidates all the classes related to filesystem view metadata
- common.table.timeline.versioning contains all classes related to versioning of timeline
- Fix a few unit tests as a result
- Moved the test packages around to match the source file move
- Rename AvroUtils to TimelineMetadataUtils & minor fixes/typos
2020-03-29 10:58:49 -07:00
Sivabalan Narayanan
ac73bdcdc3
[HUDI-430] Adding InlineFileSystem to support embedding any file format as an InlineFile ( #1176 )
* Adding InlineFileSystem to support embedding any file format (parquet, hfile, etc.). Supports reading the embedded file using the respective readers.
2020-03-28 12:13:35 -04:00
Suneel Marthi
8c3001363d
HUDI-479: Eliminate or Minimize use of Guava if possible ( #1159 )
2020-03-28 03:11:32 -04:00
Raymond Xu
bc82e2be6c
[HUDI-711] Refactor exporter main logic ( #1436 )
* Refactor exporter main logic
* break main method into multiple readable methods
* fix bug of passing wrong file list
* avoid deleting output path when exists
* throw exception to early abort on multiple cases
* use JavaSparkContext instead of SparkSession
* improve unit test for expected exceptions
2020-03-25 18:02:24 +08:00
Zhiyuan Zhao
0241b21f77
[HUDI-65] commitTime rename to instantTime ( #1431 )
2020-03-22 18:06:00 -07:00
lamber-ken
38c3ccc51a
[HUDI-663] Fix HoodieDeltaStreamer offset not handled correctly ( #1377 )
2020-03-22 10:31:48 -07:00
Pratyaksh Sharma
1e1d9e1d34
[HUDI-616] Fixed parquet files getting created on local FS ( #1434 )
2020-03-22 22:19:47 +08:00
Zhiyuan Zhao
06652aa935
[MINOR] Add omitted param descriptions to method docs and clean up redundant code ( #1437 )
2020-03-22 21:39:33 +08:00
Zhiyuan Zhao
8b00791ef4
[MINOR] Clean up redundant comment and unused variable, and fix typo ( #1435 )
2020-03-21 20:12:06 -07:00
Mathieu
eeab532d79
[HUDI-725] Remove init log in the constructor of DeltaSync ( #1425 )
2020-03-20 17:47:59 +08:00
Mathieu
21c45e1051
[HUDI-726] Delete unused method in HoodieDeltaStreamer ( #1426 )
2020-03-20 17:44:16 +08:00
Sivabalan Narayanan
a752b7b18c
Merge pull request #1165 from yihua/HUDI-76-deltastreamer-csv-source
[HUDI-76] Add CSV Source support for Hudi Delta Streamer
2020-03-19 10:00:53 -04:00
Raymond Xu
779edc0688
[HUDI-344] Add partitioner param to Exporter ( #1405 )
2020-03-18 19:24:04 +08:00
Y Ethan Guo
cf765df606
[HUDI-76] Add CSV Source support for Hudi Delta Streamer
2020-03-15 19:03:37 -07:00
Raymond Xu
14323cb100
[HUDI-344] Improve exporter tests ( #1404 )
2020-03-15 20:24:30 +08:00
Suneel Marthi
99b7e9eb9e
[HUDI-629]: Replace Guava's Hashing with an equivalent in NumericUtils.java ( #1350 )
* [HUDI-629]: Replace Guava's Hashing with an equivalent in NumericUtils.java
2020-03-13 20:28:05 -04:00
Sivabalan Narayanan
1ca912af09
[HUDI-667] Fixing delete tests for DeltaStreamer ( #1395 )
2020-03-11 16:19:23 -07:00
openopen2
44700d531a
[HUDI-344] Hudi Dataset Snapshot Exporter ( #1360 )
Co-authored-by: jason1993 <261049174@qq.com>
2020-03-10 09:17:51 +08:00
hongdd
f93e64fee4
[HUDI-681] Remove embeddedTimelineService from HoodieReadClient ( #1388 )
* [HUDI-681] Remove embeddedTimelineService from HoodieReadClient
2020-03-09 18:31:04 +08:00
lamber-ken
ccbf543607
[HUDI-654] Rename hudi-hive to hudi-hive-sync
2020-03-06 22:13:16 +08:00
yanghua
0dc8e493aa
Moving to 0.6.0-SNAPSHOT on master branch.
2020-03-01 15:08:30 +08:00
vinoth chandar
71170fafe7
[HUDI-554] Cleanup package structure in hudi-client ( #1346 )
- Just package, class moves and renames with the following intent
- `client` now has all the various client classes that do the transaction management
- `func` renamed to `execution` and some helpers moved to `client/utils`
- All compaction code under `io` now under `table/compact`
- Rollback code under `table/rollback` and in general all code for individual operations under `table`
- `exception`, `config`, `metrics` left untouched
- Moved the tests also accordingly
- Fixed some flaky tests
2020-02-27 08:05:58 -08:00