Udit Mehrotra
404c7e82d9
[HUDI-884] Shade avro and parquet-avro in hudi-hive-sync-bundle ( #1618 )
...
Co-authored-by: Mehrotra <uditme@amazon.com>
2020-05-12 11:40:31 -07:00
Shen Hong
e8ffc6f0aa
[HUDI-881] Replace part of spark context by hadoop configuration in AbstractHoodieClient and HoodieReadClient ( #1620 )
2020-05-12 09:33:29 -07:00
Shen Hong
b54517aad0
[HUDI-886] Replace jsc.hadoopConfiguration by hadoop configuration in hudi-client testcase ( #1621 )
2020-05-12 08:51:31 -07:00
Shen Hong
295d00beea
[HUDI-880] Replace part of spark context by hadoop configuration in HoodieTable. ( #1614 )
2020-05-11 23:33:57 -07:00
liujinhui
5d37e66b7e
[MINOR] Fix HoodieNotSupportedException description in KafkaOffsetGen ( #1615 )
2020-05-11 23:14:43 +08:00
Shen Hong
6dac10115c
[HUDI-870] Remove spark context in ClientUtils and HoodieIndex ( #1609 )
2020-05-11 19:05:36 +08:00
Balaji Varadarajan
8d0e23173b
[HUDI-820] cleaner repair command should only inspect clean metadata files ( #1542 )
2020-05-11 09:25:54 +08:00
vinoth chandar
f92b9fdcc4
[MINOR] Fix hardcoding of ports in TestHoodieJmxMetrics ( #1606 )
2020-05-10 19:23:26 -04:00
Carm
fa6aba751d
[MINOR] fixed building IndexFileFilter with a wrong condition in HoodieGlobalBloomIndex class ( #1537 )
2020-05-10 09:45:07 +08:00
Udit Mehrotra
d54b4b8a52
[HUDI-838] Support schema from HoodieCommitMetadata for HiveSync ( #1559 )
...
Co-authored-by: Mehrotra <uditme@amazon.com>
2020-05-07 16:33:09 -07:00
Alexander Filipchik
e783ab1749
[HUDI-784] Addressing issue with log reader on GCS ( #1516 )
...
Co-authored-by: Alex Filipchik <alex.filipchik@csscompany.com>
2020-05-07 13:05:32 -07:00
hongdd
f921469afc
[HUDI-704] Add test for RepairsCommand ( #1554 )
2020-05-07 23:02:28 +08:00
Raymond Xu
366bb10d8c
[HUDI-812] Migrate hudi common tests to JUnit 5 ( #1590 )
...
* [HUDI-812] Migrate hudi-common tests to JUnit 5
2020-05-06 19:15:20 +08:00
bschell
e21441ad83
Add changes for presto mor queries ( #1578 )
...
Adds the necessary changes to Hudi to support Presto queries on the realtime view
of Hudi merge-on-read tables.
Co-authored-by: Brandon Scheller <bschelle@amazon.com>
2020-05-04 11:27:14 -07:00
AakashPradeep
5e0f5e5521
[HUDI-852] adding check for table name for Append Save mode ( #1580 )
...
* adding check for table name for Append Save mode
* adding existing table validation for delete and upsert operation
Co-authored-by: Aakash Pradeep <apradeep@twilio.com>
2020-05-03 23:09:17 -07:00
Raymond Xu
096f7f55b2
[HUDI-813] Migrate hudi-utilities tests to JUnit 5 ( #1589 )
2020-05-04 12:43:42 +08:00
Balaji Varadarajan
506447fd4f
[HUDI-850] Avoid unnecessary listings in incremental cleaning mode ( #1576 )
2020-05-01 21:37:21 -07:00
vinoth chandar
c4b71622b9
[MINOR] Reorder HoodieTimeline#compareTimestamp arguments for better readability ( #1575 )
...
- reads nicely as (instantTime1, GREATER_THAN_OR_EQUALS, instantTime2) etc
2020-04-30 09:19:39 -07:00
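The argument order described in the HoodieTimeline#compareTimestamp commit above can be illustrated with a minimal sketch. This is not the Hudi implementation: the class name, the `compareTimestamps` helper, and the two operator constants below are hypothetical stand-ins, and the sketch assumes instant times are fixed-width digit strings that compare lexicographically.

```java
import java.util.function.BiPredicate;

public class TimestampCompareSketch {
    // Hypothetical operators; only the name GREATER_THAN_OR_EQUALS appears in the commit message.
    public static final BiPredicate<String, String> GREATER_THAN_OR_EQUALS =
        (a, b) -> a.compareTo(b) >= 0;
    public static final BiPredicate<String, String> LESSER_THAN =
        (a, b) -> a.compareTo(b) < 0;

    // The operator sits between the two operands, so a call site reads like "t1 >= t2".
    public static boolean compareTimestamps(String instantTime1,
                                            BiPredicate<String, String> operator,
                                            String instantTime2) {
        return operator.test(instantTime1, instantTime2);
    }

    public static void main(String[] args) {
        // Assumption: both timestamps use the same yyyyMMddHHmmss width.
        boolean newer = compareTimestamps("20200430091939", GREATER_THAN_OR_EQUALS, "20200429184727");
        System.out.println(newer); // prints "true"
    }
}
```

Placing the operator in the middle keeps the call close to infix notation, which is the readability gain the commit message points to.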
hongdd
9059bce977
[HUDI-702] Add test for HoodieLogFileCommand ( #1522 )
2020-04-29 18:47:27 +08:00
Raymond Xu
69b16309c8
[HUDI-814] Migrate hudi-client tests to JUnit 5 ( #1570 )
2020-04-29 13:57:28 +08:00
Raymond Xu
06dae30297
[HUDI-810] Migrate ClientTestHarness to JUnit 5 ( #1553 )
2020-04-28 23:38:16 +08:00
satishkotha
6de9f5d9e5
[HUDI-819] Fix a bug with MergeOnReadLazyInsertIterable.
...
The variable declared here[1] masks the protected statuses variable. So although Hudi writes the data, it does not include the write status in the completed section, which can cause duplicates to be written (#1540 )
[1] https://github.com/apache/incubator-hudi/blob/master/hudi-client/src/main/java/org/apache/hudi/execution/MergeOnReadLazyInsertIterable.java#L53
2020-04-27 12:50:39 -07:00
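The masking problem described in the HUDI-819 commit above can be reproduced in isolation. The sketch below is a generic illustration, not the code from MergeOnReadLazyInsertIterable: all class and method names are hypothetical, and it assumes the masking variable was a redeclared field (a local variable with the same name would have the same effect).

```java
import java.util.ArrayList;
import java.util.List;

public class ShadowingSketch {
    public static class BaseIterable {
        // The list the base class reports as completed work.
        protected List<String> statuses = new ArrayList<>();

        public List<String> completed() {
            return statuses;
        }
    }

    public static class BuggyIterable extends BaseIterable {
        // Redeclaring the name hides BaseIterable.statuses inside this class.
        protected List<String> statuses = new ArrayList<>();

        public void write(String record) {
            statuses.add("written:" + record); // lands in the hiding field
        }
    }

    public static class FixedIterable extends BaseIterable {
        public void write(String record) {
            statuses.add("written:" + record); // lands in the inherited field
        }
    }

    public static void main(String[] args) {
        BuggyIterable buggy = new BuggyIterable();
        buggy.write("r1");
        System.out.println(buggy.completed().size()); // prints "0": the write is not reported

        FixedIterable fixed = new FixedIterable();
        fixed.write("r1");
        System.out.println(fixed.completed().size()); // prints "1"
    }
}
```

In the buggy variant the data is written but the base class never sees the status, which matches the symptom in the commit message: the write is missing from the completed section.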
vinoth chandar
19ca0b5629
[HUDI-785] Refactor compaction/savepoint execution based on ActionExecutor abstraction ( #1548 )
...
- Savepoint and compaction classes moved to table.action.* packages
- HoodieWriteClient#savepoint(...) returns void
- Renamed HoodieCommitArchiveLog -> HoodieTimelineArchiveLog
- Fixed tests to take into account the additional validation done
- Moved helper code into CompactHelpers and SavepointHelpers
2020-04-25 18:26:44 -07:00
dengziming
19cc15c098
[MINOR]: Fix cli docs for DeltaStreamer ( #1547 )
2020-04-22 11:37:17 -07:00
Alexander Filipchik
aea7c1657e
[HUDI-795] Handle auto-deleted empty aux folder ( #1515 )
...
Co-authored-by: Alex Filipchik <alex.filipchik@csscompany.com>
2020-04-22 09:47:32 -07:00
leesf
26684f5984
[HUDI-816] Fix MAX_MEMORY_FOR_MERGE_PROP and MAX_MEMORY_FOR_COMPACTION_PROP not working due to HUDI-678 ( #1536 )
2020-04-22 16:33:18 +08:00
Raymond Xu
6e15eebd81
[HUDI-809] Migrate CommonTestHarness to JUnit 5 ( #1530 )
2020-04-22 14:10:25 +08:00
Alexander Filipchik
2a56f82908
[HUDI-821] Fixing JCommander param parsing in deltastreamer ( #1525 )
...
Co-authored-by: Alex Filipchik <alex.filipchik@csscompany.com>
2020-04-21 20:12:34 -07:00
Prashant Wason
62bd3e7ded
[HUDI-757] Added hudi-cli command to export metadata of Instants.
...
Example:
hudi:db.table-> export instants --localFolder /tmp/ --limit 5 --actions clean,rollback,commit --desc false
2020-04-21 12:41:19 -07:00
hongdd
84dd9047d3
[HUDI-789] Adjust logic of upsert in HDFSParquetImporter ( #1511 )
2020-04-21 14:21:30 +08:00
n3nash
332072bc6d
[HUDI-371] Supporting hive combine input format for realtime tables ( #1503 )
2020-04-20 20:40:06 -07:00
Mathieu
2a2f31d919
[MINOR] Remove redundant code and fix typo in HoodieDefaultTimeline ( #1535 )
2020-04-21 09:40:22 +08:00
Dongwook
ddd105bb31
[HUDI-772] Make UserDefinedBulkInsertPartitioner configurable for DataSource ( #1500 )
2020-04-20 08:38:18 -07:00
lw0090
09fd6f64c5
[HUDI-800] Fix Metrics getReporter().close() throws NPE. ( #1529 )
2020-04-19 21:33:07 +08:00
baobaoyeye
75523657a4
[MINOR] use Option and fix description in toString method ( #1527 )
...
* [MINOR] fix some places are not elegant, as a newcomer
2020-04-18 12:51:37 +08:00
Alexander Filipchik
acb1ada2f7
[HUDI-799] Use appropriate FS when loading configs ( #1517 )
...
Co-authored-by: Alex Filipchik <alex.filipchik@csscompany.com>
2020-04-16 13:49:39 -07:00
Raymond Xu
acdc4a8d00
[HUDI-798] Migrate to Mockito Jupiter for JUnit 5 ( #1521 )
2020-04-16 16:07:32 +08:00
Prashant Wason
19d29ac7d0
[HUDI-741] Added checks to validate Hoodie's schema evolution.
...
HUDI specific validation of schema evolution should ensure that a newer schema can be used for the dataset by checking that the data written using the old schema can be read using the new schema.
Code changes:
1. Added a new config in HoodieWriteConfig to enable schema validation check (disabled by default)
2. Moved code that reads schema from base/log files into hudi-common from hudi-hive-sync
3. Added writerSchema to the extraMetadata of compaction commits in MOR tables. This is the same as for commits on COW tables.
Testing changes:
4. Extended TestHoodieClientBase to add insertBatch API which allows inserting a new batch of unique records into a HUDI table
5. Added a unit test to verify schema evolution for both COW and MOR tables.
6. Added unit tests for schema compatibility checks.
2020-04-15 23:34:59 -07:00
Iftach Schonbaum
9ca710cb02
[HUDI-777] Updated description for --target-table parameter ( #1519 )
2020-04-15 14:56:13 -07:00
Raymond Xu
d65efe659d
[HUDI-780] Migrate test cases to JUnit 5 ( #1504 )
2020-04-15 12:35:01 -07:00
Gary Li
14d4fea833
[HUDI-759] Integrate checkpoint provider with delta streamer ( #1486 )
2020-04-14 14:51:04 -07:00
hongdd
644c1cc8bd
[HUDI-698] Add unit test for CleansCommand ( #1449 )
2020-04-14 17:54:47 +08:00
vinoth chandar
661b0b3bab
[HUDI-761] Refactoring rollback and restore actions using the ActionExecutor abstraction ( #1492 )
...
- rollback() and restore() table level APIs introduced
- Restore is implemented by wrapping calls to rollback executor
- Existing tests transparently cover this, since it's just a refactor
2020-04-13 08:29:19 -07:00
Balaji Varadarajan
17bf930342
[HUDI-770] Organize upsert/insert API implementation under a single package ( #1495 )
2020-04-12 23:11:00 -07:00
Sivabalan Narayanan
447ba3bae6
[MINOR] Disabling flaky test in InlineFileSystem ( #1510 )
2020-04-12 19:38:56 -07:00
Pratyaksh Sharma
6d7ca2cf7e
[HUDI-727]: Copy default values of fields if not present when rewriting incoming record with new schema ( #1427 )
2020-04-12 17:55:26 -07:00
Shen Hong
5d717a28f4
[HUDI-782] Add support of Aliyun object storage service. ( #1506 )
2020-04-12 10:06:30 +08:00
hongdd
a464a2972e
[HUDI-700] Add unit test for FileSystemViewCommand ( #1490 )
2020-04-11 10:12:21 +08:00
satishkotha
c0f96e0726
[HUDI-687] Stop incremental reader on RO table when there is a pending compaction ( #1396 )
2020-04-10 10:45:41 -07:00
Bhavani Sudha Saktheeswaran
8c7cef3e50
[HUDI-738] Add validation to DeltaStreamer to fail fast when filterDupes is enabled on UPSERT mode. ( #1505 )
...
Summary:
This fix ensures that '--filter-dupes' is disabled for the UPSERT operation and fails fast if it is not. Otherwise, all updates would be dropped silently and only new records would be taken in.
2020-04-10 08:58:55 -07:00
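The fail-fast check described in the HUDI-738 commit above might look roughly like the following. This is a sketch under assumptions, not the DeltaStreamer code: the class, the `Operation` enum, the `validate` helper, and the error message are all hypothetical; only the rule itself (reject '--filter-dupes' together with UPSERT) comes from the commit message.

```java
public class FilterDupesCheckSketch {
    // Hypothetical operation types for illustration.
    public enum Operation { INSERT, UPSERT, BULK_INSERT }

    // Rejects the invalid combination up front, before any records are processed.
    public static void validate(Operation operation, boolean filterDupes) {
        if (operation == Operation.UPSERT && filterDupes) {
            throw new IllegalArgumentException(
                "'--filter-dupes' must be disabled when the operation is UPSERT");
        }
    }

    public static void main(String[] args) {
        validate(Operation.INSERT, true);   // allowed
        validate(Operation.UPSERT, false);  // allowed
        try {
            validate(Operation.UPSERT, true);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Validating at startup turns a silent data problem (updates dropped as duplicates) into an immediate, visible configuration error.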