98 Commits

Author SHA1 Message Date
hongdd
57132f79bb [HUDI-705] Add unit test for RollbacksCommand (#1611) 2020-05-18 14:04:06 +08:00
hongdd
3a2fe13fcb [HUDI-701] Add unit test for HDFSParquetImportCommand (#1574) 2020-05-14 19:15:49 +08:00
Shen Hong
295d00beea [HUDI-880] Replace part of spark context by hadoop configuration in HoodieTable. (#1614) 2020-05-11 23:33:57 -07:00
Balaji Varadarajan
8d0e23173b [HUDI-820] cleaner repair command should only inspect clean metadata files (#1542) 2020-05-11 09:25:54 +08:00
hongdd
f921469afc [HUDI-704] Add test for RepairsCommand (#1554) 2020-05-07 23:02:28 +08:00
vinoth chandar
c4b71622b9 [MINOR] Reorder HoodieTimeline#compareTimestamp arguments for better readability (#1575)
- reads nicely as (instantTime1, GREATER_THAN_OR_EQUALS, instantTime2) etc
2020-04-30 09:19:39 -07:00
hongdd
9059bce977 [HUDI-702] Add test for HoodieLogFileCommand (#1522) 2020-04-29 18:47:27 +08:00
Raymond Xu
06dae30297 [HUDI-810] Migrate ClientTestHarness to JUnit 5 (#1553) 2020-04-28 23:38:16 +08:00
vinoth chandar
19ca0b5629 [HUDI-785] Refactor compaction/savepoint execution based on ActionExecutor abstraction (#1548)
- Savepoint and compaction classes moved to table.action.* packages
 - HoodieWriteClient#savepoint(...) returns void
 - Renamed HoodieCommitArchiveLog -> HoodieTimelineArchiveLog
 - Fixed tests to take into account the additional validation done
 - Moved helper code into CompactHelpers and SavepointHelpers
2020-04-25 18:26:44 -07:00
Prashant Wason
62bd3e7ded [HUDI-757] Added hudi-cli command to export metadata of Instants.
Example:
hudi:db.table-> export instants --localFolder /tmp/ --limit 5 --actions clean,rollback,commit --desc false
2020-04-21 12:41:19 -07:00
Raymond Xu
acdc4a8d00 [HUDI-798] Migrate to Mockito Jupiter for JUnit 5 (#1521) 2020-04-16 16:07:32 +08:00
Prashant Wason
19d29ac7d0 [HUDI-741] Added checks to validate Hoodie's schema evolution.
HUDI specific validation of schema evolution should ensure that a newer schema can be used for the dataset by checking that the data written using the old schema can be read using the new schema.

Code changes:

1. Added a new config in HoodieWriteConfig to enable schema validation check (disabled by default)
2. Moved code that reads schema from base/log files into hudi-common from hudi-hive-sync
3. Added writerSchema to the extraMetadata of compaction commits in MOR tables. This is the same as for commits on COW tables.

Testing changes:

4. Extended TestHoodieClientBase to add insertBatch API which allows inserting a new batch of unique records into a HUDI table
5. Added a unit test to verify schema evolution for both COW and MOR tables.
6. Added unit tests for schema compatibility checks.
2020-04-15 23:34:59 -07:00
Raymond Xu
d65efe659d [HUDI-780] Migrate test cases to Junit 5 (#1504) 2020-04-15 12:35:01 -07:00
hongdd
644c1cc8bd [HUDI-698] Add unit test for CleansCommand (#1449) 2020-04-14 17:54:47 +08:00
vinoth chandar
661b0b3bab [HUDI-761] Refactoring rollback and restore actions using the ActionExecutor abstraction (#1492)
- rollback() and restore() table level APIs introduced
- Restore is implemented by wrapping calls to rollback executor
- Existing tests transparently cover this, since it's just a refactor
2020-04-13 08:29:19 -07:00
hongdd
a464a2972e [HUDI-700] Add unit test for FileSystemViewCommand (#1490) 2020-04-11 10:12:21 +08:00
hongdd
4e5c8671ef [HUDI-740] Fix inability to specify the sparkMaster and clean up code in SparkUtil (#1452) 2020-04-08 21:33:15 +08:00
Ramachandran Madtas Subramaniam
639ec20412 [HUDI-562] Enable testing at debug log level
This is to ensure that tests will execute all code paths, even the ones
written under DEBUG log levels. This will improve coverage as well as
ensure there are no surprises when DEBUG log level is enabled in
production.
2020-04-02 11:14:35 -07:00
Shaofeng Shi
78b3194e82 [HUDI-751] Fix some coding issues reported by FindBugs (#1470) 2020-03-31 21:19:32 +08:00
lamber-ken
dbc9acd23a [HUDI-716] Exception: Not an Avro data file when running HoodieCleanClient.runClean (#1432) 2020-03-30 11:19:17 -07:00
Suneel Marthi
fa36082554 [HUDI-746] Reduce build warnings < 10 (#1465) 2020-03-30 11:46:52 +08:00
vinoth chandar
e057c27603 [HUDI-744] Restructure hudi-common and clean up files under util packages (#1462)
- Brings more order and cohesion to the classes in hudi-common
 - Utils classes related to a particular concept (avro, timeline,...) are placed near to the package
 - common.fs package now contains all the filesystem level classes including wrapper filesystem
 - bloom.filter package renamed to just bloom
 - config package contains classes that help store properties
 - common.fs.inline package contains all the inline filesystem classes/impl
 - common.table.timeline now consolidates all timeline related classes
 - common.table.view consolidates all the classes related to filesystem view metadata
 - common.table.timeline.versioning contains all classes related to versioning of timeline
 - Fixed a few unit tests as a result
 - Moved the test packages around to match the source file move
 - Rename AvroUtils to TimelineMetadataUtils & minor fixes/typos
2020-03-29 10:58:49 -07:00
leesf
07c3c5d797 [HUDI-679] Make io package Spark free (#1460)
* [HUDI-679] Make io package Spark free
2020-03-29 16:54:00 +08:00
Suneel Marthi
8c3001363d HUDI-479: Eliminate or Minimize use of Guava if possible (#1159) 2020-03-28 03:11:32 -04:00
hongdd
cafc87041b [HUDI-697] Add unit test for ArchivedCommitsCommand (#1424) 2020-03-23 13:46:10 +08:00
Zhiyuan Zhao
0241b21f77 [HUDI-65] commitTime rename to instantTime (#1431) 2020-03-22 18:06:00 -07:00
hongdd
f1d7bb381d [HUDI-695] Add unit test for TableCommand (#1411) 2020-03-17 14:15:30 +08:00
hongdd
3ef9e885ca [HUDI-715] Fix duplicate name in TableCommand (#1410) 2020-03-16 17:19:57 +08:00
hongdd
55e6d34815 [HUDI-694] Add unit test for SparkEnvCommand (#1401)
* Add test for SparkEnvCommand
2020-03-16 11:52:40 +08:00
hongdd
0f892ef62c [HUDI-692] Add delete savepoint for cli (#1397)
* Add delete savepoint for cli
* Add check
* Move JavaSparkContext to try
2020-03-11 16:49:02 -07:00
satishkotha
7194514aff [HUDI-689] Change CLI command names to not have overlap (#1392) 2020-03-11 16:29:54 -07:00
lamber-ken
170ee88457 [HUDI-553] Building/Running Hudi on higher java versions (#1369) 2020-03-07 01:27:40 -08:00
Satish Kotha
3d3781810c [CLI] Add export to table 2020-03-06 08:53:23 -08:00
lamber-ken
ccbf543607 [HUDI-654] Rename hudi-hive to hudi-hive-sync 2020-03-06 22:13:16 +08:00
yanghua
0dc8e493aa Moving to 0.6.0-SNAPSHOT on master branch. 2020-03-01 15:08:30 +08:00
vinoth chandar
71170fafe7 [HUDI-554] Cleanup package structure in hudi-client (#1346)
- Just package, class moves and renames with the following intent
 - `client` now has all the various client classes, that do the transaction management
 - `func` renamed to `execution` and some helpers moved to `client/utils`
 - All compaction code under `io` now under `table/compact`
 - Rollback code under `table/rollback` and in general all code for individual operations under `table`
 - `exception` `config`, `metrics` left untouched
 - Moved the tests also accordingly
 - Fixed some flaky tests
2020-02-27 08:05:58 -08:00
lamber-ken
11fb2c2614 [HUDI-580] Fix incorrect license header in files 2020-02-25 08:54:26 -08:00
Suneel Marthi
078d4825d9 [HUDI-624]: Split some of the code from PR for HUDI-479 (#1344) 2020-02-21 14:22:21 +08:00
Satish Kotha
20ed2516d3 [HUDI-571] Add show archived compaction(s) to CLI 2020-02-14 10:58:28 -08:00
lamber-ken
01c868ab86 [HUDI-574] Fix CLI counts small file inserts as updates (#1321) 2020-02-13 22:20:58 +08:00
Satish Kotha
63b42166b1 CLI - add option to print additional commit metadata 2020-02-12 14:11:24 -08:00
Satish Kotha
462fd02556 [HUDI-571] Add 'commits show archived' command to CLI 2020-02-05 11:25:34 -08:00
lamber-ken
46842f4e92 [MINOR] Remove the declaration of thrown RuntimeException (#1305) 2020-02-05 23:23:20 +08:00
leesf
6e59c1c777 Moving to 0.5.2-SNAPSHOT on master branch. 2020-01-20 10:51:33 -08:00
wenningd
292c1e2ff4 [HUDI-238] Make Hudi support Scala 2.12 (#1226)
* [HUDI-238] Rename scala related artifactId & add maven profile to support Scala 2.12
2020-01-17 14:02:21 -08:00
Balaji Varadarajan
923e2b4a1e [HUDI-535] Ensure Compaction Plan is always written in .aux folder to avoid 0.5.0/0.5.1 reader-writer compatibility issues (#1229) 2020-01-17 10:56:35 -08:00
Prashant Wason
0a07752dc0 [HUDI-527] scalastyle-maven-plugin moved to pluginManagement as it is only used in hoodie-spark and hoodie-cli modules.
This fixes compile warnings as well as unnecessary plugin invocation for most of the modules which do not have scala code.
2020-01-17 10:46:10 -08:00
vinoth chandar
baa6b5e889 [HUDI-537] Introduce repair overwrite-hoodie-props CLI command (#1241) 2020-01-17 01:21:44 -08:00
vinoth chandar
c2c0f6b13d [HUDI-509] Renaming code in sync with cWiki restructuring (#1212)
- Storage Type replaced with Table Type (remaining instances)
 - View types replaced with query types;
 - ReadOptimized view referred as Snapshot Query
 - TableFileSystemView sub interfaces renamed to BaseFileOnly and Slice Views
 - HoodieDataFile renamed to HoodieBaseFile
 - Hive Sync tool will register RO tables for MOR with a `_ro` suffix
 - Datasource/Deltastreamer options renamed accordingly
 - Support fallback to old config values as well, so migration is painless
 - Config for controlling _ro suffix addition
 - Renaming DataFile to BaseFile across DTOs, HoodieFileSlice and AbstractTableFileSystemView
2020-01-16 23:58:47 -08:00
leesf
04afac977d [HUDI-248] CLI doesn't allow rolling back a Delta commit 2020-01-10 16:10:35 -08:00