64 Commits

Author SHA1 Message Date
lamber-ken
dbc9acd23a [HUDI-716] Exception: Not an Avro data file when running HoodieCleanClient.runClean (#1432) 2020-03-30 11:19:17 -07:00
Suneel Marthi
fa36082554 [HUDI-746] Reduce build warnings < 10 (#1465) 2020-03-30 11:46:52 +08:00
vinoth chandar
e057c27603 [HUDI-744] Restructure hudi-common and clean up files under util packages (#1462)
- Brings more order and cohesion to the classes in hudi-common
- Utils classes related to a particular concept (avro, timeline,...) are placed near to the package
- common.fs package now contains all the filesystem level classes including wrapper filesystem
- bloom.filter package renamed to just bloom
- config package contains classes that help store properties
- common.fs.inline package contains all the inline filesystem classes/impl
- common.table.timeline now consolidates all timeline related classes
- common.table.view consolidates all the classes related to filesystem view metadata
- common.table.timeline.versioning contains all classes related to versioning of timeline
- Fix few unit tests as a result
- Moved the test packages around to match the source file move
- Rename AvroUtils to TimelineMetadataUtils & minor fixes/typos
2020-03-29 10:58:49 -07:00
leesf
07c3c5d797 [HUDI-679] Make io package Spark free (#1460)
* [HUDI-679] Make io package Spark free
2020-03-29 16:54:00 +08:00
Suneel Marthi
8c3001363d HUDI-479: Eliminate or Minimize use of Guava if possible (#1159) 2020-03-28 03:11:32 -04:00
hongdd
cafc87041b [HUDI-697]Add unit test for ArchivedCommitsCommand (#1424) 2020-03-23 13:46:10 +08:00
Zhiyuan Zhao
0241b21f77 [HUDI-65] commitTime rename to instantTime (#1431) 2020-03-22 18:06:00 -07:00
hongdd
f1d7bb381d [HUDI-695]Add unit test for TableCommand (#1411) 2020-03-17 14:15:30 +08:00
hongdd
3ef9e885ca [HUDI-715] Fix duplicate name in TableCommand (#1410) 2020-03-16 17:19:57 +08:00
hongdd
55e6d34815 [HUDI-694]Add unit test for SparkEnvCommand (#1401)
* Add test for SparkEnvCommand
2020-03-16 11:52:40 +08:00
hongdd
0f892ef62c [HUDI-692] Add delete savepoint for cli (#1397)
* Add delete savepoint for cli
* Add check
* Move JavaSparkContext to try
2020-03-11 16:49:02 -07:00
satishkotha
7194514aff [HUDI-689] Change CLI command names to not have overlap (#1392) 2020-03-11 16:29:54 -07:00
Satish Kotha
3d3781810c [CLI] Add export to table 2020-03-06 08:53:23 -08:00
vinoth chandar
71170fafe7 [HUDI-554] Cleanup package structure in hudi-client (#1346)
- Just package, class moves and renames with the following intent
- `client` now has all the various client classes, that do the transaction management
- `func` renamed to `execution` and some helpers moved to `client/utils`
- All compaction code under `io` now under `table/compact`
- Rollback code under `table/rollback` and in general all code for individual operations under `table`
- `exception` `config`, `metrics` left untouched
- Moved the tests also accordingly
- Fixed some flaky tests
2020-02-27 08:05:58 -08:00
Suneel Marthi
078d4825d9 [HUDI-624]: Split some of the code from PR for HUDI-479 (#1344) 2020-02-21 14:22:21 +08:00
Satish Kotha
20ed2516d3 [HUDI-571] Add show archived compaction(s) to CLI 2020-02-14 10:58:28 -08:00
lamber-ken
01c868ab86 [HUDI-574] Fix CLI counts small file inserts as updates (#1321) 2020-02-13 22:20:58 +08:00
Satish Kotha
63b42166b1 CLI - add option to print additional commit metadata 2020-02-12 14:11:24 -08:00
Satish Kotha
462fd02556 [HUDI-571] Add 'commits show archived' command to CLI 2020-02-05 11:25:34 -08:00
lamber-ken
46842f4e92 [MINOR] Remove the declaration of thrown RuntimeException (#1305) 2020-02-05 23:23:20 +08:00
Balaji Varadarajan
923e2b4a1e [HUDI-535] Ensure Compaction Plan is always written in .aux folder to avoid 0.5.0/0.5.1 reader-writer compatibility issues (#1229) 2020-01-17 10:56:35 -08:00
vinoth chandar
baa6b5e889 [HUDI-537] Introduce repair overwrite-hoodie-props CLI command (#1241) 2020-01-17 01:21:44 -08:00
vinoth chandar
c2c0f6b13d [HUDI-509] Renaming code in sync with cWiki restructuring (#1212)
- Storage Type replaced with Table Type (remaining instances)
- View types replaced with query types;
- ReadOptimized view referred as Snapshot Query
- TableFileSystemView sub interfaces renamed to BaseFileOnly and Slice Views
- HoodieDataFile renamed to HoodieBaseFile
- Hive Sync tool will register RO tables for MOR with a `_ro` suffix
- Datasource/Deltastreamer options renamed accordingly
- Support fallback to old config values as well, so migration is painless
- Config for controlling _ro suffix addition
- Renaming DataFile to BaseFile across DTOs, HoodieFileSlice and AbstractTableFileSystemView
2020-01-16 23:58:47 -08:00
leesf
04afac977d [HUDI-248] CLI doesn't allow rolling back a Delta commit 2020-01-10 16:10:35 -08:00
vinoth chandar
9706f659db [HUDI-508] Standardizing on "Table" instead of "Dataset" across code (#1197)
- Docs were talking about storage types before, cWiki moved to "Table"
- Most of code already has HoodieTable, HoodieTableMetaClient - correct naming
- Replacing renaming use of dataset across code/comments
- Few usages in comments and use of Spark SQL DataSet remain unscathed
2020-01-07 12:52:32 -08:00
SteNicholas
a733f4ef72 [MINOR] Optimize hudi-cli module (#1136) 2020-01-04 09:05:50 -08:00
Pratyaksh Sharma
290278fc6c [HUDI-118]: Options provided for passing properties to Cleaner, compactor and importer commands 2020-01-03 16:00:57 -08:00
hongdongdong
ff1113f3b7 [HUDI-492]Fix show env all in hudi-cli 2020-01-03 15:50:20 -08:00
Suneel Marthi
add4b1e329 Merge pull request #1143 from BigDataArtisans/outoflimit
[MINOR] Fix out of limits for results
2019-12-31 02:08:54 -05:00
lamber-ken
619f501054 Clean up code 2019-12-31 13:59:26 +08:00
lamber-ken
ab6ae5cebb [HUDI-482] Fix missing @Override annotation on methods (#1156)
* [HUDI-482] Fix missing @Override annotation on methods
2019-12-31 11:44:56 +08:00
lamber-ken
36c0e6bae1 [MINOR] Fix out of limits for results 2019-12-27 01:16:24 +08:00
lamber-ken
bb90dedfc8 [MINOR] Fix out of limits for results 2019-12-27 01:13:47 +08:00
lamber-ken
842eabb27f [HUDI-470] Fix NPE when print result via hudi-cli (#1138) 2019-12-26 15:40:38 +08:00
Mathieu
3c811ec29b [MINOR] fix typos 2019-12-25 20:26:16 +08:00
hongdd
8affdf8bcb [HUDI-416] Improve hint information for cli (#1110) 2019-12-25 20:19:12 +08:00
lamber-ken
313fab5fd1 [HUDI-444] Refactor the codes based on scala codestyle ReturnChecker rule (#1121) 2019-12-24 07:05:54 +08:00
Sivabalan Narayanan
14881e99e0 [HUDI-106] Adding support for DynamicBloomFilter (#976)
- Introduced configs for bloom filter type
- Implemented dynamic bloom filter with configurable max number of keys
- BloomFilterFactory abstractions; Defaults to current simple bloom filter
2019-12-17 19:06:24 -08:00
Balaji Varadarajan
9a1f698eef [HUDI-308] Avoid Renames for tracking state transitions of all actions on dataset 2019-12-15 21:26:30 -08:00
hongdd
8963a68e6a [HUDI-398]Add spark env set/get for spark launcher (#1096) 2019-12-14 14:13:00 +08:00
lamber-ken
ba514cfea0 [MINOR] Remove redundant plus operator (#1097) 2019-12-12 05:42:05 +08:00
lamber-ken
24a09c775f [HUDI-387] Fix NPE when create savepoint via hudi-cli (#1085) 2019-12-10 08:00:53 -08:00
lamber-ken
d447e2d751 [checkstyle] Unify LOG form (#1092) 2019-12-10 19:23:38 +08:00
lamber-ken
70a1040998 [MINOR] Beautify the cli banner (#1089)
* Add one empty line
* replace Cli to CLI
* replace Hoodie to Apache Hudi
2019-12-09 13:24:42 -08:00
lamber-ken
b3e0ebbc4a [checkstyle] Add ConstantName java checkstyle rule (#1066)
* add SimplifyBooleanExpression java checkstyle rule
* collapse empty tags in scalastyle file
2019-12-04 18:59:15 +08:00
Gurudatt Kulkarni
b2d9638bea [HUDI-365] Refactor hudi-cli based on new ImportOrder code style rule (#1076) 2019-12-04 15:10:40 +08:00
Gurudatt Kulkarni
75132c139f [HUDI-357] Refactor hudi-cli based on new comment and code style rules (#1051) 2019-11-30 11:12:41 -08:00
hongdd
44823041a3 [HUDI-362] Adds a check for the existence of field (#1047) 2019-11-25 11:31:07 -08:00
谢磊
804e348d0e [HUDI-346] Set allowMultipleEmptyLines to false for EmptyLineSeparator rule (#1025) 2019-11-19 18:44:42 +08:00
Balaji Varadarajan
1032fc3e54 [HUDI-137] Hudi cleaning state changes should be consistent with compaction actions
Before this change, Cleaner performs cleaning of old file versions and then stores the deleted files in .clean files.
With this setup, we will not be able to track file deletions if a cleaner fails after deleting files but before writing .clean metadata.
This is fine for regular file-system view generation but Incremental timeline syncing relies on clean/commit/compaction metadata to keep a consistent file-system view.

Cleaner state transitions are now similar to those of compaction.

1. Requested : HoodieWriteClient.scheduleClean() selects the list of files that needs to be deleted and stores them in metadata
2. Inflight : HoodieWriteClient marks the state to be inflight before it starts deleting
3. Completed : HoodieWriteClient marks the state after completing the deletion according to the cleaner plan
2019-11-11 10:40:16 -08:00
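
The three cleaner states listed in the HUDI-137 commit message above (Requested, Inflight, Completed) can be sketched as a small state machine. This is only an illustrative model, not Hudi's code: `CleanState`, `CleanAction`, and the `delete_file` callback are hypothetical names, and a Python list stands in for the plan that Hudi stores in its timeline metadata.

```python
from enum import Enum


class CleanState(Enum):
    REQUESTED = "requested"
    INFLIGHT = "inflight"
    COMPLETED = "completed"


# Allowed forward transitions, mirroring the three steps in the commit message.
_NEXT = {
    CleanState.REQUESTED: CleanState.INFLIGHT,
    CleanState.INFLIGHT: CleanState.COMPLETED,
}


class CleanAction:
    """Hypothetical model of one clean action; not Hudi's actual classes."""

    def __init__(self, files_to_delete):
        # Requested: the plan (files to delete) is recorded before any deletion.
        self.plan = list(files_to_delete)
        self.deleted = []
        self.state = CleanState.REQUESTED

    def _advance(self, target):
        # Reject any transition that skips a state or moves backwards.
        if _NEXT.get(self.state) is not target:
            raise ValueError(
                f"illegal transition {self.state.name} -> {target.name}"
            )
        self.state = target

    def start(self):
        # Inflight: marked before the first file is deleted.
        self._advance(CleanState.INFLIGHT)

    def run(self, delete_file):
        # Deletes every file in the plan that has not been deleted yet,
        # then marks the action completed.
        if self.state is not CleanState.INFLIGHT:
            raise ValueError("clean must be inflight before deleting")
        for path in self.plan:
            if path not in self.deleted:
                delete_file(path)
                self.deleted.append(path)
        self._advance(CleanState.COMPLETED)
```

In this sketch the list of files is fixed when the action is requested, so a failure partway through `run` still leaves a record of what was slated for deletion. That is the property the commit message says was missing when the deleted files were written to `.clean` metadata only after deletion.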