154 Commits

Author SHA1 Message Date
Zhiyuan Zhao
b5d093a21b [MINOR] Clear up the redundant comment. (#1489) 2020-04-06 16:31:54 +08:00
vinoth chandar
eaf6cc2d90 [HUDI-756] Organize Cleaning Action execution into a single package in hudi-client (#1485)
- Introduced a thin abstraction ActionExecutor, that all actions will implement
- Pulled cleaning code from table, writeclient into a single package
- CleanHelper is now CleanPlanner, HoodieCleanClient is no longer around
- Minor refactor of HoodieTable factory method
- HoodieTable.create() methods with and without metaclient passed in
- HoodieTable constructor now does not do a redundant instantiation
- Fixed existing unit tests to work at the HoodieWriteClient level
2020-04-04 00:07:34 -07:00
Ramachandran Madtas Subramaniam
639ec20412 [HUDI-562] Enable testing at debug log level
This is to ensure that tests will execute all code paths, even the ones
written under DEBUG log levels. This will improve coverage as well as
ensure there are no surprises when DEBUG log level is enabled in
production.
2020-04-02 11:14:35 -07:00
Shaofeng Shi
78b3194e82 [HUDI-751] Fix some coding issues reported by FindBugs (#1470) 2020-03-31 21:19:32 +08:00
lamber-ken
dbc9acd23a [HUDI-716] Exception: Not an Avro data file when running HoodieCleanClient.runClean (#1432) 2020-03-30 11:19:17 -07:00
Suneel Marthi
fa36082554 [HUDI-746] Reduce build warnings < 10 (#1465) 2020-03-30 11:46:52 +08:00
vinoth chandar
e057c27603 [HUDI-744] Restructure hudi-common and clean up files under util packages (#1462)
- Brings more order and cohesion to the classes in hudi-common
 - Utils classes related to a particular concept (avro, timeline,...) are placed near to the package
 - common.fs package now contains all the filesystem level classes including wrapper filesystem
 - bloom.filter package renamed to just bloom
 - config package contains classes that help store properties
 - common.fs.inline package contains all the inline filesystem classes/impl
 - common.table.timeline now consolidates all timeline related classes
 - common.table.view consolidates all the classes related to filesystem view metadata
 - common.table.timeline.versioning contains all classes related to versioning of timeline
 - Fix few unit tests as a result
 - Moved the test packages around to match the source file move
 - Rename AvroUtils to TimelineMetadataUtils & minor fixes/typos
2020-03-29 10:58:49 -07:00
Sivabalan Narayanan
ac73bdcdc3 [HUDI-430] Adding InlineFileSystem to support embedding any file format as an InlineFile (#1176)
* Adding InlineFileSystem to support embedding any file format (parquet, hfile, etc). Supports reading the embedded file using respective readers.
2020-03-28 12:13:35 -04:00
Suneel Marthi
04449f33fe [HUDI-743]: Remove FileIOUtils.close() (#1461) 2020-03-28 18:03:15 +08:00
Suneel Marthi
8c3001363d HUDI-479: Eliminate or Minimize use of Guava if possible (#1159) 2020-03-28 03:11:32 -04:00
Zhiyuan Zhao
0241b21f77 [HUDI-65] commitTime rename to instantTime (#1431) 2020-03-22 18:06:00 -07:00
Zhiyuan Zhao
14e0c95206 [HUDI-400] Check upgrade from old plan to new plan for compaction (#1422)
* Fix NPE when DataFile is null
* Check from old plan upgrade to new plan
2020-03-20 15:13:17 +08:00
Suneel Marthi
99b7e9eb9e [HUDI-629]: Replace Guava's Hashing with an equivalent in NumericUtils.java (#1350)
2020-03-13 20:28:05 -04:00
Prashant Wason
cf0a4c19bc [HUDI-670] Added test cases for TestDiskBasedMap. (#1379)
* Update TestDiskBasedMap.java

Co-authored-by: Suneel Marthi <smarthi@apache.org>
2020-03-11 08:03:03 -04:00
lamber-ken
170ee88457 [HUDI-553] Building/Running Hudi on higher java versions (#1369) 2020-03-07 01:27:40 -08:00
Ramachandran M S
9d46ce380a [HUDI-409] Match header and footer block length to improve corrupted block detection (#1332) 2020-03-03 13:26:54 -08:00
hongdd
8306205d7a [HUDI-332]Add operation type (insert/upsert/bulkinsert/delete) to HoodieCommitMetadata (#1157)
2020-03-03 10:10:29 -08:00
vinoth chandar
71170fafe7 [HUDI-554] Cleanup package structure in hudi-client (#1346)
- Just package, class moves and renames with the following intent
 - `client` now has all the various client classes, that do the transaction management
 - `func` renamed to `execution` and some helpers moved to `client/utils`
 - All compaction code under `io` now under `table/compact`
 - Rollback code under `table/rollback` and in general all code for individual operations under `table`
 - `exception`, `config`, `metrics` left untouched
 - Moved the tests also accordingly
 - Fixed some flaky tests
2020-02-27 08:05:58 -08:00
Ramachandran M S
b7f35be452 [HUDI-618] Adding unit tests for PriorityBasedFileSystemView (#1345)
2020-02-26 10:55:02 -08:00
lamber-ken
83c8ad5a38 [HUDI-625] Fixing performance issues around DiskBasedMap & kryo (#1352) 2020-02-24 22:40:37 -08:00
Suneel Marthi
078d4825d9 [HUDI-624]: Split some of the code from PR for HUDI-479 (#1344) 2020-02-21 14:22:21 +08:00
Nishith Agarwal
185ff646ad Refactoring getter to avoid double extrametadata in json representation 2020-02-20 09:52:02 -08:00
Suneel Marthi
f9d2f66dc1 [HUDI-622]: Remove VisibleForTesting annotation and import from code (#1343)
2020-02-20 15:17:53 +08:00
Suneel Marthi
b8f9d0ec45 [HUDI-615]: Add some methods and test cases for StringUtils. (#1338) 2020-02-17 14:13:33 +08:00
Suneel Marthi
24e73816b2 [MINOR] Code Cleanup, remove redundant code (#1337) 2020-02-15 22:03:29 +08:00
lamber-ken
d2c872ede4 [HUDI-605] Avoid calculating the size of schema redundantly (#1317) 2020-02-12 19:40:52 +08:00
Balajee Nagasubramaniam
1fb0b001a3 [HUDI-570] - Improve test coverage for FSUtils.java 2020-02-05 14:25:24 -08:00
Satish Kotha
462fd02556 [HUDI-571] Add 'commits show archived' command to CLI 2020-02-05 11:25:34 -08:00
lamber-ken
46842f4e92 [MINOR] Remove the declaration of thrown RuntimeException (#1305) 2020-02-05 23:23:20 +08:00
Prashant Wason
4de0fcfcb5 [HUDI-566] Added new test cases for class HoodieTimeline, HoodieDefaultTimeline and HoodieActiveTimeline. 2020-02-04 18:55:04 -08:00
Suneel Marthi
594da28fbf [HUDI-595] code cleanup, refactoring code out of PR# 1159 (#1302) 2020-02-04 21:52:03 +08:00
Suneel Marthi
5b7bb142dc [HUDI-583] Code Cleanup, remove redundant code, and other changes (#1237) 2020-02-02 18:03:44 +08:00
Prashant Wason
f27c7a16c6 [HUDI-564] Added new test cases for HoodieLogFormat and HoodieLogFormatVersion. 2020-01-30 13:53:18 -08:00
Balajee Nagasubramaniam
6f34be1b8d HUDI-117 Close file handle before throwing an exception due to append failure.
Add test cases to handle/verify stage failure scenarios.
2020-01-29 15:28:51 -08:00
Mathieu
b6e2993ceb [MINOR] Update the javadoc of HoodieTableMetaClient#scanFiles (#1263)
2020-01-21 15:50:40 +08:00
Balaji Varadarajan
ba54a7e973 [HUDI-559] : Make the timeline layout version default to be null version 2020-01-20 00:02:55 -08:00
leesf
5471d8f0c2 [MINOR] Add toString method to TimelineLayoutVersion to make it more readable (#1244) 2020-01-17 20:22:55 -05:00
Balaji Varadarajan
923e2b4a1e [HUDI-535] Ensure Compaction Plan is always written in .aux folder to avoid 0.5.0/0.5.1 reader-writer compatibility issues (#1229) 2020-01-17 10:56:35 -08:00
vinoth chandar
c2c0f6b13d [HUDI-509] Renaming code in sync with cWiki restructuring (#1212)
- Storage Type replaced with Table Type (remaining instances)
 - View types replaced with query types;
 - ReadOptimized view referred as Snapshot Query
 - TableFileSystemView sub interfaces renamed to BaseFileOnly and Slice Views
 - HoodieDataFile renamed to HoodieBaseFile
 - Hive Sync tool will register RO tables for MOR with a `_ro` suffix
 - Datasource/Deltastreamer options renamed accordingly
 - Support fallback to old config values as well, so migration is painless
 - Config for controlling _ro suffix addition
 - Renaming DataFile to BaseFile across DTOs, HoodieFileSlice and AbstractTableFileSystemView
2020-01-16 23:58:47 -08:00
lamber-ken
8a3a50309b [MINOR] Fix missing @Override annotation on BufferedRandomAccessFile method (#1236) 2020-01-16 11:14:39 -08:00
Balajee Nagasubramaniam
dd09abb56d [HUDI-335] Improvements to DiskBasedMap used by ExternalSpillableMap, for write and random/sequential read paths, by introducing BufferedRandomAccessFile 2020-01-15 16:45:45 -08:00
lamber-ken
9b2944a9a2 [MINOR] Refactor unnecessary boxing inside TypedProperties code (#1227) 2020-01-14 19:27:53 -08:00
openopen2
a44c61b813 [HUDI-502] provide a custom time zone definition for TimestampBasedKeyGenerator (#1188) 2020-01-12 15:45:23 -08:00
lamber-ken
017ee8e661 [MINOR] Fix partition typo (#1209) 2020-01-12 09:35:55 +08:00
lamber-ken
e103165083 [CLEAN] replace utf-8 constant with StandardCharsets.UTF_8 2020-01-10 16:23:29 -08:00
Thinking
b95367d82a [HUDI-469] Fix: HoodieCommitMetadata only show first commit insert rows. 2020-01-10 16:17:11 -08:00
pratyakshsharma
3c90d252cc [HUDI-114]: added option to overwrite payload implementation in hoodie.properties file 2020-01-09 22:34:40 -08:00
vinoth chandar
9706f659db [HUDI-508] Standardizing on "Table" instead of "Dataset" across code (#1197)
- Docs were talking about storage types before, cWiki moved to "Table"
 - Most of code already has HoodieTable, HoodieTableMetaClient - correct naming
 - Replacing remaining use of dataset across code/comments
 - Few usages in comments and use of Spark SQL DataSet remain unscathed
2020-01-07 12:52:32 -08:00
Pratyaksh Sharma
dde21e7315 [HUDI-402]: code clean up in test cases 2019-12-31 11:10:49 -08:00
lamber-ken
ab6ae5cebb [HUDI-482] Fix missing @Override annotation on methods (#1156)
2019-12-31 11:44:56 +08:00