Sivabalan Narayanan
29edf4b3b8
[HUDI-407] Adding Simple Index to Hoodie. ( #1402 )
...
This index finds the location by joining incoming records with records from base files.
2020-05-17 18:32:24 -07:00
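A rough sketch of the join described in the HUDI-407 commit above, assuming each record is identified by a (partition path, record key) pair and each location by an (instant time, file id) pair. The type aliases and `tag_locations` are hypothetical names for illustration, not Hudi classes or APIs; the actual index may differ in how it distributes and executes the join.

```python
from typing import Dict, Iterable, List, Optional, Tuple

# Hypothetical shapes for illustration only; these are not Hudi classes.
RecordKey = Tuple[str, str]   # (partition_path, record_key)
Location = Tuple[str, str]    # (instant_time, file_id)


def tag_locations(
    incoming: Iterable[RecordKey],
    base_file_keys: Iterable[Tuple[RecordKey, Location]],
) -> List[Tuple[RecordKey, Optional[Location]]]:
    """Left-join incoming keys against the keys read from base files.

    A key found in a base file is tagged with that file's location (an update);
    a key with no match is tagged with None (an insert).
    """
    existing: Dict[RecordKey, Location] = dict(base_file_keys)
    return [(key, existing.get(key)) for key in incoming]


tagged = tag_locations(
    incoming=[("2020/05/17", "k1"), ("2020/05/17", "k9")],
    base_file_keys=[(("2020/05/17", "k1"), ("20200517183224", "file-1"))],
)
```

Here the in-memory dictionary stands in for what would be a distributed join over much larger inputs.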
Balaji Varadarajan
3c9da2e5f0
[HUDI-895] Remove unnecessary listing .hoodie folder when using timeline server ( #1636 )
2020-05-17 18:18:53 -07:00
Mathieu
25a0080b2f
[HUDI-714] Add javadoc and comments to hudi write method link ( #1409 )
2020-05-16 08:36:51 -04:00
Alexander Filipchik
83796b3189
[HUDI-793] Adding proper default to hudi metadata fields and proper handling to rewrite routine ( #1513 )
...
* Adding proper default to hudi metadata fields and proper handling to rewrite routine
* Handle fields declared with a null default
Co-authored-by: Alex Filipchik <alex.filipchik@csscompany.com>
2020-05-13 18:04:38 -07:00
liujinhui
32ea4c70ff
[HUDI-869] Add support for alluxio ( #1608 )
2020-05-13 21:00:34 +08:00
Udit Mehrotra
d54b4b8a52
[HUDI-838] Support schema from HoodieCommitMetadata for HiveSync ( #1559 )
...
Co-authored-by: Mehrotra <uditme@amazon.com>
2020-05-07 16:33:09 -07:00
Alexander Filipchik
e783ab1749
[HUDI-784] Addressing issue with log reader on GCS ( #1516 )
Co-authored-by: Alex Filipchik <alex.filipchik@csscompany.com>
2020-05-07 13:05:32 -07:00
vinoth chandar
c4b71622b9
[MINOR] Reorder HoodieTimeline#compareTimestamp arguments for better readability ( #1575 )
...
- reads nicely as (instantTime1, GREATER_THAN_OR_EQUALS, instantTime2) etc
2020-04-30 09:19:39 -07:00
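To illustrate the argument order the commit above describes, here is a minimal sketch in which the predicate sits between the two instant times. `compare_timestamps` and the predicate constants are hypothetical stand-ins, not the real `HoodieTimeline#compareTimestamp` signature, and the sketch assumes instant times are fixed-width digit strings, so plain string comparison orders them chronologically.

```python
import operator
from typing import Callable

# Hypothetical stand-ins for the comparison predicates named in the commit.
GREATER_THAN_OR_EQUALS: Callable[[str, str], bool] = operator.ge
LESSER_THAN: Callable[[str, str], bool] = operator.lt


def compare_timestamps(
    instant_time1: str,
    predicate: Callable[[str, str], bool],
    instant_time2: str,
) -> bool:
    """Apply the predicate with the operands in reading order.

    Assumes both instant times are fixed-width digit strings, so that
    lexicographic order matches chronological order.
    """
    return predicate(instant_time1, instant_time2)


# Reads left to right: "is the first instant >= the second instant?"
is_newer = compare_timestamps(
    "20200430091939", GREATER_THAN_OR_EQUALS, "20200421124119"
)
```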
Prashant Wason
62bd3e7ded
[HUDI-757] Added hudi-cli command to export metadata of Instants.
...
Example:
hudi:db.table-> export instants --localFolder /tmp/ --limit 5 --actions clean,rollback,commit --desc false
2020-04-21 12:41:19 -07:00
n3nash
332072bc6d
[HUDI-371] Supporting hive combine input format for realtime tables ( #1503 )
2020-04-20 20:40:06 -07:00
Mathieu
2a2f31d919
[MINOR] Remove redundant code and fix typo in HoodieDefaultTimeline ( #1535 )
2020-04-21 09:40:22 +08:00
baobaoyeye
75523657a4
[MINOR] use Option and fix description in toString method ( #1527 )
...
* [MINOR] Fix some inelegant code, as a newcomer
2020-04-18 12:51:37 +08:00
Prashant Wason
19d29ac7d0
[HUDI-741] Added checks to validate Hoodie's schema evolution.
...
HUDI-specific validation of schema evolution should ensure that a newer schema can be used for the dataset, by checking that data written with the old schema can be read with the new schema.
Code changes:
1. Added a new config in HoodieWriteConfig to enable schema validation check (disabled by default)
2. Moved code that reads schema from base/log files into hudi-common from hudi-hive-sync
3. Added writerSchema to the extraMetadata of compaction commits on MOR tables. This matches what is already done for commits on COW tables.
Testing changes:
4. Extended TestHoodieClientBase to add insertBatch API which allows inserting a new batch of unique records into a HUDI table
5. Added a unit test to verify schema evolution for both COW and MOR tables.
6. Added unit tests for schema compatibility checks.
2020-04-15 23:34:59 -07:00
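The check described in the HUDI-741 commit above (data written with the old schema must be readable with the new schema) can be sketched in a much-simplified form. This is an assumption-laden illustration, not Hudi's or Avro's actual resolution logic: schemas are modeled as plain dictionaries, `can_read` is a hypothetical helper, and type promotion and nested records are ignored.

```python
from typing import Any, Dict

# Hypothetical schema model: field name -> {"type": ..., optional "default": ...}
Schema = Dict[str, Dict[str, Any]]


def can_read(writer_schema: Schema, reader_schema: Schema) -> bool:
    """Return True if data written with writer_schema is readable with reader_schema.

    Simplified rules (an assumption of this sketch):
    - a reader field missing from the writer schema must declare a default;
    - a field present in both must have the same type.
    """
    for name, spec in reader_schema.items():
        written = writer_schema.get(name)
        if written is None:
            if "default" not in spec:
                return False
        elif written["type"] != spec["type"]:
            return False
    return True


old_schema: Schema = {"id": {"type": "string"}}
new_with_default: Schema = {
    "id": {"type": "string"},
    "note": {"type": "string", "default": ""},
}
new_without_default: Schema = {
    "id": {"type": "string"},
    "note": {"type": "string"},
}

evolution_ok = can_read(old_schema, new_with_default)
evolution_rejected = not can_read(old_schema, new_without_default)
```

Adding a field with a default is accepted because old records can be filled in on read; adding one without a default is rejected because old records would have no value for it.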
vinoth chandar
661b0b3bab
[HUDI-761] Refactoring rollback and restore actions using the ActionExecutor abstraction ( #1492 )
...
- rollback() and restore() table level APIs introduced
- Restore is implemented by wrapping calls to rollback executor
- Existing tests transparently cover this, since it's just a refactor
2020-04-13 08:29:19 -07:00
Balaji Varadarajan
17bf930342
[HUDI-770] Organize upsert/insert API implementation under a single package ( #1495 )
2020-04-12 23:11:00 -07:00
Pratyaksh Sharma
6d7ca2cf7e
[HUDI-727]: Copy default values of fields if not present when rewriting incoming record with new schema ( #1427 )
2020-04-12 17:55:26 -07:00
Shen Hong
5d717a28f4
[HUDI-782] Add support of Aliyun object storage service. ( #1506 )
2020-04-12 10:06:30 +08:00
satishkotha
c0f96e0726
[HUDI-687] Stop incremental reader on RO table when there is a pending compaction ( #1396 )
2020-04-10 10:45:41 -07:00
Ramachandran Madtas Subramaniam
f5f34bb1c1
[HUDI-568] Improve unit test coverage
...
Classes improved:
* HoodieTableMetaClient
* RocksDBDAO
* HoodieRealtimeFileSplit
2020-04-09 10:15:34 -07:00
Zhiyuan Zhao
b5d093a21b
[MINOR] Clear up the redundant comment. ( #1489 )
2020-04-06 16:31:54 +08:00
vinoth chandar
eaf6cc2d90
[HUDI-756] Organize Cleaning Action execution into a single package in hudi-client ( #1485 )
...
- Introduced a thin abstraction ActionExecutor, that all actions will implement
- Pulled cleaning code from table, writeclient into a single package
- CleanHelper is now CleanPlanner, HoodieCleanClient is no longer around
- Minor refactor of HoodieTable factory method
- HoodieTable.create() methods with and without metaclient passed in
- HoodieTable constructor now does not do a redundant instantiation
- Fixed existing unit tests to work at the HoodieWriteClient level
2020-04-04 00:07:34 -07:00
Shaofeng Shi
78b3194e82
[HUDI-751] Fix some coding issues reported by FindBugs ( #1470 )
2020-03-31 21:19:32 +08:00
lamber-ken
dbc9acd23a
[HUDI-716] Exception: Not an Avro data file when running HoodieCleanClient.runClean ( #1432 )
2020-03-30 11:19:17 -07:00
Suneel Marthi
fa36082554
[HUDI-746] Reduce build warnings < 10 ( #1465 )
2020-03-30 11:46:52 +08:00
vinoth chandar
e057c27603
[HUDI-744] Restructure hudi-common and clean up files under util packages ( #1462 )
...
- Brings more order and cohesion to the classes in hudi-common
- Utils classes related to a particular concept (avro, timeline,...) are placed close to the corresponding package
- common.fs package now contains all the filesystem level classes including wrapper filesystem
- bloom.filter package renamed to just bloom
- config package contains classes that help store properties
- common.fs.inline package contains all the inline filesystem classes/impl
- common.table.timeline now consolidates all timeline related classes
- common.table.view consolidates all the classes related to filesystem view metadata
- common.table.timeline.versioning contains all classes related to versioning of timeline
- Fixed a few unit tests as a result
- Moved the test packages around to match the source file move
- Rename AvroUtils to TimelineMetadataUtils & minor fixes/typos
2020-03-29 10:58:49 -07:00
Sivabalan Narayanan
ac73bdcdc3
[HUDI-430] Adding InlineFileSystem to support embedding any file format as an InlineFile ( #1176 )
...
* Adding InlineFileSystem to support embedding any file format (parquet, hfile, etc). Supports reading the embedded file using respective readers.
2020-03-28 12:13:35 -04:00
Suneel Marthi
04449f33fe
[HUDI-743]: Remove FileIOUtils.close() ( #1461 )
2020-03-28 18:03:15 +08:00
Suneel Marthi
8c3001363d
HUDI-479: Eliminate or Minimize use of Guava if possible ( #1159 )
2020-03-28 03:11:32 -04:00
Zhiyuan Zhao
0241b21f77
[HUDI-65] commitTime rename to instantTime ( #1431 )
2020-03-22 18:06:00 -07:00
Zhiyuan Zhao
14e0c95206
[HUDI-400] Check upgrade from old plan to new plan for compaction ( #1422 )
...
* Fix NPE when DataFile is null
* Check upgrade from old plan to new plan
2020-03-20 15:13:17 +08:00
Suneel Marthi
99b7e9eb9e
[HUDI-629]: Replace Guava's Hashing with an equivalent in NumericUtils.java ( #1350 )
2020-03-13 20:28:05 -04:00
lamber-ken
170ee88457
[HUDI-553] Building/Running Hudi on higher java versions ( #1369 )
2020-03-07 01:27:40 -08:00
Ramachandran M S
9d46ce380a
[HUDI-409] Match header and footer block length to improve corrupted block detection ( #1332 )
2020-03-03 13:26:54 -08:00
hongdd
8306205d7a
[HUDI-332] Add operation type (insert/upsert/bulkinsert/delete) to HoodieCommitMetadata ( #1157 )
2020-03-03 10:10:29 -08:00
Ramachandran M S
b7f35be452
[HUDI-618] Adding unit tests for PriorityBasedFileSystemView ( #1345 )
2020-02-26 10:55:02 -08:00
lamber-ken
83c8ad5a38
[HUDI-625] Fixing performance issues around DiskBasedMap & kryo ( #1352 )
2020-02-24 22:40:37 -08:00
Suneel Marthi
078d4825d9
[HUDI-624]: Split some of the code from PR for HUDI-479 ( #1344 )
2020-02-21 14:22:21 +08:00
Nishith Agarwal
185ff646ad
Refactoring getter to avoid double extrametadata in json representation
2020-02-20 09:52:02 -08:00
Suneel Marthi
f9d2f66dc1
[HUDI-622]: Remove VisibleForTesting annotation and import from code ( #1343 )
2020-02-20 15:17:53 +08:00
Suneel Marthi
b8f9d0ec45
[HUDI-615]: Add some methods and test cases for StringUtils. ( #1338 )
2020-02-17 14:13:33 +08:00
Suneel Marthi
24e73816b2
[MINOR] Code Cleanup, remove redundant code ( #1337 )
2020-02-15 22:03:29 +08:00
lamber-ken
d2c872ede4
[HUDI-605] Avoid calculating the size of schema redundantly ( #1317 )
2020-02-12 19:40:52 +08:00
Balajee Nagasubramaniam
1fb0b001a3
[HUDI-570] - Improve test coverage for FSUtils.java
2020-02-05 14:25:24 -08:00
Satish Kotha
462fd02556
[HUDI-571] Add 'commits show archived' command to CLI
2020-02-05 11:25:34 -08:00
lamber-ken
46842f4e92
[MINOR] Remove the declaration of thrown RuntimeException ( #1305 )
2020-02-05 23:23:20 +08:00
Prashant Wason
4de0fcfcb5
[HUDI-566] Added new test cases for class HoodieTimeline, HoodieDefaultTimeline and HoodieActiveTimeline.
2020-02-04 18:55:04 -08:00
Suneel Marthi
594da28fbf
[HUDI-595] code cleanup, refactoring code out of PR# 1159 ( #1302 )
2020-02-04 21:52:03 +08:00
Suneel Marthi
5b7bb142dc
[HUDI-583] Code Cleanup, remove redundant code, and other changes ( #1237 )
2020-02-02 18:03:44 +08:00
Balajee Nagasubramaniam
6f34be1b8d
HUDI-117 Close file handle before throwing an exception due to append failure.
...
Add test cases to handle/verify stage failure scenarios.
2020-01-29 15:28:51 -08:00
Mathieu
b6e2993ceb
[MINOR] Update the javadoc of HoodieTableMetaClient#scanFiles ( #1263 )
2020-01-21 15:50:40 +08:00