1
0
Commit Graph

16 Commits

Author SHA1 Message Date
Nishith Agarwal
0eaa21111a Re-factoring Compaction as first level API in WriteClient similar to upsert/insert 2018-03-07 16:16:39 -08:00
Nishith Agarwal
5405a6287b Introducing HoodieLogFormat V2 with versioning support
- HoodieLogFormat V2 has support for LogFormat evolution through versioning
			- LogVersion is associated with a LogBlock not a LogFile
			- Based on a version for a LogBlock, approporiate code path is executed
		- Implemented LazyReading of Hoodie Log Blocks with Memory / IO tradeoff
		- Implemented Reverse pointer to be able to traverse the log in reverse
		- Introduce new MAGIC for backwards compatibility with logs without versions
2018-03-06 21:14:11 -08:00
Nishith Agarwal
30049383f5 Small File Size correction handling for MOR table type 2018-02-07 11:01:10 -08:00
Nishith Agarwal
2116815261 Fixing Rollback for compaction/commit operation, added check for null commit
- Fallback to old way of rollback by listing all partitions
	- Added null check to ensure only partitions which are to be rolledback are considered
	- Added location (committime) to workload stat
	- Added checks in CompactedScanner to guard against task retries
	- Introduce new logic for rollback (bounded by instant_time and target_instant time)
        - Reversed logfiles order
2018-02-06 16:55:23 -08:00
Jian Xu
363e35bb0f Add finalizeWrite support for HoodieMergeHandle 2018-02-02 11:38:25 -08:00
Jian Xu
37f2cdd7e4 Incorporating code review feedback for finalizeWrite for COW #2 2018-02-02 11:38:25 -08:00
Jian Xu
2fe4fef625 Incorporating code review feedback for finalizeWrite for COW 2018-02-02 11:38:25 -08:00
Jian Xu
c874248f23 Add FinalizeWrite in HoodieCreateHandle for COW tables 2018-02-02 11:38:25 -08:00
Vinoth Chandar
0cd186c899 Multi FS Support
- Reviving PR 191, to make FileSystem creation off actual path
 - Streamline all filesystem access to HoodieTableMetaClient
 - Hadoop Conf from Spark Context serialized & passed to executor code too
 - Pick up env vars prefixed with HOODIE_ENV_ into Configuration object
 - Cleanup usage of FSUtils.getFS, piggybacking off HoodieTableMetaClient.getFS
 - Adding s3a to supported schemes & support escaping "." in env vars
 - Tests use HoodieTestUtils.getDefaultHadoopConf
2018-01-17 23:34:21 -08:00
Nishith Agarwal
44839b88c6 Removing compaction action type and associated compaction timeline operations, replace with commit action type 2018-01-09 09:56:15 -08:00
Nishith Agarwal
9b610f82c7 Separating out compaction() API 2017-11-14 22:56:29 -08:00
Vinoth Chandar
e45679f5e2 Reformatting code per Google Code Style all over 2017-11-12 23:19:02 -08:00
Nishith Agarwal
c7d63a7622 1) Separated rollback as a table operation 2) Implement rollback for MOR 2017-10-12 07:36:46 -07:00
Prasanna Rajaperumal
91b088f29f Implement Compaction policy abstraction. Implement LogSizeBased Bounded IO Compaction as the default strategy 2017-04-20 16:59:06 -07:00
Prasanna Rajaperumal
aee136777b Fixes needed to run merge-on-read testing on production scale data 2017-04-02 22:25:47 -07:00
prazanna
eb46e7c72b Implement Merge on Read Storage (#76)
1. Create HoodieTable abstraction for commits and fileSystemView
2. HoodieMergeOnReadTable created
3. View is now always obtained from the table and the correct view based on the table type is returned
2017-02-21 16:24:38 -08:00