gekath
db7311f85e
Writes relative paths to .commit files instead of absolute paths
...
Clean up code
Removed commented out code
Fixed merge conflict with master
2017-06-16 12:51:19 -07:00
Prasanna Rajaperumal
0ed3fac5e3
[maven-release-plugin] prepare for next development iteration
2017-06-16 11:03:17 -07:00
Prasanna Rajaperumal
45732e440c
[maven-release-plugin] prepare release hoodie-0.3.8
2017-06-16 10:59:58 -07:00
Kaushik Devarajaiah
521555c576
Parallelize file version deletes during clean and related tests
2017-06-15 18:20:42 -07:00
Prasanna Rajaperumal
dda28c0b4b
Rollback inflight commits as well when rolling back to savepoint
2017-06-14 11:03:27 -07:00
Prasanna Rajaperumal
db6150c5ef
Refactor hoodie-hive
2017-06-09 13:06:33 -07:00
Prasanna Rajaperumal
933cc8071f
[maven-release-plugin] prepare for next development iteration
2017-05-24 14:02:50 -07:00
Prasanna Rajaperumal
bebae06b5b
[maven-release-plugin] prepare release hoodie-0.3.7
2017-05-24 14:02:41 -07:00
Prasanna Rajaperumal
bae98efeee
Delete other instant files (.clean) as well during commit archival
2017-05-24 13:51:49 -07:00
Prasanna Rajaperumal
240c91241b
Implement HoodieLogFormat replacing Avro as the default log format
2017-05-23 08:35:11 -07:00
Nishith Agarwal
3c984447da
view scheme added
2017-05-22 12:27:40 -07:00
Prasanna Rajaperumal
70dd7a25ea
Clean should not create a .inflight file
2017-05-22 10:48:35 -07:00
Zeeshan Qureshi
43a55b09fd
Add GCS to supported filesystems
2017-05-18 10:30:34 -07:00
Vinoth Chandar
b4e787ce1d
Update docs
2017-05-01 21:48:27 -07:00
Vinoth Chandar
da17c5c607
Introduce getCommitsAndCompactionsTimeline() explicitly & adjust usage across code base
2017-05-01 21:48:27 -07:00
Vinoth Chandar
bae0528013
Cleanup calls to HoodieTimeline.compareTimeStamps
2017-05-01 21:48:27 -07:00
Vinoth Chandar
7b1446548f
Initial impl of HoodieRealtimeInputFormat
...
- Works end-to-end for flat schemas
- Schema evolution & hardening remains
- HoodieClientExample can now write mor tables as well
2017-05-01 21:48:27 -07:00
Vinoth Chandar
9f526396a0
Add support for merge_on_read tables to HoodieClientExample
2017-05-01 21:48:27 -07:00
Prasanna Rajaperumal
7bca428a0a
Test to check if properties set are properly propagated
2017-04-28 12:47:14 -07:00
Prasanna Rajaperumal
3f97bdcccf
Test to check if properties set are properly propagated
2017-04-28 12:40:58 -07:00
Prasanna Rajaperumal
c3258039f0
[maven-release-plugin] prepare for next development iteration
2017-04-27 11:00:56 -07:00
Prasanna Rajaperumal
de1bdad756
[maven-release-plugin] prepare release hoodie-0.3.6
2017-04-27 11:00:45 -07:00
Prasanna Rajaperumal
8974e11161
Make sure properties set in HoodieWriteConfig are propagated down to individual configs. Fix a race condition that lets InputFormat think the file size is 0 when it is not
2017-04-27 10:52:25 -07:00
Prasanna Rajaperumal
91b088f29f
Implement Compaction policy abstraction. Implement LogSizeBased Bounded IO Compaction as the default strategy
2017-04-20 16:59:06 -07:00
Vinoth Chandar
2b6322318c
CR feedback
2017-04-03 18:28:01 -07:00
Vinoth Chandar
e0fc4ec38e
Documentation update + helper method for WriteConfig builder
2017-04-03 18:28:01 -07:00
Vinoth Chandar
dce35ff0d7
Adding a config to control whether date partitioning can be assumed
...
- false by default
- CAUTION: If you have existing tables without partition metadata, you need to set this to "true"
2017-04-03 18:28:01 -07:00
Vinoth Chandar
f9fd16069d
FSUtils.getAllPartitionsPaths() works based on .hoodie_partition_metadata
...
- clean/rollback/write paths covered by existing tests
- Snapshot copier fixed to copy metadata file also, and test fixed
- Existing tables need to be repaired by addition of metadata, before this can be rolled out
2017-04-03 18:28:01 -07:00
Vinoth Chandar
3129770fd0
Create .hoodie_partition_metadata in each partition, linking back to basepath
...
- Concurrency handled via taskID, failure recovery handled via renames
- Falls back to search 3 levels up
- CLI tool has command to add this to existing tables
2017-04-03 18:28:01 -07:00
Prasanna Rajaperumal
1e802ad4f2
Move HoodieAvroReader to hoodie-common, it will be used for compaction and in the record reader
2017-04-03 13:58:35 -07:00
Prasanna Rajaperumal
aee136777b
Fixes needed to run merge-on-read testing on production scale data
2017-04-02 22:25:47 -07:00
Prasanna Rajaperumal
57ab7a2405
[maven-release-plugin] prepare for next development iteration
2017-03-31 14:58:55 -07:00
Prasanna Rajaperumal
803c635098
[maven-release-plugin] prepare release hoodie-0.3.5
2017-03-31 14:58:46 -07:00
Prasanna Rajaperumal
f4bb44c1b1
Update snapshot version to 0.3.5-SNAPSHOT
2017-03-31 14:54:54 -07:00
Yash Sharma
bca7e7dae4
Improve documentation
2017-03-28 05:08:54 -07:00
Yash Sharma
d6f94b998d
Hoodie operability with S3
2017-03-28 05:08:54 -07:00
prazanna
0e3f635adb
remove hardcoding of autoClean
2017-03-23 15:54:26 -07:00
Zeeshan Qureshi
a94f3a638e
Pass table path as argument to HoodieClientExample
2017-03-23 08:12:20 -07:00
fishie9
b7047ab4fb
Pass in String StorageLevel for WriteStatus ( #113 )
2017-03-23 04:31:30 -07:00
prazanna
f1b7afad21
Add config for index parallelism and make clean public ( #109 )
...
* Add config for index parallelism and make clean public
* Review comments on clean api modification
2017-03-21 17:36:46 -07:00
ovj
21898907c1
tool for importing hive tables (in parquet format) into hoodie dataset ( #89 )
...
* tool for importing hive tables (in parquet format) into hoodie dataset
* review fixes
* review fixes
* review fixes
2017-03-21 14:42:13 -07:00
prazanna
d835710c51
Metadata timeline marks an already complete instant as complete again ( #98 )
2017-03-17 12:42:26 -07:00
Prasanna Rajaperumal
d83b671ada
Implement Savepoints and required metadata timeline - Part 2
2017-03-13 23:09:29 -07:00
prazanna
6f36e1eaaf
Implement Savepoints and required metadata timeline ( #86 )
...
- Introduce avro to save clean metadata with details about the last commit that was retained
- Save rollback metadata in the meta timeline
- Create savepoint metadata and add API to createSavepoint, deleteSavepoint and rollbackToSavepoint
- Savepointed commit should not be rolled back, cleaned, or archived
- introduce cli commands to show, create and rollback to savepoints
- Write unit tests to test savepoints and rollbackToSavepoints
2017-03-13 15:12:03 -07:00
vinoth chandar
69d3950a32
Revamped Deltastreamer ( #93 )
...
* Add analytics to site
* Fix ugly favicon
* New & Improved HoodieDeltaStreamer
- Can incrementally consume from HDFS or Kafka, with exactly-once semantics!
- Supports Json/Avro data, Source can also do custom things
- Source is totally pluggable, via reflection
- Key generation is pluggable, currently added SimpleKeyGenerator
- Schema provider is pluggable, currently file-based schemas
- Configurable field to break ties during preCombine
- Finally, can also plug in the HoodieRecordPayload, to get other merge types than overwriting
- Handles efficient avro serialization in Spark
Pending :
- Rewriting of HiveIncrPullSource
- Hive sync via hoodie-hive
- Cleanup & tests
* Minor fixes from master rebase
* Implementation of HiveIncrPullSource
- Copies commit by commit from source to target
* Adding TimestampBasedKeyGenerator
- Supports unix time & date strings
2017-03-13 12:41:29 -07:00
siddharthagunda
348a48aa80
Add delete support to Hoodie ( #85 )
2017-03-04 01:33:49 -08:00
vinoth chandar
116a78094f
Cleanup code based on Java8 Lambdas ( #84 )
2017-02-27 15:52:13 -08:00
Prasanna Rajaperumal
1132f3533d
Merge and pull master commits
2017-02-21 17:53:28 -08:00
prazanna
eb46e7c72b
Implement Merge on Read Storage ( #76 )
...
1. Create HoodieTable abstraction for commits and fileSystemView
2. HoodieMergeOnReadTable created
3. View is now always obtained from the table and the correct view based on the table type is returned
2017-02-21 16:24:38 -08:00
prazanna
11d2fd3428
Introduce RealtimeTableView and Implement HoodieRealtimeTableCompactor ( #73 )
2017-02-21 16:24:18 -08:00