lanyuanxiaoyao/hudi

Author	SHA1	Message	Date
Vinoth Chandar	da17c5c607	Introduce getCommitsAndCompactionsTimeline() explicitly & adjust usage across code base	2017-05-01 21:48:27 -07:00
Vinoth Chandar	bae0528013	Cleanup calls to HoodieTimeline.compareTimeStamps	2017-05-01 21:48:27 -07:00
Vinoth Chandar	7b1446548f	Initial impl of HoodieRealtimeInputFormat - Works end-to-end for flat schemas - Schema evolution & hardening remains - HoodieClientExample can now write MOR tables as well	2017-05-01 21:48:27 -07:00
Vinoth Chandar	9f526396a0	Add support for merge_on_read tables to HoodieClientExample	2017-05-01 21:48:27 -07:00
Prasanna Rajaperumal	7bca428a0a	Test to check if properties set are properly propagated	2017-04-28 12:47:14 -07:00
Prasanna Rajaperumal	3f97bdcccf	Test to check if properties set are properly propagated	2017-04-28 12:40:58 -07:00
Prasanna Rajaperumal	c3258039f0	[maven-release-plugin] prepare for next development iteration	2017-04-27 11:00:56 -07:00
Prasanna Rajaperumal	de1bdad756	[maven-release-plugin] prepare release hoodie-0.3.6	2017-04-27 11:00:45 -07:00
Prasanna Rajaperumal	8974e11161	Make sure properties set in HoodieWriteConfig are propagated down to individual configs. Fix a race condition which lets InputFormat think file size is 0 when it is actually not	2017-04-27 10:52:25 -07:00
Prasanna Rajaperumal	91b088f29f	Implement Compaction policy abstraction. Implement LogSizeBased Bounded IO Compaction as the default strategy	2017-04-20 16:59:06 -07:00
Vinoth Chandar	82b211d2e6	Rebase with generic partition support	2017-04-03 21:27:49 -07:00
Vinoth Chandar	848814bece	Adding docs for deltastreamer, hivesync tool usage	2017-04-03 21:27:49 -07:00
Vinoth Chandar	542d622e49	Adding HiveSyncTool to sync hoodie dataset schema/partitions to Hive - Designed to be run by your workflow manager after hoodie upsert - Assumes jdbc connectivity via HiveServer2, which should work with all major distros	2017-04-03 21:27:49 -07:00
Vinoth Chandar	2b6322318c	CR feedback	2017-04-03 18:28:01 -07:00
Vinoth Chandar	e0fc4ec38e	Documentation update + helper method for WriteConfig builder	2017-04-03 18:28:01 -07:00
Vinoth Chandar	dce35ff0d7	Adding a config to control whether date partitioning can be assumed - false by default - CAUTION: If you have existing tables without partition metadata, you need to set this to "true"	2017-04-03 18:28:01 -07:00
Vinoth Chandar	f9fd16069d	FSUtils.getAllPartitionsPaths() works based on .hoodie_partition_metadata - clean/rollback/write paths covered by existing tests - Snapshot copier fixed to copy metadata file also, and test fixed - Existing tables need to be repaired by addition of metadata, before this can be rolled out	2017-04-03 18:28:01 -07:00
Vinoth Chandar	3129770fd0	Create .hoodie_partition_metadata in each partition, linking back to basepath - Concurrency handled via taskID, failure recovery handled via renames - Falls back to search 3 levels up - Cli tool has command to add this to existing tables	2017-04-03 18:28:01 -07:00
Prasanna Rajaperumal	1e802ad4f2	Move HoodieAvroReader to hoodie-common, it will be used for compaction and in the record reader	2017-04-03 13:58:35 -07:00
Prasanna Rajaperumal	aee136777b	Fixes needed to run merge-on-read testing on production scale data	2017-04-02 22:25:47 -07:00
Prasanna Rajaperumal	57ab7a2405	[maven-release-plugin] prepare for next development iteration	2017-03-31 14:58:55 -07:00
Prasanna Rajaperumal	803c635098	[maven-release-plugin] prepare release hoodie-0.3.5	2017-03-31 14:58:46 -07:00
Prasanna Rajaperumal	f4bb44c1b1	Update snapshot version to 0.3.5-SNAPSHOT	2017-03-31 14:54:54 -07:00
Prasanna Rajaperumal	77e54e78f8	Create the partition path if it does not exist when listing data files in a partition	2017-03-28 05:20:15 -07:00
Yash Sharma	e3b273e9fd	formatting for docs	2017-03-28 05:08:54 -07:00
Yash Sharma	bca7e7dae4	improve documentations	2017-03-28 05:08:54 -07:00
Yash Sharma	d6f94b998d	Hoodie operability with S3	2017-03-28 05:08:54 -07:00
prazanna	a7cd021f26	Update incremental pull query documentation	2017-03-23 16:20:54 -07:00
prazanna	0e3f635adb	remove hardcoding of autoClean	2017-03-23 15:54:26 -07:00
Zeeshan Qureshi	a94f3a638e	Pass table path as argument to HoodieClientExample	2017-03-23 08:12:20 -07:00
fishie9	b7047ab4fb	Pass in String StorageLevel for WriteStatus (#113)	2017-03-23 04:31:30 -07:00
ovj	b02910c588	few fixes to quick start document (#112)	2017-03-22 18:25:26 -07:00
prazanna	f1b7afad21	Add config for index parallelism and make clean public (#109) * Add config for index parallelism and make clean public * Review comments on clean api modification	2017-03-21 17:36:46 -07:00
ovj	21898907c1	tool for importing hive tables (in parquet format) into hoodie dataset (#89) * tool for importing hive tables (in parquet format) into hoodie dataset * review fixes * review fixes * review fixes	2017-03-21 14:42:13 -07:00
prazanna	d835710c51	Metadata timeline marks an already complete instant as complete again (#98)	2017-03-17 12:42:26 -07:00
Prasanna Rajaperumal	d83b671ada	Implement Savepoints and required metadata timeline - Part 2	2017-03-13 23:09:29 -07:00
prazanna	6f36e1eaaf	Implement Savepoints and required metadata timeline (#86) - Introduce avro to save clean metadata with details about the last commit that was retained - Save rollback metadata in the meta timeline - Create savepoint metadata and add API to createSavepoint, deleteSavepoint and rollbackToSavepoint - Savepointed commit should not be rolledback or cleaned or archived - introduce cli commands to show, create and rollback to savepoints - Write unit tests to test savepoints and rollbackToSavepoints	2017-03-13 15:12:03 -07:00
vinoth chandar	69d3950a32	Revamped Deltastreamer (#93) * Add analytics to site * Fix ugly favicon * New & Improved HoodieDeltaStreamer - Can incrementally consume from HDFS or Kafka, with exactly-once semantics! - Supports Json/Avro data, Source can also do custom things - Source is totally pluggable, via reflection - Key generation is pluggable, currently added SimpleKeyGenerator - Schema provider is pluggable, currently Filebased schemas - Configurable field to break ties during preCombine - Finally, can also plugin the HoodieRecordPayload, to get other merge types than overwriting - Handles efficient avro serialization in Spark Pending : - Rewriting of HiveIncrPullSource - Hive sync via hoodie-hive - Cleanup & tests * Minor fixes from master rebase * Implementation of HiveIncrPullSource - Copies commit by commit from source to target * Adding TimestampBasedKeyGenerator - Supports unix time & date strings	2017-03-13 12:41:29 -07:00
Vinoth Chandar	c3257b9680	Fix ugly favicon	2017-03-12 20:30:42 -07:00
Vinoth Chandar	b252633fab	Add analytics to site	2017-03-12 20:30:42 -07:00
prazanna	404726031d	Adding Siddhartha Gunda as a contributor for his contribution on the delete api	2017-03-04 01:36:21 -08:00
siddharthagunda	348a48aa80	Add delete support to Hoodie (#85)	2017-03-04 01:33:49 -08:00
Prasanna Rajaperumal	41e08018fc	Fixing minor documentation fixes	2017-03-02 11:42:04 -08:00
Prasanna Rajaperumal	d84aea3512	Fixing minor documentation fixes	2017-03-02 11:39:40 -08:00
prazanna	8a2a9ae764	Making minor documentation fixes	2017-03-02 11:35:09 -08:00
vinoth chandar	116a78094f	Cleanup code based on Java8 Lambdas (#84)	2017-02-27 15:52:13 -08:00
Wei Yan	c4fa585b27	Switch some info log to debug (#83) * Switch some info log to debug * fix a typo * remote HoodieTableMetadata file	2017-02-23 20:12:36 -08:00
Prasanna Rajaperumal	fe5c5e8021	Test Failure in Travis-ci	2017-02-21 20:25:01 -08:00
Prasanna Rajaperumal	1132f3533d	Merge and pull master commits	2017-02-21 17:53:28 -08:00
prazanna	eb46e7c72b	Implement Merge on Read Storage (#76) 1. Create HoodieTable abstraction for commits and fileSystemView 2. HoodieMergeOnReadTable created 3. View is now always obtained from the table and the correct view based on the table type is returned	2017-02-21 16:24:38 -08:00

141 Commits