lanyuanxiaoyao/hudi - hudi - Gitea: Git with a cup of tea

Author	SHA1	Message	Date
vinothchandar	7ba842c0fe	[maven-release-plugin] prepare for next development iteration	2018-09-28 11:27:00 +05:30
vinothchandar	5847b61f44	[maven-release-plugin] prepare release hoodie-0.4.4	2018-09-28 11:26:15 +05:30
vinothchandar	9ca6f91e97	Perform consistency checks during write finalize - Check to ensure written files are listable on storage - Docs reflected to capture how this helps with s3 storage - Unit tests added, corrections to existing tests - Fix DeltaStreamer to manage archived commits in a separate folder	2018-09-28 08:04:41 +05:30
Balaji Varadarajan	4c74dd4cad	Travis CI tests needs to be run in quieter mode (WARN log level) to avoid max log-size errors	2018-09-26 21:10:20 +05:30
Yishuang Lu	faf93b6340	Fix the name of avro schema file in Test Fixed the name of avro schema file in Test Signed-off-by: Yishuang Lu <luystu@gmail.com>	2018-09-24 21:58:34 +05:30
Vinoth Chandar	bd5af89f12	[maven-release-plugin] rollback the release of hoodie-0.4.4	2018-09-13 15:01:53 +05:30
Vinoth Chandar	d1cc864a43	[maven-release-plugin] prepare for next development iteration	2018-09-12 23:59:47 +05:30
Vinoth Chandar	b748bc836d	[maven-release-plugin] prepare release hoodie-0.4.4	2018-09-12 23:59:34 +05:30
Vinoth Chandar	a5359662be	Moving depedencies off cdh to apache + Hive2 support - Tests redone in the process - Main changes are to RealtimeRecordReader and how it treats maps/arrays - Make hive sync work with Hive 1/2 and CDH environments - Fixes to make corner cases for Hive queries - Spark Hive integration - Working version across Apache and CDH versions - Known Issue - https://github.com/uber/hudi/issues/439	2018-09-11 11:03:30 +05:30
Vinoth Chandar	d58ddbd999	Reworking the deltastreamer tool - Standardize version of jackson - DFSPropertiesConfiguration replaces usage of commons PropertiesConfiguration - Remove dependency on ConstructorUtils - Throw error if ordering value is not present, during key generation - Switch to shade plugin for hoodie-utilities - Added support for consumption for Confluent avro kafka serdes - Support for Confluent schema registry - KafkaSource now deals with skews nicely, by doing round robin allocation of source limit across partitions - Added support for BULK_INSERT operations as well - Pass in the payload class config properly into HoodieWriteClient - Fix documentation based on new usage - Adding tests on deltastreamer, sources and all new util classes.	2018-09-08 10:24:32 +08:00
Nishith Agarwal	324de298bc	Removing dependency on apache-commons lang 3, adding necessary classes as needed	2018-09-06 08:26:48 +08:00
Saravanan Elumalai	2eaa42abde	Updated jcommander version to fix NPE in HoodieDeltaStreamer tool	2018-08-31 07:28:13 -07:00
Vinoth Chandar	89cd6b0726	[maven-release-plugin] prepare for next development iteration	2018-08-22 21:30:05 -07:00
Vinoth Chandar	8d305c5a86	[maven-release-plugin] prepare release hoodie-0.4.3	2018-08-22 21:29:53 -07:00
Vinoth Chandar	34827d50e1	[maven-release-plugin] prepare for next development iteration	2018-06-11 08:59:13 -07:00
Vinoth Chandar	43ef385730	[maven-release-plugin] prepare release hoodie-0.4.2	2018-06-11 08:59:02 -07:00
Balaji Varadarajan	788e4f2d2e	CodeStyle formatting to conform to basic Checkstyle rules. The code-style rules follow google style with some changes: 1. Increase line length from 100 to 120 2. Disable JavaDoc related checkstyles as this needs more manual work. Both source and test code are checked for code-style	2018-03-30 11:09:40 -07:00
Vinoth Chandar	73534d467f	[maven-release-plugin] prepare for next development iteration	2018-03-07 21:04:10 -08:00
Vinoth Chandar	f2e5c6f9f8	[maven-release-plugin] prepare release hoodie-0.4.1	2018-03-07 21:04:00 -08:00
Vinoth Chandar	e45679f5e2	Reformatting code per Google Code Style all over	2017-11-12 23:19:02 -08:00
Vinoth Chandar	e1fe3ab937	[maven-release-plugin] prepare for next development iteration	2017-10-02 22:42:54 -07:00
Vinoth Chandar	50139fe904	[maven-release-plugin] prepare release hoodie-0.4.0	2017-10-02 22:42:32 -07:00
Vinoth Chandar	64e0573aca	Adding hoodie-spark to support Spark Datasource for Hoodie - Write with COW/MOR paths work fully - Read with RO view works on both storages* - Incremental view supported on COW - Refactored out HoodieReadClient methods, to just contain key based access - HoodieDataSourceHelpers class can be now used to construct inputs to datasource - Tests in hoodie-client using new helpers and mechanisms - Basic tests around save modes & insert/upserts (more to follow) - Bumped up scala to 2.11, since 2.10 is deprecated & complains with scalatest - Updated documentation to describe usage - New sample app written using the DataSource API	2017-10-02 20:44:53 -07:00
Prasanna Rajaperumal	7d3963b4ab	Pushing master to 0.4.0 as we continue to make minor releases over 0.3.8 (MVP for MOR)	2017-06-30 11:41:23 -07:00
Nishith Agarwal	3eba812a1b	[maven-release-plugin] prepare for next development iteration	2017-06-30 11:17:07 -07:00
Nishith Agarwal	06d44daea3	[maven-release-plugin] prepare release hoodie-0.3.9	2017-06-30 11:16:58 -07:00
prazanna	b0a2a23372	Adding Nishith to Contributors list	2017-06-20 15:48:43 -07:00
prazanna	649475c5cb	Adding Kaushik to contributors list	2017-06-20 15:47:05 -07:00
prazanna	7ef76a4de0	Adding Kathy Ge to the contributors list	2017-06-16 12:52:54 -07:00
Prasanna Rajaperumal	0ed3fac5e3	[maven-release-plugin] prepare for next development iteration	2017-06-16 11:03:17 -07:00
Prasanna Rajaperumal	45732e440c	[maven-release-plugin] prepare release hoodie-0.3.8	2017-06-16 10:59:58 -07:00
Prasanna Rajaperumal	e44f9b889b	Added CHANGELOG.md and updated community contributions guideline	2017-06-16 10:48:37 -07:00
Prasanna Rajaperumal	db6150c5ef	Refactor hoodie-hive	2017-06-09 13:06:33 -07:00
Prasanna Rajaperumal	933cc8071f	[maven-release-plugin] prepare for next development iteration	2017-05-24 14:02:50 -07:00
Prasanna Rajaperumal	bebae06b5b	[maven-release-plugin] prepare release hoodie-0.3.7	2017-05-24 14:02:41 -07:00
prazanna	e1d13f2bc8	https://repository.cloudera.com/artifactory/repo/ has been changed to https://repository.cloudera.com/artifactory/public/	2017-05-23 12:05:01 -07:00
Vinoth Chandar	7014670795	Update contributor list	2017-05-18 10:48:42 -07:00
Vinoth Chandar	da17c5c607	Introduce getCommitsAndCompactionsTimeline() explicitly & adjust usage across code base	2017-05-01 21:48:27 -07:00
Prasanna Rajaperumal	c3258039f0	[maven-release-plugin] prepare for next development iteration	2017-04-27 11:00:56 -07:00
Prasanna Rajaperumal	de1bdad756	[maven-release-plugin] prepare release hoodie-0.3.6	2017-04-27 11:00:45 -07:00
Prasanna Rajaperumal	aee136777b	Fixes needed to run merge-on-read testing on production scale data	2017-04-02 22:25:47 -07:00
Prasanna Rajaperumal	57ab7a2405	[maven-release-plugin] prepare for next development iteration	2017-03-31 14:58:55 -07:00
Prasanna Rajaperumal	803c635098	[maven-release-plugin] prepare release hoodie-0.3.5	2017-03-31 14:58:46 -07:00
Prasanna Rajaperumal	f4bb44c1b1	Update snapshot version to 0.3.5-SNAPSHOT	2017-03-31 14:54:54 -07:00
ovj	21898907c1	tool for importing hive tables (in parquet format) into hoodie dataset (#89 ) * tool for importing hive tables (in parquet format) into hoodie dataset * review fixes * review fixes * review fixes	2017-03-21 14:42:13 -07:00
prazanna	6f36e1eaaf	Implement Savepoints and required metadata timeline (#86 ) - Introduce avro to save clean metadata with details about the last commit that was retained - Save rollback metadata in the meta timeline - Create savepoint metadata and add API to createSavepoint, deleteSavepoint and rollbackToSavepoint - Savepointed commit should not be rolledback or cleaned or archived - introduce cli commands to show, create and rollback to savepoints - Write unit tests to test savepoints and rollbackToSavepoints	2017-03-13 15:12:03 -07:00
prazanna	404726031d	Adding Siddhartha Gunda as a contributor for his contribution on the delete api	2017-03-04 01:36:21 -08:00
Prasanna Rajaperumal	48fbb0f425	Implement reliable log file management for Merge on read, which is fault tolerant and allows random block level access on avro file	2017-02-21 16:23:53 -08:00
Prasanna Rajaperumal	8ee777a9bb	Refactor hoodie-common and create right abstractions for Hoodie Storage V2.0 The following is the gist of changes done - All low-level operation of creating a commit code was in HoodieClient which made it hard to share code if there was a compaction commit. - HoodieTableMetadata contained a mix of metadata and filtering files. (Also few operations required FileSystem to be passed in because those were called from TaskExecutors and others had FileSystem as a global variable). Since merge-on-read requires a lot of that code, but will have to change slightly on how it operates on the metadata and how it filters the files. The two set of operation are split into HoodieTableMetaClient and TableFileSystemView. - Everything (active commits, archived commits, cleaner log, save point log and in future delta and compaction commits) in HoodieTableMetaClient is a HoodieTimeline. Timeline is a series of instants, which has an in-built concept of inflight and completed commit markers. - A timeline can be queries for ranges, contains and also use to create new datapoint (create a new commit etc). Commit (and all the above metadata) creation/deletion is streamlined in a timeline - Multiple timelines can be merged into a single timeline, giving us an audit timeline to whatever happened in a hoodie dataset. This also helps with #55. - Move to java 8 and introduce java 8 succinct syntax in refactored code	2017-02-21 16:23:53 -08:00
Prasanna Rajaperumal	283269e57f	[maven-release-plugin] prepare for next development iteration	2017-02-20 16:52:25 -08:00

... 2 3 4 5 6

284 Commits