lanyuanxiaoyao/hudi - hudi

Author	SHA1	Message	Date
Thinking Chen	62ecb2da62	when column type is decimal, should add precision and scale (#753)	2019-07-08 16:13:22 -07:00
Nishith Agarwal	129e433641	- Upgrading to Hive 2.x - Eliminating in-memory deltaRecordsMap - Use writerSchema to generate generic record needed by custom payloads - changes to make tests work with hive 2.x	2019-06-13 12:46:14 -07:00
Balaji Varadarajan	479908fd20	HUDI-125 : Change License for all source files and update RAT configurations	2019-06-09 11:41:55 -07:00
Balaji Varadarajan	30b0f2636f	Changes related to Licensing work 1. Go through dependencies list one round to ensure compliance. Generated current NOTICE list in all submodules (other Apache projects like Flink do this). To be on the conservative side regarding licensing, NOTICE.txt lists all dependencies including transitive. Pending Compliance questions reported in https://issues.apache.org/jira/browse/LEGAL-461 2. Automate generating NOTICE.txt files to allow future package compliance issues to be identified early as part of code-review process. 3. Added NOTICE.txt and LICENSE.txt to all HUDI jars	2019-06-07 17:58:57 -07:00
vinothchandar	66c0b81b49	[maven-release-plugin] prepare for next development iteration	2019-05-28 19:17:26 -07:00
vinothchandar	227785c022	[maven-release-plugin] prepare release hoodie-0.4.7	2019-05-28 19:17:15 -07:00
Balaji Varadarajan	145034c5fa	Spark Stage retry handling	2019-05-21 14:49:51 -07:00
vinothchandar	446f99aa0f	[maven-release-plugin] prepare for next development iteration	2019-05-14 07:29:22 -07:00
vinothchandar	cc38abecc8	[maven-release-plugin] prepare release hoodie-0.4.6	2019-05-14 07:29:11 -07:00
Balaji Varadarajan	194d904c99	run_hive_sync tool must be able to handle case where there are multiple standalone jdbc jars in hive installation dir	2019-03-21 09:58:20 -07:00
Balaji Varadarajan	adc8cac743	Fix hive sync (libfb version mismatch) and deltastreamer issue (missing cmdline argument) in demo	2019-03-13 16:14:32 -07:00
Vinoth Chandar	363df2c12e	Upgrade various jar, gem versions for maintenance	2019-03-01 10:14:00 -08:00
vinothchandar	687395e40f	[maven-release-plugin] prepare for next development iteration	2019-02-27 07:16:27 -08:00
vinothchandar	bbf40ef987	[maven-release-plugin] prepare release hoodie-0.4.5	2019-02-27 07:16:15 -08:00
Balaji Varadarajan	3a0044216c	New Features in DeltaStreamer : (1) Apply transformation when using delta-streamer to ingest data. (2) Add Hudi Incremental Source for Delta Streamer (3) Allow delta-streamer config-property to be passed as command-line (4) Add Hive Integration to Delta-Streamer and address Review comments (5) Ensure MultiPartKeysValueExtractor handles hive style partition description (6) Reuse same spark session on both source and transformer (7) Support extracting partition fields from _hoodie_partition_path for HoodieIncrSource (8) Reuse Binary Avro coders (9) Add push down filter for Incremental source (10) Add Hoodie DeltaStreamer metrics to track total time taken	2019-02-11 18:22:05 -08:00
Balaji Varadarajan	30c5f8b7bd	Ensure Hoodie works for non-partitioned Hive table	2018-12-12 13:35:16 -08:00
xubo245	466ff73ffb	fix some spelling errors in Hudi	2018-12-12 13:06:25 -08:00
Balaji Varadarajan	f3418e4718	Docker Container Build and Run setup with foundations for adding docker integration tests. Docker images built with Hadoop 2.8.4 Hive 2.3.3 and Spark 2.3.1 and published to docker-hub Look at quickstart document for how to setup docker and run demo	2018-10-02 09:28:21 +05:30
vinothchandar	7ba842c0fe	[maven-release-plugin] prepare for next development iteration	2018-09-28 11:27:00 +05:30
vinothchandar	5847b61f44	[maven-release-plugin] prepare release hoodie-0.4.4	2018-09-28 11:26:15 +05:30
Balaji Varadarajan	4c74dd4cad	Travis CI tests need to be run in quieter mode (WARN log level) to avoid max log-size errors	2018-09-26 21:10:20 +05:30
Balaji Varadarajan	460e24e84b	Hive Sync handling must work for datasets with multi-partition keys	2018-09-20 16:53:26 +05:30
Balaji Varadarajan	5cb28e7b1f	Explicitly release resources in LogFileReader and TestHoodieClientBase	2018-09-20 13:24:57 +05:30
Vinoth Chandar	bd5af89f12	[maven-release-plugin] rollback the release of hoodie-0.4.4	2018-09-13 15:01:53 +05:30
Vinoth Chandar	d1cc864a43	[maven-release-plugin] prepare for next development iteration	2018-09-12 23:59:47 +05:30
Vinoth Chandar	b748bc836d	[maven-release-plugin] prepare release hoodie-0.4.4	2018-09-12 23:59:34 +05:30
Vinoth Chandar	eca49a255e	Rebasing and fixing conflicts against master	2018-09-11 11:03:30 +05:30
Vinoth Chandar	a5359662be	Moving dependencies off cdh to apache + Hive2 support - Tests redone in the process - Main changes are to RealtimeRecordReader and how it treats maps/arrays - Make hive sync work with Hive 1/2 and CDH environments - Fixes to make corner cases for Hive queries - Spark Hive integration - Working version across Apache and CDH versions - Known Issue - https://github.com/uber/hudi/issues/439	2018-09-11 11:03:30 +05:30
Nishith Agarwal	459e523d9e	1. Small file size handling for inserts into log files. In summary, the total size of the log file is compared with the parquet max file size and if there is scope to add inserts then add them.	2018-09-06 08:52:08 +08:00
Vinoth Chandar	89cd6b0726	[maven-release-plugin] prepare for next development iteration	2018-08-22 21:30:05 -07:00
Vinoth Chandar	8d305c5a86	[maven-release-plugin] prepare release hoodie-0.4.3	2018-08-22 21:29:53 -07:00
Balaji Varadarajan	2e12c86d01	Ensure Compaction Operation compacts the data file as defined in the workload	2018-08-07 08:19:50 -07:00
Balaji Varadarajan	2f8ce93030	Async Compaction Main API changes	2018-08-07 08:19:50 -07:00
Vinoth Chandar	34827d50e1	[maven-release-plugin] prepare for next development iteration	2018-06-11 08:59:13 -07:00
Vinoth Chandar	43ef385730	[maven-release-plugin] prepare release hoodie-0.4.2	2018-06-11 08:59:02 -07:00
Balaji Varadarajan	788e4f2d2e	CodeStyle formatting to conform to basic Checkstyle rules. The code-style rules follow google style with some changes: 1. Increase line length from 100 to 120 2. Disable JavaDoc related checkstyles as this needs more manual work. Both source and test code are checked for code-style	2018-03-30 11:09:40 -07:00
Nishith Agarwal	9dff8c2326	Adding a tool to read/inspect a HoodieLogFile	2018-03-15 16:48:28 -07:00
Jian Xu	7f079632a6	Use hadoopConf in HoodieTableMetaClient and related tests	2018-03-12 11:47:55 -07:00
Vinoth Chandar	73534d467f	[maven-release-plugin] prepare for next development iteration	2018-03-07 21:04:10 -08:00
Vinoth Chandar	f2e5c6f9f8	[maven-release-plugin] prepare release hoodie-0.4.1	2018-03-07 21:04:00 -08:00
Nishith Agarwal	5405a6287b	Introducing HoodieLogFormat V2 with versioning support - HoodieLogFormat V2 has support for LogFormat evolution through versioning - LogVersion is associated with a LogBlock not a LogFile - Based on a version for a LogBlock, appropriate code path is executed - Implemented LazyReading of Hoodie Log Blocks with Memory / IO tradeoff - Implemented Reverse pointer to be able to traverse the log in reverse - Introduce new MAGIC for backwards compatibility with logs without versions	2018-03-06 21:14:11 -08:00
Vinoth Chandar	0cd186c899	Multi FS Support - Reviving PR 191, to make FileSystem creation off actual path - Streamline all filesystem access to HoodieTableMetaClient - Hadoop Conf from Spark Context serialized & passed to executor code too - Pick up env vars prefixed with HOODIE_ENV_ into Configuration object - Cleanup usage of FSUtils.getFS, piggybacking off HoodieTableMetaClient.getFS - Adding s3a to supported schemes & support escaping "." in env vars - Tests use HoodieTestUtils.getDefaultHadoopConf	2018-01-17 23:34:21 -08:00
Nishith Agarwal	44839b88c6	Removing compaction action type and associated compaction timeline operations, replace with commit action type	2018-01-09 09:56:15 -08:00
Nishith Agarwal	051f600b7f	Enable hive sync even if there is no compaction commit	2017-11-30 18:22:58 -08:00
Vinoth Chandar	e45679f5e2	Reformatting code per Google Code Style all over	2017-11-12 23:19:02 -08:00
Nishith Agarwal	abe964bebd	Implementing custom payload/merge hooks abstractions for application specific merge logic	2017-11-07 18:55:55 -08:00
Nishith Agarwal	c7d63a7622	1) Separated rollback as a table operation 2) Implement rollback for MOR	2017-10-12 07:36:46 -07:00
Vinoth Chandar	e1fe3ab937	[maven-release-plugin] prepare for next development iteration	2017-10-02 22:42:54 -07:00
Vinoth Chandar	50139fe904	[maven-release-plugin] prepare release hoodie-0.4.0	2017-10-02 22:42:32 -07:00
Nishith Agarwal	19c22b231e	1. Use HoodieLogFormat to archive commits and other actions 2. Introduced avro schema for commits and compactions and an avro wrapper schema	2017-07-26 14:27:44 -07:00

98 Commits