lanyuanxiaoyao/hudi - hudi

Author	SHA1	Message	Date
lamber-ken	11fb2c2614	[HUDI-580] Fix incorrect license header in files	2020-02-25 08:54:26 -08:00
YanJia-Gary-Li	4e7fcde4a6	[HUDI-597] Enable incremental pulling from defined partitions (#1348)	2020-02-24 11:46:30 -08:00
Suneel Marthi	f9d2f66dc1	[HUDI-622]: Remove VisibleForTesting annotation and import from code (#1343) * HUDI:622: Remove VisibleForTesting annotation and import from code	2020-02-20 15:17:53 +08:00
lamber-ken	425e3e6c78	[HUDI-585] Optimize the steps of building with scala-2.12 (#1293)	2020-02-05 23:13:10 +08:00
Suneel Marthi	5b7bb142dc	[HUDI-583] Code Cleanup, remove redundant code, and other changes (#1237)	2020-02-02 18:03:44 +08:00
leesf	652224edc8	[HUDI-578] Trim recordKeyFields and partitionPathFields in ComplexKeyGenerator (#1281) * [HUDI-578] Trim recordKeyFields and partitionPathFields in ComplexKeyGenerator * add tests	2020-01-29 16:26:26 -08:00
leesf	6e59c1c777	Moving to 0.5.2-SNAPSHOT on master branch.	2020-01-20 10:51:33 -08:00
Y Ethan Guo	d0ee95ed16	[HUDI-552] Fix the schema mismatch in Row-to-Avro conversion (#1246)	2020-01-18 16:40:56 -08:00
wenningd	292c1e2ff4	[HUDI-238] Make Hudi support Scala 2.12 (#1226) * [HUDI-238] Rename scala related artifactId & add maven profile to support Scala 2.12	2020-01-17 14:02:21 -08:00
Prashant Wason	0a07752dc0	[HUDI-527] scalastyle-maven-plugin moved to pluginManagement as it is only used in hoodie-spark and hoodie-cli modules. This fixes compile warnings as well as unnecessary plugin invocation for most of the modules which do not have scala code.	2020-01-17 10:46:10 -08:00
vinoth chandar	c2c0f6b13d	[HUDI-509] Renaming code in sync with cWiki restructuring (#1212) - Storage Type replaced with Table Type (remaining instances) - View types replaced with query types; - ReadOptimized view referred as Snapshot Query - TableFileSystemView sub interfaces renamed to BaseFileOnly and Slice Views - HoodieDataFile renamed to HoodieBaseFile - Hive Sync tool will register RO tables for MOR with a `_ro` suffix - Datasource/Deltastreamer options renamed accordingly - Support fallback to old config values as well, so migration is painless - Config for controlling _ro suffix addition - Renaming DataFile to BaseFile across DTOs, HoodieFileSlice and AbstractTableFileSystemView	2020-01-16 23:58:47 -08:00
Scheller	1daba24065	Add GlobalDeleteKeyGenerator Adds new GlobalDeleteKeyGenerator for record_key deletes with global indices. Also refactors key generators into their own package.	2020-01-15 17:01:29 -08:00
Sivabalan Narayanan	2248fd9aea	Fixing checkstyle issues	2020-01-15 14:21:26 -08:00
Sivabalan Narayanan	2b2f23aa60	Fixing delete util method	2020-01-15 14:21:26 -08:00
Sivabalan Narayanan	87fdb769f0	Adding util methods to assist in adding deletion support to Quick Start	2020-01-15 14:21:26 -08:00
Mehrotra	2bb0c21a3d	Fix conversion of Spark struct type to Avro schema cr https://code.amazon.com/reviews/CR-17184364	2020-01-14 00:27:56 -08:00
Udit Mehrotra	ad50008a59	[HUDI-91][HUDI-12]Migrate to spark 2.4.4, migrate to spark-avro library instead of databricks-avro, add support for Decimal/Date types - Upgrade Spark to 2.4.4, Parquet to 1.10.1, Avro to 1.8.2 - Remove spark-avro from hudi-spark-bundle. Users need to provide --packages org.apache.spark:spark-avro:2.4.4 when running spark-shell or spark-submit - Replace com.databricks:spark-avro with org.apache.spark:spark-avro - Shade avro in hudi-hadoop-mr-bundle to make sure it does not conflict with hive's avro version.	2020-01-12 15:03:11 -08:00
lamber-ken	d9675c4ec0	[HUDI-522] Use the same version jcommander uniformly (#1214)	2020-01-12 10:48:52 -08:00
pratyakshsharma	3c90d252cc	[HUDI-114]: added option to overwrite payload implementation in hoodie.properties file	2020-01-09 22:34:40 -08:00
Y Ethan Guo	480fc7869d	[HUDI-319] Add a new maven profile to generate unified Javadoc for all Java and Scala classes (#1195) * Add javadoc build command in README, links to javadoc plugin and rename profile. * Make java version configurable in one place.	2020-01-08 10:38:09 -08:00
vinoth chandar	9706f659db	[HUDI-508] Standardizing on "Table" instead of "Dataset" across code (#1197) - Docs were talking about storage types before, cWiki moved to "Table" - Most of code already has HoodieTable, HoodieTableMetaClient - correct naming - Replacing renaming use of dataset across code/comments - Few usages in comments and use of Spark SQL DataSet remain unscathed	2020-01-07 12:52:32 -08:00
lamber-ken	75c3f630d4	[HUDI-405] Remove HIVE_ASSUME_DATE_PARTITION_OPT_KEY config from DataSource	2020-01-06 14:25:38 -08:00
Pratyaksh Sharma	8f935e779a	[HUDI-406]: added default partition path in TimestampBasedKeyGenerator	2020-01-06 09:38:06 -08:00
hongdd	2d5b79d96f	[HUDI-438] Merge duplicated code fragment in HoodieSparkSqlWriter (#1114)	2020-01-06 22:51:22 +08:00
Sivabalan Narayanan	7031445eb3	[HUDI-377] Adding Delete() support to DeltaStreamer (#1073) - Provides ability to perform hard deletes by writing delete marker records into the source data - if the record contains a special field _hoodie_delete_marker set to true, deletes are performed	2020-01-04 11:07:31 -08:00
Pratyaksh Sharma	dde21e7315	[HUDI-402]: code clean up in test cases	2019-12-31 11:10:49 -08:00
vinoth chandar	350b0ecb4d	[HUDI-311] : Support for AWS Database Migration Service in DeltaStreamer - Add a transformer class, that adds `Op` field if not found in input frame - Add a payload implementation, that issues deletes when Op=D - Remove Parquet as a top level source type, consolidate with RowSource - Made delta streamer work without a property file, simply using overridden cli options - Unit tests for transformer/payload classes	2019-12-23 20:56:55 -08:00
lamber-ken	313fab5fd1	[HUDI-444] Refactor the codes based on scala codestyle ReturnChecker rule (#1121)	2019-12-24 07:05:54 +08:00
YanJia-Gary-Li	36b3b6f5dd	[HUDI-415] Get commit time when Spark start (#1113)	2019-12-19 22:19:06 -08:00
lamber-ken	a405d3873b	[MINOR] replace scala map add operator (#1093) replace ++: with ++	2019-12-12 11:29:17 +08:00
lamber-ken	ba514cfea0	[MINOR] Remove redundant plus operator (#1097)	2019-12-12 05:42:05 +08:00
lamber-ken	d447e2d751	[checkstyle] Unify LOG form (#1092)	2019-12-10 19:23:38 +08:00
Wenning Ding	e555aa516d	[HUDI-353] Add hive style partitioning path	2019-12-09 12:29:53 -08:00
lamber-ken	2745b7552f	[HUDI-379] Refactor the codes based on new JavadocStyle code style rule (#1079)	2019-12-06 12:59:28 +08:00
hongdd	b65a897856	[HUDI-374] Unable to generateUpdates in QuickstartUtils (#1059)	2019-11-30 11:11:00 -08:00
lamber-ken	024230fbd2	[HUDI-372] Support the shortName for Hudi DataSource (#1054) - Ability to do `spark.write.format("hudi")...`	2019-11-30 08:02:33 -08:00
谢磊	f9139c0f61	[HUDI-366] Refactor some module codes based on new ImportOrder code style rule (#1055) [HUDI-366] Refactor hudi-hadoop-mr / hudi-timeline-service / hudi-spark / hudi-integ-test / hudi-utilities based on new ImportOrder code style rule	2019-11-27 21:32:43 +08:00
bschell	60fed21dc7	[HUDI-327] Add null/empty checks to key generators (#1040) * Adds null and empty checks to all key generators. * Also improves error messaging for key generator issues.	2019-11-26 02:37:16 -08:00
filippo balicchia	845a0509b3	[MINOR] Some minor optimizations in HoodieJavaStreamingApp (#1046)	2019-11-25 18:49:13 +08:00
Sivabalan Narayanan	c3355109b1	[HUDI-328] Adding delete api to HoodieWriteClient (#1004) [HUDI-328] Adding delete api to HoodieWriteClient and Spark DataSource	2019-11-22 15:05:25 -08:00
hongdd	7bc08cbfdc	[HUDI-345] Fix used deprecated function (#1024) - Schema.parse() with new Schema.Parser().parse - FSDataOutputStream constructor	2019-11-22 03:32:09 -08:00
谢磊	804e348d0e	[HUDI-346] Set allowMultipleEmptyLines to false for EmptyLineSeparator rule (#1025)	2019-11-19 18:44:42 +08:00
vinoth chandar	e4c91ed13f	[HUDI-290] Normalize test class name of all test classes (#951)	2019-10-22 20:19:11 -07:00
Balaji Varadarajan	77f4e73615	[HUDI-121] Fix licensing issues found during RC voting by general incubator group	2019-10-16 02:09:02 -07:00
leesf	b19bed442d	[HUDI-296] Explore use of spotless to auto fix formatting errors (#945) - Add spotless format fixing to project - One time reformatting for conformity - Build fails for formatting changes and mvn spotless:apply autofixes them	2019-10-10 05:19:40 -07:00
Balaji Varadarajan	9b66ea41fd	[HUDI-121] Remove leftover notice file and replace com.uber.hoodie with org.apache.hudi in log4j properties	2019-10-04 09:18:57 -07:00
Balaji Varadarajan	6da2f9ac7c	[HUDI-287] Address comments during review of release candidate 1. Remove LICENSE and NOTICE files in hoodie child modules. 2. Remove developers and contributor section from pom 3. Also ensure any failures in validation script is reported appropriately 4. Make hoodie parent pom consistent with that of its parent apache-21 (https://github.com/apache/maven-apache-parent/blob/apache-21/pom.xml)	2019-10-03 09:00:07 -07:00
Balaji Varadarajan	6e8a28bcae	HUDI-121 : Address comments during RC2 voting 1. Remove dnl utils jar from git 2. Add LICENSE Headers in missing files 3. Fix NOTICE and LICENSE in all HUDI packages and in top-level 4. Fix License wording in certain HUDI source files 5. Include non java/scala code in RAT licensing check 6. Use whitelist to include dependencies as part of timeline-server bundling	2019-09-30 15:42:15 -07:00
Bhavani Sudha Saktheeswaran	50a073ff57	[HUDI-271] Create QuickstartUtils for simplifying quickstart guide - This will be used in Quickstart guide (Doc changes to follow in a separate PR). The intention is to simplify quickstart to showcase hudi APIs by writing and reading using spark datasources. - This is located in hudi-spark module intentionally to bring all the necessary classes in hudi-spark-bundle finally.	2019-09-30 15:22:18 -07:00
Vinoth Chandar	e217db56ab	[HUDI-254]: Bundle and shade databricks/avro with spark bundle - spark 2.4 onwards, spark has built in support. shading to avoid conflicts - spark 2.3 still needs this bundled, so that dropping bundle into jars folder would work	2019-09-17 12:38:51 -07:00

60 Commits