lanyuanxiaoyao/hudi - hudi

Author	SHA1	Message	Date
Sagar Sumit	ed106f671e	[HUDI-2809] Introduce a checksum mechanism for validating hoodie.properties (#4712) - Fix dependency conflict - Fix repairs command - Implement putIfAbsent for DDB lock provider - Add upgrade step and validate while fetching configs - Validate checksum for latest table version only while fetching config - Move generateChecksum to BinaryUtil - Rebase and resolve conflict - Fix table version check	2022-02-18 10:17:06 +05:30
yuzhao.cyz	a1d0ff4209	Moving to 0.11.0-SNAPSHOT on master branch.	2021-11-27 17:22:10 +08:00
vinoyang	b1c4acf0ae	[HUDI-2614] Remove duplicated hadoop-hdfs with tests classifier exists in bundles (#3864)	2021-10-26 22:36:10 +08:00
Udit Mehrotra	3e301196bf	Moving to 0.10.0-SNAPSHOT on master branch.	2021-08-14 18:51:09 -07:00
Sivabalan Narayanan	d58a8348dc	[HUDI-2007] Fixing hudi_test_suite for spark nodes and adding spark bulk_insert node (#3074)	2021-07-21 00:11:01 -04:00
pengzhiwei	f760ec543e	[HUDI-1659] Basic Implement Of Spark Sql Support For Hoodie (#2645) Main functions: Support create table for hoodie. Support CTAS. Support Insert for hoodie, including dynamic partition and static partition insert. Support MergeInto for hoodie. Support DELETE. Support UPDATE. Both spark2 & spark3 are supported, based on DataSourceV1. Main changes: Add sql parser for spark2. Add HoodieAnalysis for sql resolve and logical plan rewrite. Add commands implementation for CREATE TABLE, INSERT, MERGE INTO & CTAS. In order to push down the update & insert logic to the HoodieRecordPayload for MergeInto, I made some changes to the HoodieWriteHandler and other related classes. 1. Add the inputSchema for parsing the incoming record. This is because the inputSchema for MergeInto is different from writeSchema as there are some transforms in the update & insert expression. 2. Add WRITE_SCHEMA to HoodieWriteConfig to pass the write schema for merge into. 3. Pass properties to HoodieRecordPayload#getInsertValue to pass the insert expression and table schema. Verify this pull request: Add TestCreateTable to test creating hoodie tables and CTAS. Add TestInsertTable to test inserting into hoodie tables. Add TestMergeIntoTable to test merging hoodie tables. Add TestUpdateTable to test updating hoodie tables. Add TestDeleteTable to test deleting from hoodie tables. Add TestSqlStatement to test the currently supported ddl/dml.	2021-06-07 23:24:32 -07:00
garyli1019	6e803e08b1	Moving to 0.9.0-SNAPSHOT on master branch.	2021-03-24 21:37:14 +08:00
n3nash	74241947c1	[HUDI-845] Added locking capability to allow multiple writers (#2374) 1. Added LockProvider API for pluggable lock methodologies 2. Added Resolution Strategy API to allow for pluggable conflict resolution 3. Added TableService client API to schedule table services 4. Added Transaction Manager for wrapping actions within transactions	2021-03-16 16:43:53 -07:00
Sivabalan Narayanan	d5f202821b	Adding fixes to test suite framework. Adding clustering node and validate async operations node. (#2400)	2021-02-12 09:29:21 -08:00
Vinoth Chandar	3719e7b388	Moving to 0.8.0-SNAPSHOT on master branch.	2021-01-20 11:31:22 -08:00
Sivabalan Narayanan	a43e191d6c	[MINOR] Bumping snapshot version to 0.7.0 (#2435)	2021-01-16 09:56:28 -05:00
wenningd	fce1453fa6	[HUDI-1040] Make Hudi support Spark 3 (#2208) * Fix flaky MOR unit test * Update Spark APIs to make it be compatible with both spark2 & spark3 * Refactor bulk insert v2 part to make Hudi be able to compile with Spark3 * Add spark3 profile to handle fasterxml & spark version * Create hudi-spark-common module & refactor hudi-spark related modules Co-authored-by: Wenning Ding <wenningd@amazon.com>	2020-12-09 15:52:23 -08:00
Mathieu	1f7add9291	[HUDI-1089] Refactor hudi-client to support multi-engine (#1827) - This change breaks `hudi-client` into `hudi-client-common` and `hudi-spark-client` modules - Simple usages of Spark using jsc.parallelize() has been redone using EngineContext#map, EngineContext#flatMap etc - Code changes in the PR, break classes into `BaseXYZ` parent classes with no spark dependencies living in `hudi-client-common` - Classes on `hudi-spark-client` are named `SparkXYZ` extending the parent classes with all the Spark dependencies - To simplify/cleanup, HoodieIndex#fetchRecordLocation has been removed and its usages in tests replaced with alternatives Co-authored-by: Vinoth Chandar <vinoth@apache.org>	2020-10-01 14:25:29 -07:00
Bhavani Sudha Saktheeswaran	4226d75144	Moving to 0.6.1-SNAPSHOT on master branch.	2020-08-14 12:54:15 -07:00
Nishith Agarwal	2fc2b01d86	[HUDI-394] Provide a basic implementation of test suite	2020-07-30 21:21:15 -07:00
Raymond Xu	31247e9b34	[HUDI-896] Report test coverage by modules & parallelize CI (#1753) - use codecov flags for each module to report coverage - parallelize CI jobs for shorter time - add a testcase for MetricsReporterFactory (to trigger codecov comment)	2020-06-27 23:16:12 -07:00
Joey	2600d2de8d	[MINOR] Fix apache-rat violations (#1639) * Also, enabling RAT for hudi-utilities and hudi-integ-test	2020-05-18 11:16:49 -07:00
Raymond Xu	d65efe659d	[HUDI-780] Migrate test cases to Junit 5 (#1504)	2020-04-15 12:35:01 -07:00
Suneel Marthi	99b7e9eb9e	[HUDI-629]: Replace Guava's Hashing with an equivalent in NumericUtils.java (#1350)	2020-03-13 20:28:05 -04:00
yanghua	0dc8e493aa	Moving to 0.6.0-SNAPSHOT on master branch.	2020-03-01 15:08:30 +08:00
lamber-ken	cdb028f1f3	[MINOR] Fix missing groupId / version property of dependency	2020-01-25 09:19:55 -08:00
leesf	6e59c1c777	Moving to 0.5.2-SNAPSHOT on master branch.	2020-01-20 10:51:33 -08:00
wenningd	292c1e2ff4	[HUDI-238] Make Hudi support Scala 2.12 (#1226) * [HUDI-238] Rename scala related artifactId & add maven profile to support Scala 2.12	2020-01-17 14:02:21 -08:00
Mehrotra	92c69f5703	Migrate integration tests to spark 2.4.4	2019-11-13 16:53:24 -08:00
leesf	b19bed442d	[HUDI-296] Explore use of spotless to auto fix formatting errors (#945) - Add spotless format fixing to project - One time reformatting for conformity - Build fails for formatting changes and mvn spotless:apply autofixes them	2019-10-10 05:19:40 -07:00
Balaji Varadarajan	9b66ea41fd	[HUDI-121] Remove leftover notice file and replace com.uber.hoodie with org.apache.hudi in log4j properties	2019-10-04 09:18:57 -07:00
Balaji Varadarajan	c1e7d0e5a6	[HUDI-121] Update Release notes and fix master version	2019-09-17 09:50:30 -07:00
Balaji Varadarajan	d2525c31b7	Moving to 0.6.0-SNAPSHOT on master branch.	2019-09-13 09:58:29 -07:00
Balaji Varadarajan	58623631d4	[HUDI-249] Update Release-notes. Add sign-artifacts to POM and release related scripts. Add missing license headers	2019-09-13 08:41:29 -07:00
vinoth chandar	7a973a6944	[HUDI-159] Redesigning bundles for lighter-weight integrations - Documented principles applied for redesign at packaging/README.md - No longer depends on incl commons-codec, commons-io, commons-pool, commons-dbcp, commons-lang, commons-logging, avro-mapred - Introduce new FileIOUtils & added checkstyle rule for illegal import of above - Parquet, Avro dependencies moved to provided scope to enable being picked up from Hive/Spark/Presto instead - Pickup jackson jars for Hive sync tool from HIVE_HOME & unbundling jackson everywhere - Remove hive-jdbc standalone jar from being bundled in Spark/Hive/Utilities bundles - 6.5x reduced number of classes across bundles	2019-09-11 11:08:27 -07:00
Vinoth Chandar	78e0721507	[HUDI-159] Precursor cleanup to reduce build warnings	2019-08-26 19:41:00 -07:00
vinoth chandar	6edf0b9def	[HUDI-68] Pom cleanup & demo automation (#846) - [HUDI-172] Cleanup Maven POM/Classpath - Fix ordering of dependencies in poms, to enable better resolution - Idea is to place more specific ones at the top - And place dependencies which use them below them - [HUDI-68] : Automate demo steps on docker setup - Move hive queries from hive cli to beeline - Standardize on taking query input from text command files - Deltastreamer ingest, also does hive sync in a single step - Spark Incremental Query materialized as a derived Hive table using datasource - Fix flakiness in HDFS spin up and output comparison - Code cleanup around streamlining and loc reduction - Also fixed pom to not shade some hive classes in spark, to enable hive sync	2019-08-22 20:18:50 -07:00
Balaji Varadarajan	a4f9d7575f	HUDI-123 Rename code packages/constants to org.apache.hudi (#830) - Rename com.uber.hoodie to org.apache.hudi - Flag to pass com.uber.hoodie Input formats for hoodie-sync - Works with HUDI demo. - Also tested for backwards compatibility with datasets built by com.uber.hoodie packages - Migration guide : https://cwiki.apache.org/confluence/display/HUDI/Migration+Guide+From+com.uber.hoodie+to+org.apache.hudi	2019-08-11 17:48:17 -07:00

33 Commits