lanyuanxiaoyao/hudi - hudi - Gitea: Git with a cup of tea

Author	SHA1	Message	Date
Shawy Geng	6e24434682	[HUDI-2113] Fix integration testing failure caused by sql results out of order (#3204)	2021-07-06 00:35:12 -07:00
wenningd	d412fb2fe6	[HUDI-89] Add configOption & refactor all configs based on that (#2833) Co-authored-by: Wenning Ding <wenningd@amazon.com>	2021-06-30 14:26:30 -07:00
Sivabalan Narayanan	5564c7ec01	[HUDI-2006] Adding more yaml templates to test suite (#3073)	2021-06-29 23:05:46 -04:00
Sivabalan Narayanan	ac72470e10	[HUDI-1851] Adding test suite long running automate scripts for docker (#2880)	2021-05-11 01:26:01 -07:00
Gary Li	050626ad6c	[MINOR] Add Missing Apache License to test files (#2736)	2021-03-29 07:17:23 -07:00
garyli1019	6e803e08b1	Moving to 0.9.0-SNAPSHOT on master branch.	2021-03-24 21:37:14 +08:00
Sivabalan Narayanan	55a489c769	[1568] Fixing spark3 bundles (#2625)	2021-03-19 14:21:36 -04:00
Sivabalan Narayanan	d5f202821b	Adding fixes to test suite framework. Adding clustering node and validate async operations node. (#2400)	2021-02-12 09:29:21 -08:00
Vinoth Chandar	3719e7b388	Moving to 0.8.0-SNAPSHOT on master branch.	2021-01-20 11:31:22 -08:00
Sivabalan Narayanan	b9c2856d16	[HUDI-1535] Fix 0.7.0 snapshot (#2456) * Revert "[MINOR] Bumping snapshot version to 0.7.0 (#2435)" This reverts commit `a43e191d6c`. * Fixing 0.7.0 snapshot bump	2021-01-19 12:20:43 -08:00
Sivabalan Narayanan	a43e191d6c	[MINOR] Bumping snapshot version to 0.7.0 (#2435)	2021-01-16 09:56:28 -05:00
pengzhiwei	b83d1d3e61	[HUDI-1484] Escape the partition value in HiveSyncTool (#2363)	2020-12-28 23:02:36 -05:00
Sivabalan Narayanan	8cf6a7223f	[HUDI-1331] Adding support for validating entire dataset and long running tests in test suite framework (#2168) * trigger rebuild * [HUDI-1156] Remove unused dependencies from HoodieDeltaStreamerWrapper Class (#1927) * Adding support for validating records and long running tests in test suite framework * Adding partial validate node * Fixing spark session initiation in Validate nodes * Fixing validation * Adding hive table validation to ValidateDatasetNode * Rebasing with latest commits from master * Addressing feedback * Addressing comments Co-authored-by: lamber-ken <lamberken@163.com> Co-authored-by: linshan-ma <mabin194046@163.com>	2020-12-26 09:29:24 -08:00
Sivabalan Narayanan	a205dd10fa	[HUDI-1338] Adding Delete support to test suite framework (#2172) - Adding Delete support to test suite. Added DeleteNode Added support to generate delete records	2020-11-01 00:15:41 -04:00
n3nash	e109a61803	1. Fix merge on read DAG to make docker demo pass (#2092) 2. Fix repeat_count, rollback node	2020-10-28 22:34:26 -04:00
Prashant Wason	788d236c44	[HUDI-1303] Some improvements for the HUDI Test Suite. (#2128) 1. Use the DAG Node's label from the yaml as its name instead of UUID names which are not descriptive when debugging issues from logs. 2. Fix CleanNode constructor which is not correctly implemented 3. When generating upserts, allow more granular control over the number of inserts and upserts - zero or more inserts and upserts can be specified instead of always requiring both inserts and upserts. 4. Fixed generation of records of specific size - The current code was using a class variable "shouldAddMore" which was reset to false after the first record generation causing subsequent records to be of minimum size. - In this change, we pre-calculate the extra size of the complex fields. When generating records, for complex fields we read the field size from this map. 5. Refresh the timeline of the DeltaSync service before calling readFromSource. This ensures that only the newest generated data is read and data generated in the older Dag Nodes is ignored (as their AVRO files will have an older timestamp). 6. Making --workload-generator-classname an optional parameter as most probably the default will be used	2020-10-07 08:33:51 -04:00
shenh062326	581d54097c	[HUDI-1143] Change timestamp field in HoodieTestDataGenerator from double to long	2020-09-15 20:58:29 -07:00
Abhishek Modi	53d1e55110	Test Suite should work with Docker + Unit Tests	2020-09-08 22:41:14 -07:00
Dongwook	8d19ebfd0f	[HUDI-993] Let delete API use "hoodie.delete.shuffle.parallelism" (#1703) The Delete API does not use "hoodie.delete.shuffle.parallelism", whereas upsert uses "hoodie.upsert.shuffle.parallelism". In certain cases this creates a performance difference between deleting through the upsert API with "EmptyHoodieRecordPayload" and deleting through the delete API. This patch makes the following fixes in this regard. - Let deduplicateKeys method use "hoodie.delete.shuffle.parallelism" - Repartition inputRDD as "hoodie.delete.shuffle.parallelism" in case "hoodie.combine.before.delete=false"	2020-09-01 12:55:31 -04:00
Bhavani Sudha Saktheeswaran	4226d75144	Moving to 0.6.1-SNAPSHOT on master branch.	2020-08-14 12:54:15 -07:00
Sivabalan Narayanan	9c24151929	[HUDI-1175] Commenting out testsuite tests from Integration tests until we investigate the CI flakiness (#1945)	2020-08-10 21:00:57 -07:00
lw0090	51ea27d665	[HUDI-875] Abstract hudi-sync-common, and support hudi-hive-sync, hudi-dla-sync (#1810) - Generalize the hive-sync module for syncing to multiple metastores - Added new options for datasource - Added new command line for delta streamer Co-authored-by: Vinoth Chandar <vinoth@apache.org>	2020-08-05 21:34:55 -07:00
vinoth chandar	539621bd33	[HUDI-242] Support for RFC-12/Bootstrapping of external datasets to hudi (#1876) - [HUDI-418] Bootstrap Index Implementation using HFile with unit-test - [HUDI-421] FileSystem View Changes to support Bootstrap with unit-tests - [HUDI-424] Implement Query Side Integration for querying tables containing bootstrap file slices - [HUDI-423] Implement upsert functionality for handling updates to these bootstrap file slices - [HUDI-421] Bootstrap Write Client with tests - [HUDI-425] Added HoodieDeltaStreamer support - [HUDI-899] Add a knob to change partition-path style while performing metadata bootstrap - [HUDI-900] Metadata Bootstrap Key Generator needs to handle complex keys correctly - [HUDI-424] Simplify Record reader implementation - [HUDI-423] Implement upsert functionality for handling updates to these bootstrap file slices - [HUDI-420] Hoodie Demo working with hive and sparkSQL. Also, Hoodie CLI working with bootstrap tables Co-authored-by: Mehrotra <uditme@amazon.com> Co-authored-by: Vinoth Chandar <vinoth@apache.org> Co-authored-by: Balaji Varadarajan <varadarb@uber.com>	2020-08-03 20:19:21 -07:00
n3nash	727f1df62c	[MINOR] Suppressing spark logs for hudi-integ and hudi-utilities (#1894)	2020-07-31 19:01:25 -07:00
Nishith Agarwal	2fc2b01d86	[HUDI-394] Provide a basic implementation of test suite	2020-07-30 21:21:15 -07:00
hongdd	fa419213f6	[HUDI-703] Add test for HoodieSyncCommand (#1774)	2020-07-28 08:31:43 +08:00
sathyaprakashg	df2e0c760e	HUDI-942 Increase default value number of delta commits for inline compaction (#1664) Co-authored-by: Sathyaprakash Govindasamy <sathyaprakashg@zillowgroup.com>	2020-06-10 16:16:44 -07:00
Vinoth Govindarajan	8cb86b4d36	Added python3 to the spark_base docker image to support pyspark (#1632)	2020-05-31 22:53:50 -07:00
Satish Kotha	1f6be820f3	[HUDI-758] Modify Integration test to include incremental queries for MOR tables	2020-04-08 21:56:59 -07:00
lamber-ken	90227eeda7	[HUDI-673] Rename hudi-hive-bundle to hudi-hive-sync-bundle	2020-03-07 21:44:35 +08:00
lamber-ken	ccbf543607	[HUDI-654] Rename hudi-hive to hudi-hive-sync	2020-03-06 22:13:16 +08:00
yanghua	0dc8e493aa	Moving to 0.6.0-SNAPSHOT on master branch.	2020-03-01 15:08:30 +08:00
lamber-ken	323fffad0d	[HUDI-606] Improve execute build_local_docker_images.sh script	2020-02-26 19:38:19 +08:00
lamber-ken	11fb2c2614	[HUDI-580] Fix incorrect license header in files	2020-02-25 08:54:26 -08:00
lamber-ken	cdb028f1f3	[MINOR] Fix missing groupId / version property of dependency	2020-01-25 09:19:55 -08:00
leesf	fc8d4a71ad	[MINOR] fix license issue (#1273)	2020-01-23 02:03:49 +08:00
leesf	ed54eb20a5	[MINOR] Add missing licenses (#1271)	2020-01-22 08:06:45 -05:00
lamber-ken	a54535ed5a	[MINOR] Fix invalid maven repo address (#1265)	2020-01-21 04:41:59 -08:00
leesf	6e59c1c777	Moving to 0.5.2-SNAPSHOT on master branch.	2020-01-20 10:51:33 -08:00
wenningd	292c1e2ff4	[HUDI-238] Make Hudi support Scala 2.12 (#1226) * [HUDI-238] Rename scala related artifactId & add maven profile to support Scala 2.12	2020-01-17 14:02:21 -08:00
vinoth chandar	c2c0f6b13d	[HUDI-509] Renaming code in sync with cWiki restructuring (#1212) - Storage Type replaced with Table Type (remaining instances) - View types replaced with query types; - ReadOptimized view referred as Snapshot Query - TableFileSystemView sub interfaces renamed to BaseFileOnly and Slice Views - HoodieDataFile renamed to HoodieBaseFile - Hive Sync tool will register RO tables for MOR with a `_ro` suffix - Datasource/Deltastreamer options renamed accordingly - Support fallback to old config values as well, so migration is painless - Config for controlling _ro suffix addition - Renaming DataFile to BaseFile across DTOs, HoodieFileSlice and AbstractTableFileSystemView	2020-01-16 23:58:47 -08:00
yuehan124	c78092d2d3	[HUDI-501] Execute docker/setup_demo.sh in any directory	2020-01-06 10:26:06 -08:00
lamber-ken	d9fbe33339	[HOTFIX] Fix error configuration item of dockerfile-maven-plugin	2019-11-19 16:30:03 -08:00
Balaji Varadarajan	f7c2f8cedc	[HUDI-329] Presto Containers for integration test must allow newly built local jars to override	2019-11-13 17:35:34 -08:00
Mehrotra	92c69f5703	Migrate integration tests to spark 2.4.4	2019-11-13 16:53:24 -08:00
Sivabalan Narayanan	23b303e4b1	[HUDI-218] Adding Presto support to Integration Test (#1003)	2019-11-11 06:21:49 -08:00
Balaji Varadarajan	a6390aefc4	[HUDI-312] Make docker hdfs cluster ephemeral. This is needed to fix flakiness in integration tests. Also, Fix DeltaStreamer hanging issue due to uncaught exception	2019-11-01 11:49:59 -07:00
leesf	b19bed442d	[HUDI-296] Explore use of spotless to auto fix formatting errors (#945) - Add spotless format fixing to project - One time reformatting for conformity - Build fails for formatting changes and mvn spotless:apply autofixes them	2019-10-10 05:19:40 -07:00
Balaji Varadarajan	9b66ea41fd	[HUDI-121] Remove leftover notice file and replace com.uber.hoodie with org.apache.hudi in log4j properties	2019-10-04 09:18:57 -07:00
Balaji Varadarajan	6da2f9ac7c	[HUDI-287] Address comments during review of release candidate 1. Remove LICENSE and NOTICE files in hoodie child modules. 2. Remove developers and contributor section from pom 3. Also ensure any failures in validation script are reported appropriately 4. Make hoodie parent pom consistent with that of its parent apache-21 (https://github.com/apache/maven-apache-parent/blob/apache-21/pom.xml)	2019-10-03 09:00:07 -07:00

77 Commits