* Bootstrapping initial support for Metadata Table in Spark Datasource
- Consolidated Avro/Row conversion utilities around Spark's AvroDeserializer; removed duplication
- Bootstrapped HoodieBaseRelation
- Updated HoodieMergeOnReadRDD to handle the Metadata Table
- Modified MOR relations to read different base file formats (Parquet, HFile); see the usage sketch below
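The most direct way to exercise this is to point the datasource at the metadata table's path. A minimal usage sketch, assuming a local SparkSession and an existing Hudi table at an illustrative `basePath`; the `.hoodie/metadata` location is the standard home of the internal metadata table, which is itself an HFile-based MOR table:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("metadata-table-read")
  .master("local[*]")
  .getOrCreate()

val basePath = "/tmp/hudi/trips" // illustrative table location

// Regular read of the data table through the Hudi datasource.
val dataDf = spark.read.format("hudi").load(basePath)

// With HFile base file support in the MOR relations, the internal metadata
// table can be loaded through the same datasource path.
val metadataDf = spark.read.format("hudi").load(basePath + "/.hoodie/metadata")
metadataDf.printSchema()
```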
Fix dependency conflict
Fix repairs command
Implement putIfAbsent for DDB lock provider (sketch below)
Add upgrade step and validate while fetching configs
Validate checksum only for the latest table version while fetching configs
Move generateChecksum to BinaryUtil
Rebase and resolve conflict
Fix table version check
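For context on the putIfAbsent commit above: on DynamoDB this is conventionally a conditional write. A hedged sketch of those semantics against the AWS SDK v1; the helper name and item layout are illustrative, not the lock provider's actual API:

```scala
import com.amazonaws.services.dynamodbv2.AmazonDynamoDB
import com.amazonaws.services.dynamodbv2.model.{AttributeValue, ConditionalCheckFailedException, PutItemRequest}
import scala.collection.JavaConverters._

// Illustrative helper: write the lock item only if no item with this
// partition key exists yet.
def putIfAbsent(client: AmazonDynamoDB, table: String, lockKey: String, owner: String): Boolean = {
  val item = Map(
    "key"   -> new AttributeValue().withS(lockKey),
    "owner" -> new AttributeValue().withS(owner)
  ).asJava
  val request = new PutItemRequest()
    .withTableName(table)
    .withItem(item)
    // The condition is what makes this a putIfAbsent: DynamoDB rejects the
    // write when an item with the same key is already present.
    .withConditionExpression("attribute_not_exists(#k)")
    .withExpressionAttributeNames(Map("#k" -> "key").asJava)
  try {
    client.putItem(request)
    true // lock acquired
  } catch {
    case _: ConditionalCheckFailedException => false // lock already held
  }
}
```

Likewise, a plausible shape for the checksum helper that moved to BinaryUtil, assuming a CRC32 over the serialized table config (the signature is an assumption):

```scala
import java.util.zip.CRC32

// Assumed shape of the relocated helper: CRC32 over the serialized config bytes.
def generateChecksum(data: Array[Byte]): Long = {
  val crc = new CRC32()
  crc.update(data)
  crc.getValue
}
```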
* Introduce hudi-spark3-common and hudi-spark2-common modules to hold classes reused across different Spark versions; also introduce hudi-spark3.1.x to support Spark 3.1.x.
* Introduce the hudi format under the hudi-spark2, hudi-spark3, and hudi-spark3.1.x modules, and change the hudi format in the original hudi-spark module to the hudi_v1 format.
* Manually tested SQL on Spark 3.1.2 and Spark 3.2.0.
* Added a README.md file under hudi-spark-datasource module.
- Fixing packaging, naming of classes
- Use of log4j over slf4j for uniformity
- More follow-on fixes
- Added a version to control/coordinator events.
- Eliminated the config that had been added to the write config
- Fixed fetching of checkpoints based on table type
- Clean up of naming, code placement
Co-authored-by: Rajesh Mahindra <rmahindra@Rajeshs-MacBook-Pro.local>
Co-authored-by: Vinoth Chandar <vinoth@apache.org>
- Added two sources for a two-stage pipeline (see the property sketch below):
  a. S3EventsSource fetches events from SQS and ingests them into a meta hoodie table.
  b. S3EventsHoodieIncrSource reads S3 events from that meta hoodie table, fetches the actual objects from S3, and ingests them into the sink hoodie table.
- Added selectors to assist the S3EventsSource.
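A hedged wiring sketch for the two stages; the property keys below are assumptions for illustration, not verified config names:

```scala
import org.apache.hudi.common.config.TypedProperties

// Stage 1: S3EventsSource polls SQS for S3 event notifications and writes
// them into a small "meta" hoodie table recording which objects changed.
val metaProps = new TypedProperties()
metaProps.setProperty("hoodie.deltastreamer.s3.source.queue.url", // assumed key name
  "https://sqs.us-east-1.amazonaws.com/123456789012/s3-events")

// Stage 2: S3EventsHoodieIncrSource incrementally reads that meta table,
// fetches the referenced objects from S3, and upserts them into the sink table.
val sinkProps = new TypedProperties()
sinkProps.setProperty("hoodie.deltastreamer.source.hoodieincr.path", // assumed key name
  "s3://bucket/meta_table")
```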
Co-authored-by: Satish M <84978833+satishmittal1111@users.noreply.github.com>
Co-authored-by: Vinoth Chandar <vinoth@apache.org>
* fix azure pipeline configs
* add pentaho.org to maven repositories
* Make sure file paths include a scheme in TestParquetUtils
* add azure build status to README
Main functions:
Support CREATE TABLE for hoodie.
Support CTAS.
Support INSERT for hoodie, including dynamic-partition and static-partition inserts.
Support MERGE INTO for hoodie.
Support DELETE.
Support UPDATE.
All of the above support spark2 & spark3, based on DataSourceV1 (see the SQL sketch below).
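A minimal SQL sketch of the surface above, assuming the Hudi Spark bundle on the classpath and the HoodieSparkSessionExtension wired in; table and column names are illustrative:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("hudi-sql")
  .master("local[*]")
  .config("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
  .config("spark.sql.extensions", "org.apache.spark.sql.hudi.HoodieSparkSessionExtension")
  .getOrCreate()

// CREATE TABLE (primaryKey / preCombineField are typical hoodie table options)
spark.sql(
  """CREATE TABLE h0 (id INT, name STRING, price DOUBLE, ts LONG)
    |USING hudi
    |TBLPROPERTIES (primaryKey = 'id', preCombineField = 'ts')
    |""".stripMargin)

// INSERT (static and dynamic partition inserts follow the usual Spark syntax)
spark.sql("INSERT INTO h0 VALUES (1, 'a1', 10.0, 1000)")

// MERGE INTO: the update & insert expressions get pushed down to the payload
spark.sql(
  """MERGE INTO h0
    |USING (SELECT 1 AS id, 'a1' AS name, 12.0 AS price, 1001 AS ts) s
    |ON h0.id = s.id
    |WHEN MATCHED THEN UPDATE SET *
    |WHEN NOT MATCHED THEN INSERT *
    |""".stripMargin)

// UPDATE and DELETE
spark.sql("UPDATE h0 SET price = 20.0 WHERE id = 1")
spark.sql("DELETE FROM h0 WHERE id = 1")
```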
Main changes:
Add a SQL parser for spark2.
Add HoodieAnalysis for SQL resolution and logical plan rewriting.
Add command implementations for CREATE TABLE, INSERT, MERGE INTO & CTAS.
In order to push the update & insert logic down to the HoodieRecordPayload for MERGE INTO, I made some changes to the HoodieWriteHandle and other related classes:
1. Add an inputSchema for parsing the incoming record. This is needed because the inputSchema for MERGE INTO differs from the writeSchema when the update & insert expressions contain transforms.
2. Add WRITE_SCHEMA to HoodieWriteConfig to pass the write schema for MERGE INTO.
3. Pass properties to HoodieRecordPayload#getInsertValue to carry the insert expressions and table schema (see the payload sketch below).
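A hypothetical payload sketch showing the Properties-aware hook from point 3; the class is illustrative (the real MERGE INTO payload evaluates the expressions carried in props), and Option here is Hudi's own Option type:

```scala
import java.util.Properties
import org.apache.avro.Schema
import org.apache.avro.generic.IndexedRecord
import org.apache.hudi.common.model.HoodieRecordPayload
import org.apache.hudi.common.util.Option // Hudi's Option, shadows scala.Option here

// Illustrative payload: MERGE INTO hands its insert/update expressions and
// table schema down to the payload via `props`.
class PassthroughPayload(value: Option[IndexedRecord])
    extends HoodieRecordPayload[PassthroughPayload] {

  override def preCombine(oldValue: PassthroughPayload): PassthroughPayload = this

  override def combineAndGetUpdateValue(current: IndexedRecord, schema: Schema): Option[IndexedRecord] =
    getInsertValue(schema)

  override def getInsertValue(schema: Schema): Option[IndexedRecord] = value

  // The overload added for MERGE INTO: `props` would carry the serialized
  // insert expressions and the table schema; this sketch just ignores them.
  override def getInsertValue(schema: Schema, props: Properties): Option[IndexedRecord] =
    getInsertValue(schema)
}
```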
Verify this pull request
Add TestCreateTable to test creating hoodie tables and CTAS.
Add TestInsertTable to test inserting into hoodie tables.
Add TestMergeIntoTable to test merging into hoodie tables.
Add TestUpdateTable to test updating hoodie tables.
Add TestDeleteTable to test deleting from hoodie tables.
Add TestSqlStatement to test the currently supported DDL/DML statements.
* [HUDI-845] Added locking capability to allow multiple writers (config sketch below)
1. Added LockProvider API for pluggable lock methodologies
2. Added Resolution Strategy API to allow for pluggable conflict resolution
3. Added TableService client API to schedule table services
4. Added Transaction Manager for wrapping actions within transactions
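A hedged sketch of enabling this from the datasource write path; the config keys follow the released multi-writer docs and should be treated as assumptions relative to this exact change, and the ZooKeeper provider and its endpoints are illustrative:

```scala
import org.apache.spark.sql.{SaveMode, SparkSession}

val spark = SparkSession.builder().appName("multi-writer").master("local[*]").getOrCreate()
val df = spark.range(10).selectExpr("id", "id AS ts") // toy frame to write

df.write.format("hudi")
  .option("hoodie.table.name", "h0")
  .option("hoodie.datasource.write.recordkey.field", "id")
  .option("hoodie.datasource.write.precombine.field", "ts")
  // Optimistic concurrency control: concurrent writers conflict-check commits
  // inside transactions instead of clobbering each other.
  .option("hoodie.write.concurrency.mode", "optimistic_concurrency_control")
  .option("hoodie.cleaner.policy.failed.writes", "LAZY")
  // Pluggable LockProvider; the ZooKeeper-based one is one implementation.
  .option("hoodie.write.lock.provider", "org.apache.hudi.client.transaction.lock.ZookeeperBasedLockProvider")
  .option("hoodie.write.lock.zookeeper.url", "localhost")
  .option("hoodie.write.lock.zookeeper.port", "2181")
  .option("hoodie.write.lock.zookeeper.lock_key", "h0")
  .option("hoodie.write.lock.zookeeper.base_path", "/hudi/locks")
  .mode(SaveMode.Append)
  .save("/tmp/hudi/h0")
```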