lanyuanxiaoyao/hudi - hudi

Author	SHA1	Message	Date
Satish Kotha	492ddcbb06	[HUDI-1191] Add incremental meta client API to query partitions modified in a time window	2020-08-25 12:40:10 -07:00
Trevor	6a4dc7384c	[HUDI-1218] Introduce BulkInsertSortMode as Independent class (#2021)	2020-08-25 19:04:13 +08:00
Prashant Wason	218d4a6836	[HUDI-1135] Make timeline server timeout settings configurable.	2020-08-24 18:09:00 -07:00
Prashant Wason	9b1f16b604	[HUDI-1136] Add back findInstantsAfterOrEquals to the HoodieTimeline class.	2020-08-24 18:08:17 -07:00
Bhavani Sudha Saktheeswaran	f7e02aa8a3	[MINOR] Update DOAP with 0.6.0 Release (#2024)	2020-08-24 14:47:38 -07:00
Satish Kotha	ea983ff912	[HUDI-1137] Add option to configure different path selector	2020-08-24 13:26:44 -07:00
Raymond Xu	111a9753a0	[MINOR] Update README.md (#2010) - add maven profile to test running commands - remove -DskipITs for packaging commands	2020-08-24 09:28:29 -07:00
Mathieu	f8dcd5334e	[HUDI-1217] Improve avroToBytes method of HoodieAvroUtils (#2018)	2020-08-24 17:33:28 +08:00
Mathieu	35b21855da	[HUDI-1150] Fix unable to parse input partition field :1 exception when using TimestampBasedKeyGenerator (#1920)	2020-08-23 19:56:50 +08:00
Trevor	7291607ae3	[MINOR] Remove unused log code in HoodieReadClient (#2000)	2020-08-22 21:45:50 +08:00
Shen Hong	1d09c02f1c	[HUDI-1083] Optimization in determining insert bucket location for a given key (#1868) - To determine insert bucket location for a given key, hudi walks through all insert buckets with O(N) cost, while this patch adds an optimization to make it O(logN).	2020-08-22 07:41:39 -04:00
liujinhui	bfdce7b082	[HUDI-1193](Upgrade http dependency version) (#1970)	2020-08-21 20:24:04 +08:00
Raymond Xu	3a2ae16961	[HUDI-781] Introduce HoodieTestTable for test preparation (#1997)	2020-08-21 11:46:33 +08:00
Mathieu	34c8c9e3ea	[MINOR] Move HoodieUpgradeDowngradeException to exception package (#1993)	2020-08-20 23:12:20 +08:00
Mathieu	b883b6d268	[HUDI-1122] Introduce a kafka implementation of hoodie write commit ca… (#1886)	2020-08-20 23:00:59 +08:00
Mathieu	bd7814dadf	[HUDI-1206] Remove unused variable in Compactor (#1994)	2020-08-20 18:18:36 +08:00
Pratyaksh Sharma	a2312fa1b7	[HUDI-1177]: fixed TaskNotSerializableException in TimestampBasedKeyGenerator (#1987) Co-authored-by: Bhavani Sudha Saktheeswaran <bhavanisudhas@gmail.com>	2020-08-19 17:43:34 -07:00
Ryan Pifer	1137b0b343	Fix HBASE index MOR tables not considering record index valid	2020-08-19 14:55:59 -07:00
Bhavani Sudha Saktheeswaran	6fa371a79c	[MINOR] Fix release script for onetime uploading of gpgkeys (#1949)	2020-08-18 21:29:52 -07:00
Bhavani Sudha Saktheeswaran	824f23bcb8	[HUDI-1197] Fix import issue that fails scala 2.12 build (#1976)	2020-08-18 08:41:16 -07:00
Abhishek Modi	bedbb825e0	[HUDI-1025] Meter RPC calls in HoodieWrapperFileSystem (#1916)	2020-08-18 22:42:05 +08:00
Bhavani Sudha Saktheeswaran	4226d75144	Moving to 0.6.1-SNAPSHOT on master branch.	2020-08-14 12:54:15 -07:00
Balaji Varadarajan	b8f4a30efd	Fix Integration test flakiness in HoodieJavaStreamingApp (#1967)	2020-08-14 01:42:15 -07:00
vinoth chandar	9bde6d616c	[HUDI-1190] Introduce @PublicAPIClass and @PublicAPIMethod annotations to mark public APIs (#1965) - Maturity levels one of : evolving, stable, deprecated - Took a pass and marked out most of the existing public API	2020-08-13 23:28:17 -07:00
Sivabalan Narayanan	379cf0786f	[HUDI-1013] Adding Bulk Insert V2 implementation (#1834) - Adding ability to use native spark row writing for bulk_insert - Controlled by `ENABLE_ROW_WRITER_OPT_KEY` datasource write option - Introduced KeyGeneratorInterface in hudi-client, moved KeyGenerator back to hudi-spark - Simplified the new API additions to just two new methods : getRecordKey(row), getPartitionPath(row) - Fixed all built-in key generators with new APIs - Made the field position map lazily created upon the first call to row based apis - Implemented native row based key generators for CustomKeyGenerator - Fixed all the tests, with these new APIs Co-authored-by: Balaji Varadarajan <varadarb@uber.com> Co-authored-by: Vinoth Chandar <vinoth@apache.org>	2020-08-13 00:33:39 -07:00
Udit Mehrotra	8d04268264	[HUDI-1174] Changes for bootstrapped tables to work with presto (#1944) The purpose of this pull request is to implement changes required on Hudi side to get Bootstrapped tables integrated with Presto. The testing was done against presto 0.232 and following changes were identified to make it work: Annotation UseRecordReaderFromInputFormat is required on HoodieParquetInputFormat as well, because the reading for bootstrapped tables needs to happen through record reader to be able to perform the merge. On presto side, this annotation is already handled. We need to internally maintain VIRTUAL_COLUMN_NAMES because presto's internal hive version hive-apache-1.2.2 has VirtualColumn as a class, versus the one we depend on in hudi which is an enum. Dependency changes in hudi-presto-bundle to avoid runtime exceptions.	2020-08-12 17:51:31 -07:00
wenningd	8b928e9bca	[HUDI-808] Support cleaning bootstrap source data (#1870) Co-authored-by: Wenning Ding <wenningd@amazon.com> Co-authored-by: Balaji Varadarajan <vbalaji@apache.org>	2020-08-11 01:43:46 -07:00
Balaji Varadarajan	626f78f6f6	Revert "[HUDI-781] Introduce HoodieTestTable for test preparation (#1871)" This reverts commit `b2e703d442`.	2020-08-10 22:13:02 -07:00
Sivabalan Narayanan	9c24151929	[HUDI-1175] Commenting out testsuite tests from Integration tests until we investigate the CI flakiness (#1945)	2020-08-10 21:00:57 -07:00
Raymond Xu	b2e703d442	[HUDI-781] Introduce HoodieTestTable for test preparation (#1871)	2020-08-11 09:44:03 +08:00
liujinhui	934f00b689	[HUDI-1173] fix hudi-prometheus pom dependency (#1942)	2020-08-11 09:06:17 +08:00
Sivabalan Narayanan	858eda85d7	[HUDI-1098] Adding OptimisticConsistencyGuard to be used during FinalizeWrite (#1912)	2020-08-09 17:51:37 -07:00
Sivabalan Narayanan	ff53e8f0b6	[HUDI-1014] Adding Upgrade and downgrade infra for smooth transitioning from list based rollback to marker based rollback (#1858) - This pull request adds upgrade/downgrade infra for smooth transition from list based rollback to marker based rollback - A new property called hoodie.table.version is added to hoodie.properties file as part of this. Whenever hoodie is launched with newer table version, i.e. 1 (or moving from pre 0.6.0 to 0.6.0), an upgrade step will be executed automatically to adhere to marker based rollback. - This automatic upgrade step will happen just once per dataset, as the hoodie.table.version will be updated in property file after upgrade is completed once - Similarly, a command line tool for downgrading is added in case a user wants to downgrade hoodie from table version 1 to 0 or move from hoodie 0.6.0 to pre 0.6.0 - Added UpgradeDowngrade to assist in upgrading or downgrading hoodie table - Added Interfaces for upgrade and downgrade and concrete implementations for upgrading from 0 to 1 and downgrading from 1 to 0. - Made some changes to ListingBasedRollbackHelper to expose just rollback stats w/o performing actual rollback, which will be consumed by Upgrade infra - Reworking failure handling for upgrade/downgrade - Changed tests accordingly, added one test around left over cleanup - New tables now write table version into hoodie.properties - Clean up code naming, abstractions. Co-authored-by: Vinoth Chandar <vinoth@apache.org>	2020-08-09 15:32:43 -07:00
Udit Mehrotra	e4a2d98f79	[HUDI-426] Bootstrap datasource integration (#1702)	2020-08-09 14:06:13 -07:00
linshan-ma	c24c528fb7	[HUDI-1156] Remove unused dependencies from HoodieDeltaStreamerWrapper Class (#1927)	2020-08-09 17:09:28 +08:00
liujinhui	6b349b7711	[HUDI-210] Hudi Supports Prometheus Pushgateway (#1931) Co-authored-by: leesf <leesf@apache.org>	2020-08-09 15:29:54 +08:00
Bhavani Sudha Saktheeswaran	3c949d2ff5	[MINOR] Fix path to hudi-hive-sync-bundle jars from run_sync_tool.sh (#1937)	2020-08-09 00:45:10 -04:00
wenningd	9fe2d2b14a	[HUDI-427] [HUDI-971] Implement CLI support for performing bootstrap (#1869) * [HUDI-971] Clean partitions & fileIds returned by HFileBootstrapIndex * [HUDI-427] Implement CLI support for performing bootstrap Co-authored-by: Wenning Ding <wenningd@amazon.com> Co-authored-by: Balaji Varadarajan <vbalaji@apache.org>	2020-08-08 12:37:29 -07:00
Raymond Xu	5ee676e34f	[MINOR] Move a test method to Transformations (#1934) - Move TestHoodieKeyLocationFetchHandle#getRecordsPerPartition to Transformations - Improve some var namings	2020-08-08 18:25:55 +08:00
cheshta2904	1072f2748a	[HUDI-1026] Removed slf4j dependency from HoodieClientTestHarness (#1928)	2020-08-08 12:07:22 +08:00
Yungthuis	8b66524090	[MINOR] Remove unused import (#1932) Co-authored-by: tom_glb <goodMorning_glb@hotmail.com>	2020-08-08 12:04:31 +08:00
Gary Li	4f74a84607	[HUDI-69] Support Spark Datasource for MOR table - RDD approach (#1848) - This PR implements Spark Datasource for MOR table in the RDD approach. - Implemented SnapshotRelation - Implemented HudiMergeOnReadRDD - Implemented separate Iterator to handle merge and unmerge record reader. - Added TestMORDataSource to verify this feature. - Clean up test file name, add tests for mixed query type tests - We can now revert the change made in DefaultSource Co-authored-by: Vinoth Chandar <vchandar@confluent.io>	2020-08-07 00:28:14 -07:00
Udit Mehrotra	ab453f2623	[HUDI-999] [RFC-12] Parallelize fetching of source data files/partitions (#1924)	2020-08-06 23:44:57 -07:00
Mathieu	b51646dcc7	[HUDI-1151] Fix NPE when no new data in kafka using HoodieDeltaStreamer (#1921)	2020-08-07 00:03:20 +08:00
lw0090	51ea27d665	[HUDI-875] Abstract hudi-sync-common, and support hudi-hive-sync, hudi-dla-sync (#1810) - Generalize the hive-sync module for syncing to multiple metastores - Added new options for datasource - Added new command line for delta streamer Co-authored-by: Vinoth Chandar <vinoth@apache.org>	2020-08-05 21:34:55 -07:00
Prashant Wason	c21209cb58	[HUDI-1149] Added a console metrics reporter and associated unit tests.	2020-08-05 10:31:46 -07:00
Balaji Varadarajan	9bcd3221fd	[HUDI-1144] Speedup spark read queries by caching metaclient in HoodieROPathFilter (#1919)	2020-08-05 09:19:10 -07:00
Balaji Varadarajan	7a2429f5ba	[HUDI-575] Spark Streaming with async compaction support (#1752)	2020-08-05 07:50:15 -07:00
Balaji Varadarajan	61e027fadd	[MINOR] Adding timeout for each command execution in docker and capture output. This will help get stdout/stderr of stuck commands (#1918)	2020-08-05 07:46:34 -07:00
Sreeram Ramji	217a84192c	[HUDI-1140] Fix Jcommander issue for --hoodie-conf in DeltaStreamer (#1898)	2020-08-04 21:42:51 -07:00
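
Note on the [HUDI-1083] entry in the list above: its message describes replacing an O(N) walk over insert buckets with an O(logN) lookup. The sketch below illustrates one common way to get that complexity, a binary search over cumulative bucket weights. It is not taken from the Hudi code base. The class name `InsertBucketLookup`, its methods, and the MD5-based mapping of a key to a position in [0, 1) are all hypothetical and chosen only for illustration.

```python
import bisect
import hashlib
from itertools import accumulate


class InsertBucketLookup:
    """Hypothetical helper: maps a key to one of several weighted buckets."""

    def __init__(self, weights):
        if not weights or any(w <= 0 for w in weights):
            raise ValueError("weights must be a non-empty list of positive numbers")
        total = float(sum(weights))
        # Precompute normalized cumulative weights once, e.g. [2, 3, 5] -> [0.2, 0.5, 1.0].
        self._cumulative = [c / total for c in accumulate(weights)]
        self._cumulative[-1] = 1.0  # guard against floating-point drift

    @staticmethod
    def _key_to_position(key):
        # Assumption for this sketch: hash the key to a stable number in [0, 1].
        digest = hashlib.md5(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") / 2**64

    def bucket_for_position(self, position):
        # Binary search: index of the first cumulative weight greater than position.
        index = bisect.bisect_right(self._cumulative, position)
        # Clamp in case position rounds up to exactly 1.0.
        return min(index, len(self._cumulative) - 1)

    def bucket_for(self, key):
        return self.bucket_for_position(self._key_to_position(key))


lookup = InsertBucketLookup([2, 3, 5])
print(lookup.bucket_for_position(0.10))  # falls in the first bucket's range
print(lookup.bucket_for("some-record-key"))
```

The cumulative list is built once in O(N); each lookup afterwards costs O(log N) instead of scanning every bucket.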

1115 Commits