lanyuanxiaoyao/hudi - hudi - Gitea: Git with a cup of tea

Author	SHA1	Message	Date
fengli	568181a3e7	[HUDI-2934] Optimize RequestHandler code style close apache/hudi#4215	2021-12-04 15:30:52 +08:00
yuzhao.cyz	a1d0ff4209	Moving to 0.11.0-SNAPSHOT on master branch.	2021-11-27 17:22:10 +08:00
Y Ethan Guo	d1e83e4ba0	[HUDI-2767] Enabling timeline-server-based marker as default (#4112 ) - Changes the default config of marker type (HoodieWriteConfig.MARKERS_TYPE or hoodie.write.markers.type) from DIRECT to TIMELINE_SERVER_BASED for Spark Engine. - Adds engine-specific marker type configs: Spark -> TIMELINE_SERVER_BASED, Flink -> DIRECT, Java -> DIRECT. - Uses DIRECT markers as well for Spark structured streaming due to timeline server only available for the first mini-batch. - Fixes the marker creation method for non-partitioned table in TimelineServerBasedWriteMarkers. - Adds the fallback to direct markers even when TIMELINE_SERVER_BASED is configured, in WriteMarkersFactory: when HDFS is used, or embedded timeline server is disabled, the fallback to direct markers happens. - Fixes the closing of timeline service. - Fixes tests that depend on markers, mainly by starting the timeline service for each test.	2021-11-26 16:41:05 -05:00
Sivabalan Narayanan	f692078d32	[HUDI-2671] Making error -> warn logs from timeline server with concurrent writers for inconsistent state (#4088 ) * Making error -> warn logs from timeline server with concurrent writers for inconsistent state * Fixing bad request response exception for timeline out of sync * Addressing feedback. removed write concurrency mode depedency	2021-11-25 11:21:32 -08:00
vinoyang	220bf6a7e6	[HUDI-2600] Remove duplicated hadoop-common with tests classifier exists in bundles (#3847 )	2021-10-25 13:45:28 +08:00
Y Ethan Guo	56d08fbe70	[HUDI-2351] Extract common FS and IO utils for marker mechanism (#3529 )	2021-09-09 14:45:28 -04:00
Udit Mehrotra	c350d05dd3	Restore 0.8.0 config keys with deprecated annotation (#3506 ) Co-authored-by: Sagar Sumit <sagarsumit09@gmail.com> Co-authored-by: Vinoth Chandar <vinoth@apache.org>	2021-08-19 13:36:40 -07:00
Udit Mehrotra	3e301196bf	Moving to 0.10.0-SNAPSHOT on master branch.	2021-08-14 18:51:09 -07:00
Y Ethan Guo	23dca6c237	[HUDI-2268] Add upgrade and downgrade to and from 0.9.0 (#3470 ) - Added upgrade and downgrade step to and from 0.9.0. Upgrade adds few table properties. Downgrade recreates timeline server based marker files if any.	2021-08-14 20:20:23 -04:00
Y Ethan Guo	9056c68744	[HUDI-2305] Add MARKERS.type and fix marker-based rollback (#3472 ) - Rollback infers the directory structure and does rollback based on the strategy used while markers were written. "write markers type" in write config is used to determine marker strategy only for new writes.	2021-08-14 08:18:49 -04:00
Y Ethan Guo	4783176554	[HUDI-1138] Add timeline-server-based marker file strategy for improving marker-related latency (#3233 ) - Can be enabled for cloud stores like S3. Not supported for hdfs yet, due to partial write failures.	2021-08-11 11:48:13 -04:00
rmahindra123	8fef50e237	[HUDI-2044] Integrate consumers with rocksDB and compression within External Spillable Map (#3318 )	2021-07-28 01:31:03 -04:00
wenningd	d412fb2fe6	[HUDI-89] Add configOption & refactor all configs based on that (#2833 ) Co-authored-by: Wenning Ding <wenningd@amazon.com>	2021-06-30 14:26:30 -07:00
Volodymyr Burenin	8a48d16e41	[HUDI-1707] Reduces log level for too verbose messages from info to debug level. (#2714 ) * Reduces log level for too verbose messages from info to debug level. * Sort config output. * Code Review : Small restructuring + rebasing to master - Fixing flaky multi delta streamer test - Using isDebugEnabled() checks - Some changes to shorten log message without moving to DEBUG Co-authored-by: volodymyr.burenin <volodymyr.burenin@cloudkitchens.com> Co-authored-by: Vinoth Chandar <vinoth@apache.org>	2021-05-10 07:16:02 -07:00
garyli1019	6e803e08b1	Moving to 0.9.0-SNAPSHOT on master branch.	2021-03-24 21:37:14 +08:00
Prashant Wason	f11a6c7b2d	[HUDI-1553] Configuration and metrics for the TimelineService. (#2495 )	2021-03-02 21:58:41 -08:00
vinoyang	50ff9ab2d2	[MINOR] Rename FileSystemViewHandler to RequestHandler and corrected the class comment (#2458 )	2021-02-02 09:15:53 -08:00
Vinoth Chandar	3719e7b388	Moving to 0.8.0-SNAPSHOT on master branch.	2021-01-20 11:31:22 -08:00
vinoth chandar	5ca0625b27	[HUDI 1308] Harden RFC-15 Implementation based on production testing (#2441 ) Addresses leaks, perf degradation observed during testing. These were regressions from the original rfc-15 PoC implementation. * Pass a single instance of HoodieTableMetadata everywhere * Fix tests and add config for enabling metrics - Removed special casing of assumeDatePartitioning inside FSUtils#getAllPartitionPaths() - Consequently, IOException is never thrown and many files had to be adjusted - More diligent handling of open file handles in metadata table - Added config for controlling reuse of connections - Added config for turning off fallback to listing, so we can see tests fail - Changed all ipf listing code to cache/amortize the open/close for better performance - Timelineserver also reuses connections, for better performance - Without timelineserver, when metadata table is opened from executors, reuse is not allowed - HoodieMetadataConfig passed into HoodieTableMetadata#create as argument. - Fix TestHoodieBackedTableMetadata#testSync	2021-01-19 21:20:28 -08:00
Sivabalan Narayanan	a43e191d6c	[MINOR] Bumping snapshot version to 0.7.0 (#2435 )	2021-01-16 09:56:28 -05:00
vinoth chandar	65866c45ec	[HUDI-1276] [HUDI-1459] Make Clustering/ReplaceCommit and Metadata table be compatible (#2422 ) * [HUDI-1276] [HUDI-1459] Make Clustering/ReplaceCommit and Metadata table be compatible * Use filesystemview and json format from metadata. Add tests Co-authored-by: Satish Kotha <satishkotha@uber.com>	2021-01-09 16:53:34 -08:00
satishkotha	33ec88fc38	[HUDI-1352] Add FileSystemView APIs to query pending clustering operations (#2202 )	2020-11-05 08:49:58 -08:00
lw0090	fdae388626	[HUDI-1203] add port configuration for EmbeddedTimelineService (#2142 )	2020-10-05 11:36:54 -07:00
satishkotha	a99e93bed5	[HUDI-1072] Introduce REPLACE top level action. Implement insert_overwrite operation on top of replace action (#2048 )	2020-09-29 17:04:25 -07:00
Bhavani Sudha Saktheeswaran	4226d75144	Moving to 0.6.1-SNAPSHOT on master branch.	2020-08-14 12:54:15 -07:00
Raymond Xu	31247e9b34	[HUDI-896] Report test coverage by modules & parallelize CI (#1753 ) - use codecov flags for each module to report coverage - parallelize CI jobs for shorter time - add a testcase for MetricsReporterFactory (to trigger codecov comment)	2020-06-27 23:16:12 -07:00
garyli1019	e9cab67b80	[HUDI-988] Fix More Unit Test Flakiness	2020-06-07 23:14:46 -07:00
cxzl25	3574a89232	[HUDI-973] RemoteHoodieTableFileSystemView supports non-partitioned table queries (#1674 )	2020-05-28 10:50:47 -07:00
Raymond Xu	0d4848b68b	[HUDI-811] Restructure test packages (#1607 ) * restructure hudi-spark tests * restructure hudi-timeline-service tests * restructure hudi-hadoop-mr hudi-utilities tests * restructure hudi-hive-sync tests	2020-05-13 15:37:03 -07:00
Raymond Xu	acdc4a8d00	[HUDI-798] Migrate to Mockito Jupiter for JUnit 5 (#1521 )	2020-04-16 16:07:32 +08:00
Raymond Xu	d65efe659d	[HUDI-780] Migrate test cases to Junit 5 (#1504 )	2020-04-15 12:35:01 -07:00
Jiayi Liao	f7b55afb74	[MINOR] Fix typo in TimelineService (#1497 ) Co-authored-by: Jiayi Liao <bupt_ljy@163.com>	2020-04-08 18:14:50 -07:00
Ramachandran Madtas Subramaniam	639ec20412	[HUDI-562] Enable testing at debug log level This is to ensure that tests will execute all code paths, even the ones written under DEBUG log levels. This will improve coverage as well as ensure there are no surprised when DEBUG log level is enabled in production.	2020-04-02 11:14:35 -07:00
Suneel Marthi	fa36082554	[HUDI-746] Reduce build warnings < 10 (#1465 )	2020-03-30 11:46:52 +08:00
vinoth chandar	e057c27603	[HUDI-744] Restructure hudi-common and clean up files under util packages (#1462 ) - Brings more order and cohesion to the classes in hudi-common - Utils classes related to a particular concept (avro, timeline,...) are placed near to the package - common.fs package now contains all the filesystem level classes including wrapper filesystem - bloom.filter package renamed to just bloom - config package contains classes that help store properties - common.fs.inline package contains all the inline filesystem classes/impl - common.table.timeline now consolidates all timeline related classes - common.table.view consolidates all the classes related to filesystem view metadata - common.table.timeline.versioning contains all classes related to versioning of timeline - Fix few unit tests as a result - Moved the test packages around to match the source file move - Rename AvroUtils to TimelineMetadataUtils & minor fixes/typos	2020-03-29 10:58:49 -07:00
Suneel Marthi	99b7e9eb9e	[HUDI-629]: Replace Guava's Hashing with an equivalent in NumericUtils.java (#1350 ) * [HUDI-629]: Replace Guava's Hashing with an equivalent in NumericUtils.java	2020-03-13 20:28:05 -04:00
yanghua	0dc8e493aa	Moving to 0.6.0-SNAPSHOT on master branch.	2020-03-01 15:08:30 +08:00
Ramachandran M S	acf359c834	[HUDI-627] Aggregate code coverage and publish to codecov.io during CI (#1347 )	2020-02-27 13:54:20 -08:00
Suneel Marthi	5b7bb142dc	[HUDI-583] Code Cleanup, remove redundant code, and other changes (#1237 )	2020-02-02 18:03:44 +08:00
leesf	6e59c1c777	Moving to 0.5.2-SNAPSHOT on master branch.	2020-01-20 10:51:33 -08:00
vinoth chandar	c2c0f6b13d	[HUDI-509] Renaming code in sync with cWiki restructuring (#1212 ) - Storage Type replaced with Table Type (remaining instances) - View types replaced with query types; - ReadOptimized view referred as Snapshot Query - TableFileSystemView sub interfaces renamed to BaseFileOnly and Slice Views - HoodieDataFile renamed to HoodieBaseFile - Hive Sync tool will register RO tables for MOR with a `_ro` suffix - Datasource/Deltastreamer options renamed accordingly - Support fallback to old config values as well, so migration is painless - Config for controlling _ro suffix addition - Renaming DataFile to BaseFile across DTOs, HoodieFileSlice and AbstractTableFileSystemView	2020-01-16 23:58:47 -08:00
lamber-ken	d9675c4ec0	[HUDI-522] Use the same version jcommander uniformly (#1214 )	2020-01-12 10:48:52 -08:00
vinoth chandar	9706f659db	[HUDI-508] Standardizing on "Table" instead of "Dataset" across code (#1197 ) - Docs were talking about storage types before, cWiki moved to "Table" - Most of code already has HoodieTable, HoodieTableMetaClient - correct naming - Replacing renaming use of dataset across code/comments - Few usages in comments and use of Spark SQL DataSet remain unscathed	2020-01-07 12:52:32 -08:00
SteNicholas	def18a5086	[MINOR] optimize hudi timeline service (#1137 )	2019-12-25 14:40:25 -08:00
lamber-ken	d447e2d751	[checkstyle] Unify LOG form (#1092 )	2019-12-10 19:23:38 +08:00
lamber-ken	2745b7552f	[HUDI-379] Refactor the codes based on new JavadocStyle code style rule (#1079 )	2019-12-06 12:59:28 +08:00
lamber-ken	b3e0ebbc4a	[checkstyle] Add ConstantName java checkstyle rule (#1066 ) * add SimplifyBooleanExpression java checkstyle rule * collapse empty tags in scalastyle file	2019-12-04 18:59:15 +08:00
谢磊	f9139c0f61	[HUDI-366] Refactor some module codes based on new ImportOrder code style rule (#1055 ) [HUDI-366] Refactor hudi-hadoop-mr / hudi-timeline-service / hudi-spark / hudi-integ-test / hudi- utilities based on new ImportOrder code style rule	2019-11-27 21:32:43 +08:00
Balaji Varadarajan	1032fc3e54	[HUDI-137] Hudi cleaning state changes should be consistent with compaction actions Before this change, Cleaner performs cleaning of old file versions and then stores the deleted files in .clean files. With this setup, we will not be able to track file deletions if a cleaner fails after deleting files but before writing .clean metadata. This is fine for regular file-system view generation but Incremental timeline syncing relies on clean/commit/compaction metadata to keep a consistent file-system view. Cleaner state transitions is now similar to that of compaction. 1. Requested : HoodieWriteClient.scheduleClean() selects the list of files that needs to be deleted and stores them in metadata 2. Inflight : HoodieWriteClient marks the state to be inflight before it starts deleting 3. Completed : HoodieWriteClient marks the state after completing the deletion according to the cleaner plan	2019-11-11 10:40:16 -08:00
vinoth chandar	e4c91ed13f	[HUDI-290] Normalize test class name of all test classes (#951 )	2019-10-22 20:19:11 -07:00

1 2

64 Commits