lanyuanxiaoyao/hudi

Author	SHA1	Message	Date
leesf	26684f5984	[HUDI-816] Fixed MAX_MEMORY_FOR_MERGE_PROP and MAX_MEMORY_FOR_COMPACTION_PROP do not work due to HUDI-678 (#1536)	2020-04-22 16:33:18 +08:00
Raymond Xu	6e15eebd81	[HUDI-809] Migrate CommonTestHarness to JUnit 5 (#1530)	2020-04-22 14:10:25 +08:00
Alexander Filipchik	2a56f82908	[HUDI-821] Fixing JCommander param parsing in deltastreamer (#1525) Co-authored-by: Alex Filipchik <alex.filipchik@csscompany.com>	2020-04-21 20:12:34 -07:00
Prashant Wason	62bd3e7ded	[HUDI-757] Added hudi-cli command to export metadata of Instants. Example: hudi:db.table-> export instants --localFolder /tmp/ --limit 5 --actions clean,rollback,commit --desc false	2020-04-21 12:41:19 -07:00
hongdd	84dd9047d3	[HUDI-789] Adjust logic of upsert in HDFSParquetImporter (#1511)	2020-04-21 14:21:30 +08:00
n3nash	332072bc6d	[HUDI-371] Supporting hive combine input format for realtime tables (#1503)	2020-04-20 20:40:06 -07:00
Mathieu	2a2f31d919	[MINOR] Remove redundant code and fix typo in HoodieDefaultTimeline (#1535)	2020-04-21 09:40:22 +08:00
Dongwook	ddd105bb31	[HUDI-772] Make UserDefinedBulkInsertPartitioner configurable for DataSource (#1500)	2020-04-20 08:38:18 -07:00
lw0090	09fd6f64c5	[HUDI-800] Fix Metrics getReporter().close() throws NPE. (#1529)	2020-04-19 21:33:07 +08:00
baobaoyeye	75523657a4	[MINOR] use Option and fix description in toString method (#1527) * [MINOR] fix some places are not elegant, as a newcomer	2020-04-18 12:51:37 +08:00
Alexander Filipchik	acb1ada2f7	[HUDI-799] Use appropriate FS when loading configs (#1517) Co-authored-by: Alex Filipchik <alex.filipchik@csscompany.com>	2020-04-16 13:49:39 -07:00
Raymond Xu	acdc4a8d00	[HUDI-798] Migrate to Mockito Jupiter for JUnit 5 (#1521)	2020-04-16 16:07:32 +08:00
Prashant Wason	19d29ac7d0	[HUDI-741] Added checks to validate Hoodie's schema evolution. HUDI-specific validation of schema evolution should ensure that a newer schema can be used for the dataset by checking that the data written using the old schema can be read using the new schema. Code changes: 1. Added a new config in HoodieWriteConfig to enable schema validation check (disabled by default) 2. Moved code that reads schema from base/log files into hudi-common from hudi-hive-sync 3. Added writerSchema to the extraMetadata of compaction commits in MOR table. This is the same as that for commits on COW table. Testing changes: 4. Extended TestHoodieClientBase to add insertBatch API which allows inserting a new batch of unique records into a HUDI table 5. Added a unit test to verify schema evolution for both COW and MOR tables. 6. Added unit tests for schema compatibility checks.	2020-04-15 23:34:59 -07:00
Iftach Schonbaum	9ca710cb02	[HUDI-777] Updated description for --target-table parameter (#1519)	2020-04-15 14:56:13 -07:00
Raymond Xu	d65efe659d	[HUDI-780] Migrate test cases to JUnit 5 (#1504)	2020-04-15 12:35:01 -07:00
Gary Li	14d4fea833	[HUDI-759] Integrate checkpoint provider with delta streamer (#1486)	2020-04-14 14:51:04 -07:00
hongdd	644c1cc8bd	[HUDI-698] Add unit test for CleansCommand (#1449)	2020-04-14 17:54:47 +08:00
vinoth chandar	661b0b3bab	[HUDI-761] Refactoring rollback and restore actions using the ActionExecutor abstraction (#1492) - rollback() and restore() table level APIs introduced - Restore is implemented by wrapping calls to rollback executor - Existing tests transparently cover this, since it's just a refactor	2020-04-13 08:29:19 -07:00
Balaji Varadarajan	17bf930342	[HUDI-770] Organize upsert/insert API implementation under a single package (#1495)	2020-04-12 23:11:00 -07:00
Sivabalan Narayanan	447ba3bae6	[MINOR] Disabling flaky test in InlineFileSystem (#1510)	2020-04-12 19:38:56 -07:00
Pratyaksh Sharma	6d7ca2cf7e	[HUDI-727]: Copy default values of fields if not present when rewriting incoming record with new schema (#1427)	2020-04-12 17:55:26 -07:00
Shen Hong	5d717a28f4	[HUDI-782] Add support of Aliyun object storage service. (#1506)	2020-04-12 10:06:30 +08:00
hongdd	a464a2972e	[HUDI-700] Add unit test for FileSystemViewCommand (#1490)	2020-04-11 10:12:21 +08:00
satishkotha	c0f96e0726	[HUDI-687] Stop incremental reader on RO table when there is a pending compaction (#1396)	2020-04-10 10:45:41 -07:00
Bhavani Sudha Saktheeswaran	8c7cef3e50	[HUDI-738] Add validation to DeltaStreamer to fail fast when filterDupes is enabled on UPSERT mode. (#1505) Summary: This fix ensures for UPSERT operation, '--filter-dupes' is disabled and fails fast if not. Otherwise it would drop all updates silently and only take in new records.	2020-04-10 08:58:55 -07:00
Ramachandran Madtas Subramaniam	f5f34bb1c1	[HUDI-568] Improve unit test coverage Classes improved: * HoodieTableMetaClient * RocksDBDAO * HoodieRealtimeFileSplit	2020-04-09 10:15:34 -07:00
Abhishek Modi	996f761232	Trying git merge --squash	2020-04-09 08:18:02 -07:00
Satish Kotha	3c803421e0	rename variable per review comments	2020-04-08 21:56:59 -07:00
Satish Kotha	1f6be820f3	[HUDI-758] Modify Integration test to include incremental queries for MOR tables	2020-04-08 21:56:59 -07:00
Jiayi Liao	f7b55afb74	[MINOR] Fix typo in TimelineService (#1497) Co-authored-by: Jiayi Liao <bupt_ljy@163.com>	2020-04-08 18:14:50 -07:00
hongdd	4e5c8671ef	[HUDI-740] Fix can not specify the sparkMaster and code clean for SparkUtil (#1452)	2020-04-08 21:33:15 +08:00
Pratyaksh Sharma	d610252d6b	[HUDI-288]: Add support for ingesting multiple kafka streams in a single DeltaStreamer deployment (#1150)	2020-04-07 16:10:26 -07:00
Zhiyuan Zhao	b5d093a21b	[MINOR] Clear up the redundant comment. (#1489)	2020-04-06 16:31:54 +08:00
vinoth chandar	eaf6cc2d90	[HUDI-756] Organize Cleaning Action execution into a single package in hudi-client (#1485) - Introduced a thin abstraction ActionExecutor, that all actions will implement - Pulled cleaning code from table, writeclient into a single package - CleanHelper is now CleanPlanner, HoodieCleanClient is no longer around - Minor refactor of HoodieTable factory method - HoodieTable.create() methods with and without metaclient passed in - HoodieTable constructor now does not do a redundant instantiation - Fixed existing unit tests to work at the HoodieWriteClient level	2020-04-04 00:07:34 -07:00
YanJia-Gary-Li	575d87cf7d	HUDI-644 kafka connect checkpoint provider (#1453)	2020-04-03 18:57:34 -07:00
Prashant Wason	deb95ad996	[HUDI-748] Adding .codecov.yml to set exclusions for code coverage reports. (#1468)	2020-04-03 16:25:01 -07:00
Prashant Wason	6808559b01	[HUDI-717] Fixed usage of HiveDriver for DDL statements. (#1416) When using HiveDriver mode in HudiHiveClient, Hive 2.x DDL operations like ALTER PARTITION may fail. This is because Hive 2.x doesn't like `db`.`table_name` for operations. In this fix, we set the name of the database in the SessionState create for the Driver.	2020-04-03 16:23:05 -07:00
Ramachandran Madtas Subramaniam	639ec20412	[HUDI-562] Enable testing at debug log level. This is to ensure that tests will execute all code paths, even the ones written under DEBUG log levels. This will improve coverage as well as ensure there are no surprises when DEBUG log level is enabled in production.	2020-04-02 11:14:35 -07:00
yanghua	bd716ece18	[MINOR] Add license header for .asf.yaml and adjust labels	2020-04-02 16:14:35 +08:00
vinoyang	194e20e661	[MINOR] Fix label issue in .asf.yaml (#1478)	2020-04-02 15:51:51 +08:00
Raymond Xu	5b53b0d85e	[HUDI-731] Add ChainedTransformer (#1440)	2020-04-01 23:21:31 +08:00
Trevor	2a611f4ad3	[HUDI-749] Fix hudi-timeline-server-bundle run_server.sh start error (#1477)	2020-04-01 22:19:54 +08:00
vinoyang	c146ca90fd	[HUDI-754] Configure .asf.yaml for Hudi Github repository (#1472)	2020-04-01 10:02:47 +08:00
Shaofeng Shi	78b3194e82	[HUDI-751] Fix some coding issues reported by FindBugs (#1470)	2020-03-31 21:19:32 +08:00
Edwin Guo	9ecf0ccfb2	[HUDI-742] Fix Java Math Exception (#1466)	2020-03-31 12:56:20 +08:00
wenningd	ce0a4c64d0	[HUDI-713] Fix conversion of Spark array of struct type to Avro schema (#1406) Co-authored-by: Wenning Ding <wenningd@amazon.com>	2020-03-30 15:52:15 -07:00
lamber-ken	dbc9acd23a	[HUDI-716] Exception: Not an Avro data file when running HoodieCleanClient.runClean (#1432)	2020-03-30 11:19:17 -07:00
Prashant Wason	9f51b99174	[MINOR] Updated HoodieMergeOnReadTestUtils for future testing requirements (#1456) 1. getRecordsUsingInputFormat() can take a custom Configuration which can be used to specify HUDI table properties (e.g. <table>.consume.mode or <table>.consume.start.timestamp) 2. Fixed the return to return an empty List rather than raise an Exception if no records are found	2020-03-30 07:36:12 -07:00
ffcchi	1f5b0c77d6	[HUDI-724] Parallelize getSmallFiles for partitions (#1421) Co-authored-by: Feichi Feng <feicfeng@amazon.com>	2020-03-30 00:14:38 -07:00
Suneel Marthi	fa36082554	[HUDI-746] Reduce build warnings < 10 (#1465)	2020-03-30 11:46:52 +08:00

917 Commits