Raymond Xu
ca36c44cb3
[HUDI-995] Move TestRawTripPayload and HoodieTestDataGenerator to hudi-common ( #1873 )
2020-07-27 19:21:45 +08:00
Shen Hong
c3279cd598
[HUDI-1082] Fix minor bug in deciding the insert buckets ( #1838 )
2020-07-23 08:31:49 -04:00
Mathieu
da106803b6
[HUDI-1037] Introduce a write committed callback hook and provide a default http callback implementation ( #1842 )
2020-07-23 19:07:05 +08:00
zherenyu831
c39778c150
[HUDI-1113] Add user-defined metrics reporter ( #1851 )
2020-07-23 13:46:36 +08:00
vinoth chandar
3dd189ec7d
[MINOR] Fix checkstyle issue on TestHoodieClientOnCopyOnWriteStorage ( #1865 )
2020-07-22 21:54:45 -07:00
vinoth chandar
a8bd76c299
[HUDI-1029] In inline compaction mode, previously failed compactions need to be retried before new compactions ( #1857 )
...
- Prevents failed compactions from causing issues with future commits
2020-07-22 21:22:06 -07:00
vinoth chandar
9bd37ef291
[MINOR] Fix flaky testUpsertsUpdatePartitionPath* tests ( #1863 )
2020-07-22 22:52:34 -04:00
Sivabalan Narayanan
5b6026ba43
[HUDI-802] Fixing deletes for inserts in same batch in write path ( #1792 )
...
* Fixing deletes for inserts in same batch in write path
* Fixing delta streamer tests
* Adding tests for OverwriteWithLatestAvroPayload
2020-07-22 19:39:57 -07:00
Raymond Xu
5e7ab11e2e
[HUDI-994] Move TestHoodieIndex test cases to unit tests ( #1850 )
2020-07-21 10:23:43 -07:00
lw0090
1ec89e9a94
[HUDI-839] Introducing support for rollbacks using marker files ( #1756 )
...
* [HUDI-839] Introducing rollback strategy using marker files
- Adds a new mechanism for rollbacks where it's based on the marker files generated during the write
- Consequently, marker file/dir deletion now happens post commit, instead of during finalize
- Marker files are also generated for AppendHandle, making it consistent throughout the write path
- Until upgrade-downgrade mechanism can upgrade non-marker based inflight writes to marker based, this should only be turned on for new datasets.
- Added marker dir deletion after successful commit/rollback, individual files are not deleted during finalize
- Fail safe for deleting marker directories, now during timeline archival process
- Added check to ensure completed instants are not rolled back using marker based strategy. This will be incorrect
- Reworked tests to rollback inflight instants, instead of completed instants whenever necessary
- Added a unit test for MarkerBasedRollbackStrategy
Co-authored-by: Vinoth Chandar <vinoth@apache.org>
2020-07-20 22:41:42 -07:00
Prashant Wason
b71f25f210
[HUDI-92] Provide reasonable names for Spark DAG stages in HUDI. ( #1289 )
2020-07-19 10:29:25 -07:00
Raymond Xu
b399b4ad43
[HUDI-996] Add functional test in hudi-client ( #1824 )
...
- Add functional test suite in hudi-client
- Tag TestHBaseIndex as functional
2020-07-15 08:28:50 +08:00
Raymond Xu
f5dc8ca733
[HUDI-994] Split TestHBaseIndex to unit tests ( #1818 )
...
- Refactor and improve TestHBaseIndex for performance
- Move HBaseIndex unit tests to different test classes
2020-07-13 20:32:01 -07:00
Sivabalan Narayanan
21bb1b505a
[HUDI-1068] Fixing deletes in global bloom when update partition path is set ( #1793 )
2020-07-13 22:34:07 -04:00
Raymond Xu
20ac7c3337
[HUDI-994] Make TestHBaseQPSResourceAllocator a unit test ( #1820 )
2020-07-11 09:15:05 -07:00
Raymond Xu
7b2a947aed
[HUDI-1069] Remove duplicate assertNoWriteErrors() ( #1797 )
2020-07-08 13:58:15 +08:00
Shen Hong
be85a6c32b
[HUDI-1004] Support update metrics in HoodieDeltaStreamerMetrics ( #1732 )
2020-07-06 09:44:02 -07:00
Raymond Xu
3b9a30528b
[HUDI-996] Add functional test suite for hudi-utilities ( #1746 )
...
- Share resources for functional tests
- Add suite for functional test classes from hudi-utilities
2020-07-05 16:44:31 -07:00
baobaoyeye
2be924fd3a
[HUDI-760] Remove Rolling Stat management from Hudi Writer ( #1739 )
2020-06-30 20:07:09 -07:00
Balaji Varadarajan
8919be6a5d
[HUDI-855] Run Cleaner async with writing ( #1577 )
...
- Cleaner can now run concurrently with write operation
- Configs to turn on/off
Co-authored-by: Vinoth Chandar <vinoth@apache.org>
2020-06-28 02:04:50 -07:00
Raymond Xu
31247e9b34
[HUDI-896] Report test coverage by modules & parallelize CI ( #1753 )
...
- use codecov flags for each module to report coverage
- parallelize CI jobs for shorter time
- add a testcase for MetricsReporterFactory (to trigger codecov comment)
2020-06-27 23:16:12 -07:00
Prashant Wason
2603cfb33e
[HUDI-684] Introduced abstraction for writing and reading different types of base file formats. ( #1687 )
...
Notable changes:
1. HoodieFileWriter and HoodieFileReader abstractions for writer/reader side of a base file format
2. HoodieDataBlock abstraction for creating specific data blocks for base file formats. (e.g. Parquet has HoodieAvroDataBlock)
3. All hardcoded references to Parquet / Parquet based classes have been abstracted to call methods which accept a base file format
4. HiveSyncTool accepts the base file format as a CLI parameter
5. HoodieDeltaStreamer accepts the base file format as a CLI parameter
6. HoodieSparkSqlWriter accepts the base file format as a parameter
2020-06-25 23:46:55 -07:00
wangxianghu
5e47673341
[HUDI-1035] Remove unused class KeyLookupResult ( #1754 )
2020-06-23 17:01:03 -07:00
Shen Hong
89e37d5273
[HUDI-908] Add some data types to HoodieTestDataGenerator and fix some bugs. ( #1690 )
2020-06-22 08:13:28 -07:00
wangxianghu
68a656b016
[HUDI-1032] Remove unused code in HoodieCopyOnWriteTable and code clean ( #1750 )
2020-06-21 07:34:47 -07:00
Raymond Xu
8a9fdd603e
[HUDI-1023] Add validation error messages in delta sync ( #1710 )
...
- Remove explicitly specifying BLOOM_INDEX since that's the default anyway
2020-06-19 12:12:35 -07:00
Satish Kotha
a7fd331624
Add unit test for snapshot reads in hadoop-mr
2020-06-13 10:23:05 -07:00
sathyaprakashg
df2e0c760e
HUDI-942 Increase default value number of delta commits for inline compaction ( #1664 )
...
Co-authored-by: Sathyaprakash Govindasamy <sathyaprakashg@zillowgroup.com>
2020-06-10 16:16:44 -07:00
Gary Li
37838cea60
[HUDI-822] decouple Hudi related logics from HoodieInputFormat ( #1592 )
...
- Refactoring business logic out of InputFormat into Utils helpers.
2020-06-09 06:10:16 -07:00
shenhong
3387b3841f
[HUDI-1005] fix NPE in HoodieWriteClient.clean
2020-06-09 05:57:04 -07:00
Shen Hong
6318e943d1
[HUDI-1016] Code optimization in MergeOnReadRollbackActionExecutor ( #1718 )
2020-06-09 19:14:26 +08:00
garyli1019
22cd824d99
HUDI-494 fix incorrect record size estimation
2020-06-08 20:29:29 -07:00
garyli1019
e9cab67b80
[HUDI-988] Fix More Unit Test Flakiness
2020-06-07 23:14:46 -07:00
Balaji Varadarajan
fb283934a3
[HUDI-990] Timeline API : filterCompletedAndCompactionInstants needs to handle requested state correctly. Also ensure timeline gets reloaded after we revert committed transactions
2020-06-04 02:52:21 -07:00
Balaji Varadarajan
a68180b179
[HUDI-988] Fix Unit Test Flakiness : Ensure all instantiations of HoodieWriteClient are closed properly. Fix bug in TestRollbacks. Make CLI unit tests for Hudi CLI check skip rendering strings
2020-06-04 02:52:21 -07:00
Raymond Xu
742c204099
[HUDI-811] Restructure test packages in hudi-client/cli ( #1689 )
2020-06-02 10:25:42 +08:00
dengziming
bde7a7043e
[HUDI-476]: Add hudi-examples module ( #1151 )
...
add hoodie delta streamer mock source example and dfs source and kafka source examples
Signed-off-by: dengziming <dengziming1993@gmail.com>
add defaultSparkConf utils method
change version of hudi-examples to 0.5.2-SNAPSHOT
change the artifactId of hudi-spark and hudi-utilities
alter some code to adapt to kafka2.0
Update scripts
Add license
2020-05-28 01:44:39 +08:00
Raymond Xu
03f136361a
[HUDI-811] Restructure test packages in hudi-common ( #1644 )
...
* [HUDI-811] Restructure test packages in hudi-common
2020-05-27 16:28:17 +08:00
sathyaprakashg
d3edac4612
HUDI-921 Remove inlineCompactionEvery method in HoodieCompactionConfig.Builder ( #1654 )
...
Co-authored-by: Sathyaprakash Govindasamy <sathyaprakashg@zillowgroup.com>
2020-05-24 01:09:18 -07:00
Raymond Xu
f34de3fb27
[HUDI-836] Implement datadog metrics reporter ( #1572 )
...
- Adds support for emitting metrics to datadog
- Tests, configs..
2020-05-22 09:14:21 -07:00
Balaji Varadarajan
74ecc27e92
[HUDI-846][HUDI-848] Enable Incremental cleaning and embedded timeline-server by default ( #1634 )
2020-05-20 05:29:43 -07:00
Raymond Xu
f802d4400b
[MINOR] Fix resource cleanup in TestTableSchemaEvolution ( #1640 )
...
- Remove Xms; it is not needed.
- extending process exit timeout from 30 to 120 sec should be safe to do
2020-05-20 05:07:30 -07:00
Balaji Varadarajan
e6f3bf10cf
[HUDI-858] Allow multiple operations to be executed within a single commit ( #1633 )
2020-05-18 19:27:24 -07:00
Sivabalan Narayanan
29edf4b3b8
[HUDI-407] Adding Simple Index to Hoodie. ( #1402 )
...
This index finds the location by joining incoming records with records from base files.
2020-05-17 18:32:24 -07:00
Balaji Varadarajan
3c9da2e5f0
[HUDI-895] Remove unnecessary listing .hoodie folder when using timeline server ( #1636 )
2020-05-17 18:18:53 -07:00
Mathieu
25a0080b2f
[HUDI-714] Add javadoc and comments to hudi write method link ( #1409 )
...
* [HUDI-714] Add javadoc and comments to hudi write method link
2020-05-16 08:36:51 -04:00
Shen Hong
e8ffc6f0aa
[HUDI-881] Replace part of spark context by hadoop configuration in AbstractHoodieClient and HoodieReadClient ( #1620 )
2020-05-12 09:33:29 -07:00
Shen Hong
b54517aad0
[HUDI-886] Replace jsc.hadoopConfiguration by hadoop configuration in hudi-client testcase ( #1621 )
2020-05-12 08:51:31 -07:00
Shen Hong
295d00beea
[HUDI-880] Replace part of spark context by hadoop configuration in HoodieTable. ( #1614 )
2020-05-11 23:33:57 -07:00
Shen Hong
6dac10115c
[HUDI-870] Remove spark context in ClientUtils and HoodieIndex ( #1609 )
2020-05-11 19:05:36 +08:00