ZhangChaoMing
291f92069e
[MINOR] Fix wrong logic for checking state condition ( #2524 )
2021-02-06 16:40:31 +08:00
n3nash
b2c47a24be
[HUDI-1589] Fix Rollback Metadata AVRO backwards incompatibility ( #2543 )
2021-02-05 16:03:34 -08:00
Sivabalan Narayanan
b5d4a046bb
[HUDI-1571] Adding commit_show_records_info to display record sizes for commit ( #2514 )
2021-02-05 07:53:24 -05:00
hiscat
b51b3a39a8
[HUDI-1420] HoodieTableMetaClient.getMarkerFolderPath works incorrectly on windows client with hdfs server for wrong file separator ( #2526 )
...
* Fix HUDI-1420
FIX https://issues.apache.org/jira/browse/HUDI-1420
* fix(hudi-common): fix HUDI-1420 HoodieTableMetaClient.getMarkerFolderPath works incorrectly on windows client with hdfs server for wrong file separator
Co-authored-by: 谢波 <xiebo1@yonghui.cn >
2021-02-05 16:24:35 +08:00
Sivabalan Narayanan
4a5683d54a
[MINOR] Fixing the default value for source ordering field for payload config ( #2516 )
2021-02-04 08:43:03 -05:00
wangxianghu
647e9faf25
[HUDI-1547] CI intermittent failure: TestJsonStringToHoodieRecordMapF… ( #2521 )
2021-02-04 11:20:01 +08:00
Volodymyr Burenin
17802569fd
[HUDI-1538] Try to init class trying different signatures instead of checking its name ( #2476 )
...
* [HUDI-1538] Try to init class trying different signatures instead of checking its name.
* Removed unused imports
Co-authored-by: volodymyr.burenin <volodymyr.burenin@cloudkitchens.com >
2021-02-03 12:29:08 -08:00
Sivabalan Narayanan
eb91e5ba70
[HUDI-1523] Call mkdir(partition) only if not exists ( #2501 )
2021-02-03 09:02:37 -05:00
wangxianghu
d74d8e2084
[HUDI-1335] Introduce FlinkHoodieSimpleIndex to hudi-flink-client ( #2271 )
2021-02-03 08:59:49 +08:00
vinoyang
50ff9ab2d2
[MINOR] Rename FileSystemViewHandler to RequestHandler and corrected the class comment ( #2458 )
2021-02-02 09:15:53 -08:00
jackiehff
ec950b4cfe
[MINOR] Fix method comment typo ( #2518 )
...
Co-authored-by: 黄飞飞 <huangfeifei@mininglamp.com >
2021-02-02 19:23:29 +08:00
pengzhiwei
0d8a4d0a56
[HUDI-1550] Honor ordering field for MOR Spark datasource reader ( #2497 )
2021-02-01 21:04:27 +08:00
steven zhang
f159c0c49a
[HUDI-1519] Improve minKey/maxKey computation in HoodieHFileWriter ( #2427 )
...
Co-authored-by: zhang wen <steven@stevendeMac-mini.local >
2021-02-01 07:51:57 -05:00
jiangjiguang
5d053b495b
[MINOR] Quickstart.generateUpdates method add check ( #2505 )
2021-01-30 10:28:00 +08:00
satishkotha
9cb6cb8189
[HUDI-1266] Add unit test for validating replacecommit rollback ( #2418 )
2021-01-29 10:28:08 -08:00
satishkotha
2d2d5c83b1
[HUDI-1555] Remove isEmpty to improve clustering execution performance ( #2502 )
2021-01-29 10:27:09 -08:00
wangxianghu
23f2ef3efb
[HUDI-623] Remove UpgradePayloadFromUberToApache ( #2455 )
2021-01-28 17:48:50 -08:00
Danny Chan
bc0325f6ea
[HUDI-1522] Add a new pipeline for Flink writer ( #2430 )
...
* [HUDI-1522] Add a new pipeline for Flink writer
2021-01-28 08:53:13 +08:00
wangxianghu
7b2e658ac0
[MINOR] Add Jira URL and Mailing List ( #2404 )
2021-01-27 19:48:42 -05:00
SteNicholas
2ee1c3fb0c
[HUDI-1234] Insert new records to data files without merging for "Insert" operation. ( #2111 )
...
* Added HoodieConcatHandle to skip merging for "insert" operation when the corresponding config is set
Co-authored-by: Sivabalan Narayanan <sivabala@uber.com >
2021-01-27 13:09:51 -05:00
luokey
a54550d94f
[MINOR] Fix NPE when using HoodieFlinkStreamer with multi parallelism ( #2492 )
2021-01-27 21:00:20 +08:00
vinoth chandar
c8ee40f8ae
[MINOR] Update doap with 0.7.0 release ( #2491 )
2021-01-26 09:28:22 -08:00
Shen Hong
c4afd179c1
[HUDI-1476] Introduce unit test infra for java client ( #2478 )
2021-01-24 11:17:19 -08:00
vinoth chandar
81836f0309
Removing spring repos from pom ( #2481 )
...
- These are being deprecated
- Causes build issues when .m2 does not have this cached already
2021-01-24 07:42:52 -08:00
Raymond Xu
84df26323d
[MINOR] Use skipTests flag for skip.hudi-spark2.unit.tests property ( #2477 )
2021-01-24 21:36:41 +08:00
wangxianghu
e302c6bc12
[HUDI-1453] Fix NPE using HoodieFlinkStreamer to etl data from kafka to hudi ( #2474 )
2021-01-23 10:27:40 +08:00
wangxianghu
d3ea0f957e
[HOTFIX] Revert upgrade flink version to 1.12.0 ( #2473 )
2021-01-22 10:55:46 -08:00
cooper
048633da1a
[MINOR] Improve code readability, remove the continue keyword ( #2459 )
2021-01-22 13:47:14 +08:00
wangxianghu
748dcc9aae
[MINOR] Remove InstantGeneratorOperator parallelism limit in HoodieFlinkStreamer and update docs ( #2471 )
2021-01-22 13:46:25 +08:00
Xiang Yang
641abe8ab7
[HUDI-1332] Introduce FlinkHoodieBloomIndex to hudi-flink-client ( #2375 )
...
* [HUDI] Add bloom index for hudi-flink-client
Co-authored-by: yangxiang <yangxiang@oppo.com >
2021-01-22 10:36:28 +08:00
luokey
b64d22e047
[HUDI-1511] InstantGenerateOperator support multiple parallelism ( #2434 )
2021-01-22 09:17:50 +08:00
wenningd
976420c49a
[HUDI-1512] Fix spark 2 unit tests failure with Spark 3 ( #2412 )
...
* [HUDI-1512] Fix spark 2 unit tests failure with Spark 3
* resolve comments
Co-authored-by: Wenning Ding <wenningd@amazon.com >
2021-01-21 07:04:28 -08:00
vinoth chandar
81ccb0c71a
[MINOR] Make a separate travis CI job for hudi-utilities ( #2469 )
2021-01-20 21:46:05 -08:00
vinoth chandar
5e30fc1b2b
[MINOR] Disabling problematic tests temporarily to stabilize CI ( #2468 )
2021-01-20 14:24:34 -08:00
Vinoth Chandar
3719e7b388
Moving to 0.8.0-SNAPSHOT on master branch.
2021-01-20 11:31:22 -08:00
liujinhui
244f6def9c
[MINOR] Fix dataSource cannot use hoodie.datasource.hive_sync.auto_create_database ( #2444 )
...
fix dataSource cannot use hoodie.datasource.hive_sync.auto_create_database
2021-01-20 22:58:18 +08:00
teeyog
c931dc5406
[MINOR] Remove redundant judgments ( #2466 )
2021-01-20 20:41:09 +08:00
vinoth chandar
5ca0625b27
[HUDI-1308] Harden RFC-15 Implementation based on production testing ( #2441 )
...
Addresses leaks, perf degradation observed during testing. These were regressions from the original rfc-15 PoC implementation.
* Pass a single instance of HoodieTableMetadata everywhere
* Fix tests and add config for enabling metrics
- Removed special casing of assumeDatePartitioning inside FSUtils#getAllPartitionPaths()
- Consequently, IOException is never thrown and many files had to be adjusted
- More diligent handling of open file handles in metadata table
- Added config for controlling reuse of connections
- Added config for turning off fallback to listing, so we can see tests fail
- Changed all ipf listing code to cache/amortize the open/close for better performance
- Timelineserver also reuses connections, for better performance
- Without timelineserver, when metadata table is opened from executors, reuse is not allowed
- HoodieMetadataConfig passed into HoodieTableMetadata#create as argument.
- Fix TestHoodieBackedTableMetadata#testSync
2021-01-19 21:20:28 -08:00
Sivabalan Narayanan
e23967b9e9
[HUDI-1540] Fixing commons codec shading in spark bundle ( #2460 )
2021-01-20 00:00:13 -05:00
Sivabalan Narayanan
91b9cb53d3
[MINOR] Fixing setting defaults for index config ( #2457 )
2021-01-19 18:16:25 -05:00
Sivabalan Narayanan
b9c2856d16
[HUDI-1535] Fix 0.7.0 snapshot ( #2456 )
...
* Revert "[MINOR] Bumping snapshot version to 0.7.0 (#2435)"
This reverts commit a43e191d6c.
* Fixing 0.7.0 snapshot bump
2021-01-19 12:20:43 -08:00
Volodymyr Burenin
a38612b10f
[HUDI-1532] Fixed suboptimal implementation of a magic sequence search ( #2440 )
...
* Fixed suboptimal implementation of a magic sequence search on GCS.
* Fix comparison.
* Added buffered reader around plugged storage plugin such as GCS.
* 1. Corrected some comments 2. Refactored GCS input stream check
Co-authored-by: volodymyr.burenin <volodymyr.burenin@cloudkitchens.com >
Co-authored-by: Nishith Agarwal <nagarwal@uber.com >
2021-01-18 23:07:27 -08:00
Udit Mehrotra
684e12e9fc
[HUDI-1529] Add block size to the FileStatus objects returned from metadata table to avoid too many file splits ( #2451 )
2021-01-18 07:29:53 -08:00
satishkotha
3d1d5d00b0
[HUDI-1533] Make SerializableSchema work for large schemas and add ability to sortBy numeric values ( #2453 )
2021-01-17 12:36:55 -08:00
Sivabalan Narayanan
a43e191d6c
[MINOR] Bumping snapshot version to 0.7.0 ( #2435 )
2021-01-16 09:56:28 -05:00
n3nash
749f657856
[HUDI-1509]: Reverting LinkedHashSet changes to combine fields from oldSchema and newSchema in favor of using only new schema for record rewriting ( #2424 )
2021-01-14 12:47:50 -08:00
n3nash
e926c1a45c
[HUDI-1525] Fix test hbase index ( #2436 )
2021-01-12 23:30:21 -08:00
Sivabalan Narayanan
e3d3677b7e
[HUDI-1502] MOR rollback and restore support for metadata sync ( #2421 )
...
- Adds a field to RollbackMetadata that captures the logs written for rollback blocks
- Adds a field to RollbackMetadata that captures new log files written by unsynced deltacommits
Co-authored-by: Vinoth Chandar <vinoth@apache.org >
2021-01-11 13:23:13 -08:00
lw0090
de42adc230
[HUDI-1520] add configure for spark sql overwrite use INSERT_OVERWRITE_TABLE ( #2428 )
2021-01-11 09:07:47 -08:00
Udit Mehrotra
7ce3ac778e
[HUDI-1479] Use HoodieEngineContext to parallelize fetching of partition paths ( #2417 )
...
* [HUDI-1479] Use HoodieEngineContext to parallelize fetching of partition paths
* Adding testClass for FileSystemBackedTableMetadata
Co-authored-by: Nishith Agarwal <nagarwal@uber.com >
2021-01-10 21:19:52 -08:00