25 Commits

Author SHA1 Message Date
Shen Hong
e4e2fbc3bb [HUDI-1419] Add base implementation for hudi java client (#2286) 2020-12-19 19:25:27 -08:00
Sivabalan Narayanan
33d338f392 [HUDI-115] Adding DefaultHoodieRecordPayload to honor ordering with combineAndGetUpdateValue (#2311)
* Added ability to pass in `properties` to payload methods, so they can perform table/record specific merges
* Added default methods so existing payload classes are backwards compatible. 
* Adding DefaultHoodiePayload to honor ordering while merging two records
* Fixing default payload based on feedback
2020-12-19 19:19:42 -08:00
lw0090
8b5d6f9430 [HUDI-1437] support more accurate spark JobGroup for better performance tracking (#2322) 2020-12-17 15:20:13 -08:00
Balaji Varadarajan
069a1dcf24 [HUDI-1435] Fix bug in Marker File Reconciliation for Non-Partitioned datasets (#2301) 2020-12-14 22:24:12 -08:00
steven zhang
11bc1fe6f4 [HUDI-1428] Clean old fileslice is invalid (#2292)
Co-authored-by: zhang wen <wen.zhang@dmall.com>
Co-authored-by: zhang wen <steven@stevendeMac-mini.local>
2020-12-13 06:28:53 -08:00
Shen Hong
236d1b0dec [HUDI-1439] Remove scala dependency from hudi-client-common (#2306) 2020-12-11 00:36:37 -08:00
wangxianghu
de2fbeac33 [HUDI-1412] Make HoodieWriteConfig support setting different default … (#2278)
* [HUDI-1412] Make HoodieWriteConfig support setting different default value according to engine type
2020-12-07 09:29:53 +08:00
lw0090
1f0d5c077e [HUDI-1349] spark sql support overwrite use insert_overwrite_table (#2196) 2020-12-03 12:26:21 -08:00
Prashant Wason
ac23d2587f [HUDI-1357] Added a check to validate records are not lost during merges. (#2216)
- Turned off by default
2020-12-01 13:44:57 -08:00
leesf
3d5e9fee7f [MINOR] refactor code in HoodieMergeHandle (#2272) 2020-11-28 21:47:05 +08:00
Balaji Varadarajan
0ebef1c0a0 [HUDI-1358] Fix leaks in DiskBasedMap and LazyFileIterable (#2249) 2020-11-23 10:56:26 -08:00
Shen Hong
d9411c38db [HUDI-1364] Add HoodieJavaEngineContext to hudi-java-client (#2222) 2020-11-23 10:06:28 -08:00
Gary Li
c8d5ea2752 [MINOR] clean up and add comments to flink client (#2261) 2020-11-19 15:27:52 +08:00
wangxianghu
4d05680038 [HUDI-1327] Introduce base implementation of hudi-flink-client (#2176) 2020-11-18 17:57:11 +08:00
wangxianghu
d160abb437 [HUDI-912] Refactor and relocate KeyGenerator to support more engines (#2200)
* [HUDI-912] Refactor and relocate KeyGenerator to support more engines

* Rename KeyGenerators
2020-11-02 13:12:51 -08:00
lw0090
8545ea3856 [HUDI-1118] Cleanup rollback files residing in .hoodie folder (#2205) 2020-10-25 21:04:56 -07:00
Prashant Wason
49e855c348 [HUDI-1326] Added an API to force publish metrics and flush them. (#2152)
* [HUDI-1326] Added an API to force publish metrics and flush them.

Using the added API, publish metrics after each level of the DAG completed in hudi-test-suite.

* Code cleanups

Co-authored-by: Vinoth Chandar <vinoth@apache.org>
2020-10-24 16:47:24 -07:00
lw0090
4d80e1e221 [HUDI-284] add more test for UpdateSchemaEvolution (#2127)
Unit test different schema evolution scenarios.
2020-10-19 07:38:04 -07:00
hj2016
c0472d3317 [HUDI-1184] Fix the support of hbase index partition path change (#1978)
When the HBase index is used and a record moves to a different partition, the path is not updated to match the new value of the partition column.

Co-authored-by: huangjing <huangjing@clinbrain.com>
2020-10-11 19:05:57 -07:00
dugenkui
b58daf29ba [MINOR] remove unused generics type (#2163) 2020-10-11 18:38:42 -07:00
vinoyang
eafd7bf289 [MINOR] Fix wrong javadoc and refactor some naming issues (#2156) 2020-10-09 15:09:26 -07:00
Pratyaksh Sharma
524193eb4b [HUDI-603]: DeltaStreamer can now fetch schema before every run in continuous mode (#1566)
Co-authored-by: Balaji Varadarajan <balaji.varadarajan@robinhood.com>
2020-10-06 20:34:03 -07:00
lw0090
fdae388626 [HUDI-1203] add port configuration for EmbeddedTimelineService (#2142) 2020-10-05 11:36:54 -07:00
Prashant Wason
6c610b91ef [HUDI-1305] Added an API to shutdown and remove the metrics reporter. (#2132)
This helps remove the reporter once the test has completed and prevents log pollution from unnecessary metric logs.

- Added an API to shutdown the metrics reporter after tests.
2020-10-04 09:30:04 -07:00
Mathieu
1f7add9291 [HUDI-1089] Refactor hudi-client to support multi-engine (#1827)
- This change breaks `hudi-client` into `hudi-client-common` and `hudi-spark-client` modules 
- Simple usages of Spark via jsc.parallelize() have been redone using EngineContext#map, EngineContext#flatMap, etc.
- Code changes in the PR break classes into `BaseXYZ` parent classes with no Spark dependencies, living in `hudi-client-common`
- Classes on `hudi-spark-client` are named `SparkXYZ` extending the parent classes with all the Spark dependencies
- To simplify/cleanup, HoodieIndex#fetchRecordLocation has been removed and its usages in tests replaced with alternatives

Co-authored-by: Vinoth Chandar <vinoth@apache.org>
2020-10-01 14:25:29 -07:00