1
0
Commit Graph

11 Commits

Author SHA1 Message Date
Alexey Kudinkin
6f5d8d04cd [HUDI-2840] Fixed DeltaStreaemer to properly respect configuration passed t/h properties file (#4090)
* Rebased `DFSPropertiesConfiguration` to access Hadoop config in liue of FS to avoid confusion

* Fixed `readConfig` to take Hadoop's `Configuration` instead of FS;
Fixing usages

* Added test for local FS access

* Rebase to use `FSUtils.getFs`

* Combine properties provided as a file along w/ overrides provided from the CLI

* Added helper utilities to `HoodieClusteringConfig`;
Make sure corresponding config methods fallback to defaults;

* Fixed DeltaStreamer usage to respect properly combined configuration;
Abstracted `HoodieClusteringConfig.from` convenience utility to init Clustering config from `Properties`

* Tidying up

* `lint`

* Reverting changes to `HoodieWriteConfig`

* Tdiying up

* Fixed incorrect merge of the props

* Converted `HoodieConfig` to wrap around `Properties` into `TypedProperties`

* Fixed compilation

* Fixed compilation
2021-11-25 14:48:22 -08:00
wenningd
24def0b30d [HUDI-2362] Add external config file support (#3416)
Co-authored-by: Wenning Ding <wenningd@amazon.com>
2021-11-18 01:59:26 -08:00
jsbali
66207ed91a [HUDI-1848] Adding support for HMS for running DDL queries in hive-sy… (#2879)
* [HUDI-1848] Adding support for HMS for running DDL queries in hive-sync-tool

* [HUDI-1848] Fixing test cases

* [HUDI-1848] CR changes

* [HUDI-1848] Fix checkstyle violations

* [HUDI-1848] Fixed a bug when metastore api fails for complex schemas with multiple levels.

* [HUDI-1848] Adding the complex schema and resolving merge conflicts

* [HUDI-1848] Adding some more javadocs

* [HUDI-1848] Added javadocs for DDLExecutor impls

* [HUDI-1848] Fixed style issue
2021-07-23 09:03:15 -07:00
Sivabalan Narayanan
379cf0786f [HUDI-1013] Adding Bulk Insert V2 implementation (#1834)
- Adding ability to use native spark row writing for bulk_insert
 - Controlled by `ENABLE_ROW_WRITER_OPT_KEY` datasource write option
 - Introduced KeyGeneratorInterface in hudi-client, moved KeyGenerator back to hudi-spark
 - Simplified the new API additions to just two new methods : getRecordKey(row), getPartitionPath(row)
 - Fixed all built-in key generators with new APIs
 - Made the field position map lazily created upon the first call to row based apis
 - Implemented native row based key generators for CustomKeyGenerator
 - Fixed all the tests, with these new APIs

Co-authored-by: Balaji Varadarajan <varadarb@uber.com>
Co-authored-by: Vinoth Chandar <vinoth@apache.org>
2020-08-13 00:33:39 -07:00
vinoth chandar
539621bd33 [HUDI-242] Support for RFC-12/Bootstrapping of external datasets to hudi (#1876)
- [HUDI-418] Bootstrap Index Implementation using HFile with unit-test
 - [HUDI-421] FileSystem View Changes to support Bootstrap with unit-tests
 - [HUDI-424] Implement Query Side Integration for querying tables containing bootstrap file slices
 - [HUDI-423] Implement upsert functionality for handling updates to these bootstrap file slices
 - [HUDI-421] Bootstrap Write Client with tests
 - [HUDI-425] Added HoodieDeltaStreamer support
 - [HUDI-899] Add a knob to change partition-path style while performing metadata bootstrap
 - [HUDI-900] Metadata Bootstrap Key Generator needs to handle complex keys correctly
 - [HUDI-424] Simplify Record reader implementation
 - [HUDI-423] Implement upsert functionality for handling updates to these bootstrap file slices
 - [HUDI-420] Hoodie Demo working with hive and sparkSQL. Also, Hoodie CLI working with bootstrap tables

Co-authored-by: Mehrotra <uditme@amazon.com>
Co-authored-by: Vinoth Chandar <vinoth@apache.org>
Co-authored-by: Balaji Varadarajan <varadarb@uber.com>
2020-08-03 20:19:21 -07:00
Ramachandran Madtas Subramaniam
639ec20412 [HUDI-562] Enable testing at debug log level
This is to ensure that tests will execute all code paths, even the ones
written under DEBUG log levels. This will improve coverage as well as
ensure there are no surprised when DEBUG log level is enabled in
production.
2020-04-02 11:14:35 -07:00
hongdd
8306205d7a [HUDI-332]Add operation type (insert/upsert/bulkinsert/delete) to HoodieCommitMetadata (#1157)
[HUDI-332]Add operation type (insert/upsert/bulkinsert/delete) to HoodieCommitMetadata (#1157)
2020-03-03 10:10:29 -08:00
openopen2
a44c61b813 [HUDI-502] provide a custom time zone definition for TimestampBasedKeyGenerator (#1188) 2020-01-12 15:45:23 -08:00
Balaji Varadarajan
9b66ea41fd [HUDI-121] Remove leftover notice file and replace com.uber.hoodie with org.apache.hudi in log4j properties 2019-10-04 09:18:57 -07:00
Balaji Varadarajan
58623631d4 [HUDI-249] Update Release-notes. Add sign-artifacts to POM and release related scripts. Add missing license headers 2019-09-13 08:41:29 -07:00
Balaji Varadarajan
a4f9d7575f HUDI-123 Rename code packages/constants to org.apache.hudi (#830)
- Rename com.uber.hoodie to org.apache.hudi
- Flag to pass com.uber.hoodie Input formats for hoodie-sync
- Works with HUDI demo. 
- Also tested for backwards compatibility with datasets built by com.uber.hoodie packages
- Migration guide : https://cwiki.apache.org/confluence/display/HUDI/Migration+Guide+From+com.uber.hoodie+to+org.apache.hudi
2019-08-11 17:48:17 -07:00