- Fixing packaging, naming of classes
- Use of log4j over slf4j for uniformity
- More follow-on fixes
- Added a version to control/coordinator events.
- Eliminated the config added to write config
- Fixed fetching of checkpoints based on table type
- Clean up of naming, code placement
Co-authored-by: Rajesh Mahindra <rmahindra@Rajeshs-MacBook-Pro.local>
Co-authored-by: Vinoth Chandar <vinoth@apache.org>
- Added upgrade and downgrade step to and from 0.9.0. Upgrade adds few table properties. Downgrade recreates timeline server based marker files if any.
- Added two sources for two stage pipeline. a. S3EventsSource that fetches events from SQS and ingests to a meta hoodie table. b. S3EventsHoodieIncrSource reads S3 events from this meta hoodie table, fetches actual objects from S3 and ingests to sink hoodie table.
- Added selectors to assist in S3EventsSource.
Co-authored-by: Satish M <84978833+satishmittal1111@users.noreply.github.com>
Co-authored-by: Vinoth Chandar <vinoth@apache.org>
* Adding support to ingest records with old schema after table's schema is evolved
* Rebasing against latest master
- Trimming test file to be < 800 lines
- Renaming config names
* Addressing feedback
Co-authored-by: Vinoth Chandar <vinoth@apache.org>
* [HUDI-1848] Adding support for HMS for running DDL queries in hive-sync-tool
* [HUDI-1848] Fixing test cases
* [HUDI-1848] CR changes
* [HUDI-1848] Fix checkstyle violations
* [HUDI-1848] Fixed a bug when metastore api fails for complex schemas with multiple levels.
* [HUDI-1848] Adding the complex schema and resolving merge conflicts
* [HUDI-1848] Adding some more javadocs
* [HUDI-1848] Added javadocs for DDLExecutor impls
* [HUDI-1848] Fixed style issue
* [HUDI-1944] Support Hudi to read from committed offset
* [HUDI-1944] Adding group option to KafkaResetOffsetStrategies
* [HUDI-1944] Update Exception msg
[global-hive-sync-tool] Add a global hive sync tool to sync hudi table across clusters. Add a way to rollback the replicated time stamp if we fail to sync or if we partly sync
Co-authored-by: Jagmeet Bali <jsbali@uber.com>
As discussed in RFC-14, this change implements the first phase of JDBC incremental puller.
It consists following changes:
- JdbcSource: This class extends RowSource and implements
fetchNextBatch(Option<String> lastCkptStr, long sourceLimit)
- SqlQueryBuilder: A simple utility class to build sql queries fluently.
- Implements two modes of fetching: full and incremental.
Full is a complete scan of RDBMS table.
Incremental is delta since last checkpoint.
Incremental mode falls back to full fetch in case of any exception.