lamber-ken
170ee88457
[HUDI-553] Building/Running Hudi on higher java versions ( #1369 )
2020-03-07 01:27:40 -08:00
vinoyang
ee5b32f5d4
[HUDI-652] Decouple HoodieReadClient and AbstractHoodieClient to break the inheritance chain ( #1372 )
...
* Removed timeline server support
* Removed try-with-resource
2020-03-06 09:59:35 -08:00
lamber-ken
ccbf543607
[HUDI-654] Rename hudi-hive to hudi-hive-sync
2020-03-06 22:13:16 +08:00
Udit Mehrotra
2d04014581
[HUDI-607] Fix to allow creation/syncing of Hive tables partitioned by Date type columns ( #1330 )
2020-03-01 10:42:58 -08:00
yanghua
0dc8e493aa
Moving to 0.6.0-SNAPSHOT on master branch.
2020-03-01 15:08:30 +08:00
Ramachandran M S
acf359c834
[HUDI-627] Aggregate code coverage and publish to codecov.io during CI ( #1347 )
2020-02-27 13:54:20 -08:00
vinoth chandar
71170fafe7
[HUDI-554] Cleanup package structure in hudi-client ( #1346 )
...
- Just package, class moves and renames with the following intent
- `client` now has all the various client classes, that do the transaction management
- `func` renamed to `execution` and some helpers moved to `client/utils`
- All compaction code under `io` now under `table/compact`
- Rollback code under `table/rollback` and in general all code for individual operations under `table`
- `exception`, `config`, `metrics` left untouched
- Moved the tests also accordingly
- Fixed some flaky tests
2020-02-27 08:05:58 -08:00
lamber-ken
11fb2c2614
[HUDI-580] Fix incorrect license header in files
2020-02-25 08:54:26 -08:00
YanJia-Gary-Li
4e7fcde4a6
[HUDI-597] Enable incremental pulling from defined partitions ( #1348 )
2020-02-24 11:46:30 -08:00
Suneel Marthi
f9d2f66dc1
[HUDI-622]: Remove VisibleForTesting annotation and import from code ( #1343 )
...
* [HUDI-622] Remove VisibleForTesting annotation and import from code
2020-02-20 15:17:53 +08:00
lamber-ken
425e3e6c78
[HUDI-585] Optimize the steps of building with scala-2.12 ( #1293 )
2020-02-05 23:13:10 +08:00
Suneel Marthi
5b7bb142dc
[HUDI-583] Code Cleanup, remove redundant code, and other changes ( #1237 )
2020-02-02 18:03:44 +08:00
leesf
652224edc8
[HUDI-578] Trim recordKeyFields and partitionPathFields in ComplexKeyGenerator ( #1281 )
...
* [HUDI-578] Trim recordKeyFields and partitionPathFields in ComplexKeyGenerator
* add tests
2020-01-29 16:26:26 -08:00
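The trimming described in HUDI-578 can be illustrated with a minimal sketch. It assumes the field lists arrive as a comma-separated config string. The function name `parse_key_fields` is hypothetical and this is not Hudi's Java implementation.

```python
from typing import List


def parse_key_fields(raw: str) -> List[str]:
    """Split a comma-separated field list and trim whitespace around each name.

    Hypothetical helper: without trimming, "id, ts" would yield the field
    name " ts", which would not match the schema field "ts".
    """
    return [field.strip() for field in raw.split(",")]
```

With this, a config value such as `"id, ts"` resolves to the field names `id` and `ts` rather than `id` and ` ts`.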
leesf
6e59c1c777
Moving to 0.5.2-SNAPSHOT on master branch.
2020-01-20 10:51:33 -08:00
Y Ethan Guo
d0ee95ed16
[HUDI-552] Fix the schema mismatch in Row-to-Avro conversion ( #1246 )
2020-01-18 16:40:56 -08:00
wenningd
292c1e2ff4
[HUDI-238] Make Hudi support Scala 2.12 ( #1226 )
...
* [HUDI-238] Rename scala related artifactId & add maven profile to support Scala 2.12
2020-01-17 14:02:21 -08:00
Prashant Wason
0a07752dc0
[HUDI-527] scalastyle-maven-plugin moved to pluginManagement as it is only used in hoodie-spark and hoodie-cli modules.
...
This fixes compile warnings as well as unnecessary plugin invocation for most of the modules which do not have scala code.
2020-01-17 10:46:10 -08:00
vinoth chandar
c2c0f6b13d
[HUDI-509] Renaming code in sync with cWiki restructuring ( #1212 )
...
- Storage Type replaced with Table Type (remaining instances)
- View types replaced with query types
- ReadOptimized view now referred to as Snapshot Query
- TableFileSystemView sub interfaces renamed to BaseFileOnly and Slice Views
- HoodieDataFile renamed to HoodieBaseFile
- Hive Sync tool will register RO tables for MOR with a `_ro` suffix
- Datasource/Deltastreamer options renamed accordingly
- Support fallback to old config values as well, so migration is painless
- Config for controlling _ro suffix addition
- Renaming DataFile to BaseFile across DTOs, HoodieFileSlice and AbstractTableFileSystemView
2020-01-16 23:58:47 -08:00
Scheller
1daba24065
Add GlobalDeleteKeyGenerator
...
Adds new GlobalDeleteKeyGenerator for record_key deletes with global indices. Also refactors key generators into their own package.
2020-01-15 17:01:29 -08:00
Sivabalan Narayanan
2248fd9aea
Fixing checkstyle issues
2020-01-15 14:21:26 -08:00
Sivabalan Narayanan
2b2f23aa60
Fixing delete util method
2020-01-15 14:21:26 -08:00
Sivabalan Narayanan
87fdb769f0
Adding util methods to assist in adding deletion support to Quick Start
2020-01-15 14:21:26 -08:00
Mehrotra
2bb0c21a3d
Fix conversion of Spark struct type to Avro schema
...
cr https://code.amazon.com/reviews/CR-17184364
2020-01-14 00:27:56 -08:00
Udit Mehrotra
ad50008a59
[HUDI-91][HUDI-12] Migrate to spark 2.4.4, migrate to spark-avro library instead of databricks-avro, add support for Decimal/Date types
...
- Upgrade Spark to 2.4.4, Parquet to 1.10.1, Avro to 1.8.2
- Remove spark-avro from hudi-spark-bundle. Users need to provide --packages org.apache.spark:spark-avro:2.4.4 when running spark-shell or spark-submit
- Replace com.databricks:spark-avro with org.apache.spark:spark-avro
- Shade avro in hudi-hadoop-mr-bundle to make sure it does not conflict with hive's avro version.
2020-01-12 15:03:11 -08:00
lamber-ken
d9675c4ec0
[HUDI-522] Use the same version jcommander uniformly ( #1214 )
2020-01-12 10:48:52 -08:00
pratyakshsharma
3c90d252cc
[HUDI-114]: added option to overwrite payload implementation in hoodie.properties file
2020-01-09 22:34:40 -08:00
Y Ethan Guo
480fc7869d
[HUDI-319] Add a new maven profile to generate unified Javadoc for all Java and Scala classes ( #1195 )
...
* Add javadoc build command in README, links to javadoc plugin and rename profile.
* Make java version configurable in one place.
2020-01-08 10:38:09 -08:00
vinoth chandar
9706f659db
[HUDI-508] Standardizing on "Table" instead of "Dataset" across code ( #1197 )
...
- Docs were talking about storage types before, cWiki moved to "Table"
- Most of code already has HoodieTable, HoodieTableMetaClient - correct naming
- Replacing remaining uses of dataset across code/comments
- A few usages in comments and uses of Spark SQL DataSet remain unchanged
2020-01-07 12:52:32 -08:00
lamber-ken
75c3f630d4
[HUDI-405] Remove HIVE_ASSUME_DATE_PARTITION_OPT_KEY config from DataSource
2020-01-06 14:25:38 -08:00
Pratyaksh Sharma
8f935e779a
[HUDI-406]: added default partition path in TimestampBasedKeyGenerator
2020-01-06 09:38:06 -08:00
hongdd
2d5b79d96f
[HUDI-438] Merge duplicated code fragment in HoodieSparkSqlWriter ( #1114 )
2020-01-06 22:51:22 +08:00
Sivabalan Narayanan
7031445eb3
[HUDI-377] Adding Delete() support to DeltaStreamer ( #1073 )
...
- Provides the ability to perform hard deletes by writing delete marker records into the source data
- If a record contains the special field `_hoodie_delete_marker` set to true, it is deleted
2020-01-04 11:07:31 -08:00
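The delete-marker convention from HUDI-377 can be sketched as follows. This is an illustration only: records are modeled as plain dictionaries, and `split_upserts_and_deletes` is a hypothetical name, not part of DeltaStreamer. Only the field name `_hoodie_delete_marker` comes from the commit message.

```python
from typing import Any, Dict, Iterable, List, Tuple

# Field name taken from the commit message.
DELETE_MARKER_FIELD = "_hoodie_delete_marker"


def split_upserts_and_deletes(
    records: Iterable[Dict[str, Any]],
) -> Tuple[List[Dict[str, Any]], List[Dict[str, Any]]]:
    """Route each record to the delete list if its marker is true, else to upserts."""
    upserts: List[Dict[str, Any]] = []
    deletes: List[Dict[str, Any]] = []
    for record in records:
        # A missing or false marker means the record is a normal upsert.
        if record.get(DELETE_MARKER_FIELD) is True:
            deletes.append(record)
        else:
            upserts.append(record)
    return upserts, deletes
```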
Pratyaksh Sharma
dde21e7315
[HUDI-402]: code clean up in test cases
2019-12-31 11:10:49 -08:00
vinoth chandar
350b0ecb4d
[HUDI-311] : Support for AWS Database Migration Service in DeltaStreamer
...
- Add a transformer class that adds an `Op` field if not found in the input frame
- Add a payload implementation, that issues deletes when Op=D
- Remove Parquet as a top level source type, consolidate with RowSource
- Made delta streamer work without a property file, simply using overridden cli options
- Unit tests for transformer/payload classes
2019-12-23 20:56:55 -08:00
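The transformer and payload behaviour described in HUDI-311 can be sketched roughly as below. The function names and the empty-string default for a missing `Op` value are assumptions made for illustration. Only the `Op` field and the `Op=D` delete rule come from the commit message.

```python
from typing import Any, Dict

OP_FIELD = "Op"   # column name from the commit message
DELETE_OP = "D"   # value that signals a delete, per the commit message


def ensure_op_field(row: Dict[str, Any], default_op: str = "") -> Dict[str, Any]:
    """Return the row with an Op field, adding default_op if it was missing.

    The default value is an assumption; the commit does not state it.
    """
    if OP_FIELD in row:
        return row
    patched = dict(row)  # copy so the input row is not mutated
    patched[OP_FIELD] = default_op
    return patched


def is_delete(row: Dict[str, Any]) -> bool:
    """A row is treated as a delete when its Op field equals 'D'."""
    return row.get(OP_FIELD) == DELETE_OP
```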
lamber-ken
313fab5fd1
[HUDI-444] Refactor the codes based on scala codestyle ReturnChecker rule ( #1121 )
2019-12-24 07:05:54 +08:00
YanJia-Gary-Li
36b3b6f5dd
[HUDI-415] Get commit time when Spark start ( #1113 )
2019-12-19 22:19:06 -08:00
lamber-ken
a405d3873b
[MINOR] replace scala map add operator ( #1093 )
...
replace ++: with ++
2019-12-12 11:29:17 +08:00
lamber-ken
ba514cfea0
[MINOR] Remove redundant plus operator ( #1097 )
2019-12-12 05:42:05 +08:00
lamber-ken
d447e2d751
[checkstyle] Unify LOG form ( #1092 )
2019-12-10 19:23:38 +08:00
Wenning Ding
e555aa516d
[HUDI-353] Add hive style partitioning path
2019-12-09 12:29:53 -08:00
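Hive-style partitioning encodes the partition column name in the directory name as `column=value`. The sketch below contrasts the two layouts. The function name and the `hive_style` flag are hypothetical and do not reflect Hudi's actual option names.

```python
def partition_path(field: str, value: object, hive_style: bool) -> str:
    """Build a partition directory name.

    hive_style=True  gives "field=value" (the Hive convention)
    hive_style=False gives just the value
    """
    return f"{field}={value}" if hive_style else str(value)
```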
lamber-ken
2745b7552f
[HUDI-379] Refactor the codes based on new JavadocStyle code style rule ( #1079 )
2019-12-06 12:59:28 +08:00
hongdd
b65a897856
[HUDI-374] Unable to generateUpdates in QuickstartUtils ( #1059 )
2019-11-30 11:11:00 -08:00
lamber-ken
024230fbd2
[HUDI-372] Support the shortName for Hudi DataSource ( #1054 )
...
- Ability to do `df.write.format("hudi")...`
2019-11-30 08:02:33 -08:00
谢磊
f9139c0f61
[HUDI-366] Refactor some module codes based on new ImportOrder code style rule ( #1055 )
...
[HUDI-366] Refactor hudi-hadoop-mr / hudi-timeline-service / hudi-spark / hudi-integ-test / hudi-utilities based on new ImportOrder code style rule
2019-11-27 21:32:43 +08:00
bschell
60fed21dc7
[HUDI-327] Add null/empty checks to key generators ( #1040 )
...
* Adds null and empty checks to all key generators.
* Also improves error messaging for key generator issues.
2019-11-26 02:37:16 -08:00
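A null/empty check of the kind HUDI-327 describes might look like the sketch below. The exception class, function name, and message wording are assumptions for illustration, not the text Hudi emits.

```python
from typing import Any, Dict


class KeyGenerationError(ValueError):
    """Hypothetical error type raised when a key field has no usable value."""


def record_key(record: Dict[str, Any], key_field: str) -> str:
    """Return the record key as a string, rejecting null or empty values."""
    value = record.get(key_field)
    if value is None or str(value) == "":
        # Name the offending field so the failure is easy to diagnose.
        raise KeyGenerationError(
            f"record key field '{key_field}' cannot be null or empty"
        )
    return str(value)
```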
filippo balicchia
845a0509b3
[MINOR] Some minor optimizations in HoodieJavaStreamingApp ( #1046 )
2019-11-25 18:49:13 +08:00
Sivabalan Narayanan
c3355109b1
[HUDI-328] Adding delete api to HoodieWriteClient ( #1004 )
...
[HUDI-328] Adding delete api to HoodieWriteClient and Spark DataSource
2019-11-22 15:05:25 -08:00
hongdd
7bc08cbfdc
[HUDI-345] Fix used deprecated function ( #1024 )
...
- Replace Schema.parse() with new Schema.Parser().parse()
- Replace the deprecated FSDataOutputStream constructor
2019-11-22 03:32:09 -08:00
谢磊
804e348d0e
[HUDI-346] Set allowMultipleEmptyLines to false for EmptyLineSeparator rule ( #1025 )
2019-11-19 18:44:42 +08:00
vinoth chandar
e4c91ed13f
[HUDI-290] Normalize test class name of all test classes ( #951 )
2019-10-22 20:19:11 -07:00