Suneel Marthi
078d4825d9
[HUDI-624]: Split some of the code from PR for HUDI-479 ( #1344 )
2020-02-21 14:22:21 +08:00
Suneel Marthi
f9d2f66dc1
[HUDI-622]: Remove VisibleForTesting annotation and import from code ( #1343 )
...
* HUDI-622: Remove VisibleForTesting annotation and import from code
2020-02-20 15:17:53 +08:00
amitsingh-10
c2b08cdfc9
[HUDI-617] Add support for types implementing CharSequence ( #1339 )
...
- Data types extending CharSequence implement a #toString method which provides an easy way to convert them to String.
- For example, org.apache.avro.util.Utf8 is easily convertible into String if we use the toString() method. It's better to make the support more generic to support a wider range of data types as partitionKey.
2020-02-18 11:19:44 -08:00
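The HUDI-617 entry above rests on the fact that every `CharSequence` implementation exposes `toString()`. A minimal sketch of that idea follows. It is not the code from the commit: the class name `PartitionKeySketch` and the helper `toPartitionKey` are hypothetical, and `StringBuilder` stands in for `org.apache.avro.util.Utf8` so the example needs no Avro dependency.

```java
import java.util.Objects;

// Illustrative only: not the implementation from the HUDI-617 commit.
class PartitionKeySketch {

    // Hypothetical helper: accepts any CharSequence (String, StringBuilder,
    // Avro's Utf8, ...) and converts it to a String through toString().
    static String toPartitionKey(Object fieldValue) {
        Objects.requireNonNull(fieldValue, "partition field value must not be null");
        if (fieldValue instanceof CharSequence) {
            return fieldValue.toString();
        }
        // How other types are treated is an assumption made for this sketch.
        throw new IllegalArgumentException(
            "Unsupported partition key type: " + fieldValue.getClass().getName());
    }

    public static void main(String[] args) {
        // A plain String and a StringBuilder yield the same partition key.
        System.out.println(toPartitionKey("2020/02/18"));
        System.out.println(toPartitionKey(new StringBuilder("2020/02/18")));
    }
}
```

Checking against the `CharSequence` interface rather than a list of concrete classes is what makes the support generic: a new string-like type works without further changes.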
Mathieu
8c6138cb01
[MINOR] Add javadoc to SchedulerConfGenerator and code clean ( #1340 )
2020-02-18 11:15:02 -08:00
wangxianghu
aaa6cf9a98
[MINOR] Fix some typos
2020-02-15 09:49:25 +08:00
openopen2
dfbee673ef
[HUDI-514] A schema provider to get metadata through Jdbc ( #1200 )
2020-02-13 18:06:06 -08:00
Mathieu
175de0db7b
[MINOR] Fix typo ( #1331 )
2020-02-13 10:46:10 -08:00
Mathieu
5fdf5a1927
[HUDI-560] Remove legacy IdentityTransformer ( #1264 )
2020-02-10 10:04:58 +08:00
lamber-ken
46842f4e92
[MINOR] Remove the declaration of thrown RuntimeException ( #1305 )
2020-02-05 23:23:20 +08:00
Suneel Marthi
594da28fbf
[HUDI-595] Code cleanup, refactoring code out of PR #1159 ( #1302 )
2020-02-04 21:52:03 +08:00
dengziming
347e297ac1
[HUDI-596] Close KafkaConsumer every time ( #1303 )
2020-02-03 23:42:21 -08:00
Suneel Marthi
5b7bb142dc
[HUDI-583] Code Cleanup, remove redundant code, and other changes ( #1237 )
2020-02-02 18:03:44 +08:00
leesf
ed54eb20a5
[MINOR] Add missing licenses ( #1271 )
2020-01-22 08:06:45 -05:00
Y Ethan Guo
9489d0fb84
[HUDI-551] Abstract a test case class for DFS Source to make it extensible ( #1239 )
2020-01-19 18:50:12 +08:00
Y Ethan Guo
d0ee95ed16
[HUDI-552] Fix the schema mismatch in Row-to-Avro conversion ( #1246 )
2020-01-18 16:40:56 -08:00
wenningd
292c1e2ff4
[HUDI-238] Make Hudi support Scala 2.12 ( #1226 )
...
* [HUDI-238] Rename scala related artifactId & add maven profile to support Scala 2.12
2020-01-17 14:02:21 -08:00
vinoth chandar
c2c0f6b13d
[HUDI-509] Renaming code in sync with cWiki restructuring ( #1212 )
...
- Storage Type replaced with Table Type (remaining instances)
- View types replaced with query types
- ReadOptimized view referred to as Snapshot Query
- TableFileSystemView sub interfaces renamed to BaseFileOnly and Slice Views
- HoodieDataFile renamed to HoodieBaseFile
- Hive Sync tool will register RO tables for MOR with a `_ro` suffix
- Datasource/Deltastreamer options renamed accordingly
- Support fallback to old config values as well, so migration is painless
- Config for controlling _ro suffix addition
- Renaming DataFile to BaseFile across DTOs, HoodieFileSlice and AbstractTableFileSystemView
2020-01-16 23:58:47 -08:00
Y Ethan Guo
b39458b008
[MINOR] Make constant fields final in HoodieTestDataGenerator ( #1234 )
2020-01-16 12:42:30 +08:00
Scheller
1daba24065
Add GlobalDeleteKeyGenerator
...
Adds new GlobalDeleteKeyGenerator for record_key deletes with global indices. Also refactors key generators into their own package.
2020-01-15 17:01:29 -08:00
Mehrotra
2bb0c21a3d
Fix conversion of Spark struct type to Avro schema
...
cr https://code.amazon.com/reviews/CR-17184364
2020-01-14 00:27:56 -08:00
lamber-ken
fd8f1c70c0
[MINOR] Reuse random object ( #1222 )
2020-01-13 18:26:04 -08:00
openopen2
a44c61b813
[HUDI-502] provide a custom time zone definition for TimestampBasedKeyGenerator ( #1188 )
2020-01-12 15:45:23 -08:00
harveyyue
971c7d41bd
[HUDI-322] DeltaStreamer should pick checkpoints off only deltacommits for MOR tables
2020-01-12 15:11:47 -08:00
lamber-ken
d9675c4ec0
[HUDI-522] Use the same version jcommander uniformly ( #1214 )
2020-01-12 10:48:52 -08:00
pratyakshsharma
3c90d252cc
[HUDI-114]: added option to overwrite payload implementation in hoodie.properties file
2020-01-09 22:34:40 -08:00
vinoth chandar
9706f659db
[HUDI-508] Standardizing on "Table" instead of "Dataset" across code ( #1197 )
...
- Docs were talking about storage types before, cWiki moved to "Table"
- Most of code already has HoodieTable, HoodieTableMetaClient - correct naming
- Replacing remaining uses of dataset across code/comments
- A few usages in comments and uses of Spark SQL DataSet remain untouched
2020-01-07 12:52:32 -08:00
lamber-ken
75c3f630d4
[HUDI-405] Remove HIVE_ASSUME_DATE_PARTITION_OPT_KEY config from DataSource
2020-01-06 14:25:38 -08:00
Pratyaksh Sharma
8f935e779a
[HUDI-406]: added default partition path in TimestampBasedKeyGenerator
2020-01-06 09:38:06 -08:00
lamber-ken
28ccf8c521
[HUDI-484] Fix NPE when reading IncrementalPull.sqltemplate in HiveIncrementalPuller ( #1167 )
2020-01-04 23:53:47 -08:00
Sivabalan Narayanan
7031445eb3
[HUDI-377] Adding Delete() support to DeltaStreamer ( #1073 )
...
- Provides ability to perform hard deletes by writing delete marker records into the source data
- if the record contains a special field _hoodie_delete_marker set to true, deletes are performed
2020-01-04 11:07:31 -08:00
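The HUDI-377 entry above describes hard deletes driven by a boolean marker field in the source data. A minimal sketch of that behaviour follows, using plain maps in place of Avro records and an in-memory map in place of a table. Only the field name `_hoodie_delete_marker` comes from the commit message; the class `DeleteMarkerSketch`, the `id` record key field, and the `apply` method are assumptions made for illustration and do not reflect the DeltaStreamer code.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: models the delete-marker idea with in-memory maps.
class DeleteMarkerSketch {

    // Field name taken from the commit message.
    static final String DELETE_MARKER_FIELD = "_hoodie_delete_marker";

    // A record counts as a delete only when the marker is present and true.
    static boolean isDelete(Map<String, Object> record) {
        return Boolean.TRUE.equals(record.get(DELETE_MARKER_FIELD));
    }

    // Applies a batch to a "table" keyed by the hypothetical "id" field:
    // marked records are removed, all others are upserted.
    static void apply(Map<String, Map<String, Object>> table,
                      List<Map<String, Object>> batch) {
        for (Map<String, Object> record : batch) {
            String key = String.valueOf(record.get("id"));
            if (isDelete(record)) {
                table.remove(key);
            } else {
                table.put(key, record);
            }
        }
    }

    public static void main(String[] args) {
        Map<String, Map<String, Object>> table = new HashMap<>();

        Map<String, Object> insert = new HashMap<>();
        insert.put("id", "k1");
        insert.put("value", 42);

        Map<String, Object> delete = new HashMap<>();
        delete.put("id", "k1");
        delete.put(DELETE_MARKER_FIELD, true);

        apply(table, List.of(insert));
        System.out.println("after insert: " + table.keySet());
        apply(table, List.of(delete));
        System.out.println("after delete: " + table.keySet());
    }
}
```

Records that lack the marker, or carry it set to false, fall through to the normal upsert path, so existing sources keep working unchanged.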
Pratyaksh Sharma
290278fc6c
[HUDI-118]: Options provided for passing properties to Cleaner, compactor and importer commands
2020-01-03 16:00:57 -08:00
lamber-ken
e1e5fe3324
[MINOR] Fix error usage of String.format ( #1169 )
2020-01-02 09:11:15 +08:00
Pratyaksh Sharma
dde21e7315
[HUDI-402]: code clean up in test cases
2019-12-31 11:10:49 -08:00
lamber-ken
ab6ae5cebb
[HUDI-482] Fix missing @Override annotation on methods ( #1156 )
...
* [HUDI-482] Fix missing @Override annotation on methods
2019-12-31 11:44:56 +08:00
yungthuis66
f20a130e3a
[MINOR] typo fix ( #1142 )
2019-12-26 09:03:43 -08:00
vinoth chandar
350b0ecb4d
[HUDI-311] : Support for AWS Database Migration Service in DeltaStreamer
...
- Add a transformer class that adds the `Op` field if not found in the input frame
- Add a payload implementation, that issues deletes when Op=D
- Remove Parquet as a top level source type, consolidate with RowSource
- Made delta streamer work without a property file, simply using overridden cli options
- Unit tests for transformer/payload classes
2019-12-23 20:56:55 -08:00
lamber-ken
ba514cfea0
[MINOR] Remove redundant plus operator ( #1097 )
2019-12-12 05:42:05 +08:00
lamber-ken
d447e2d751
[checkstyle] Unify LOG form ( #1092 )
2019-12-10 19:23:38 +08:00
Wenning Ding
e555aa516d
[HUDI-353] Add hive style partitioning path
2019-12-09 12:29:53 -08:00
lamber-ken
2745b7552f
[HUDI-379] Refactor the codes based on new JavadocStyle code style rule ( #1079 )
2019-12-06 12:59:28 +08:00
lamber-ken
b3e0ebbc4a
[checkstyle] Add ConstantName java checkstyle rule ( #1066 )
...
* add SimplifyBooleanExpression java checkstyle rule
* collapse empty tags in scalastyle file
2019-12-04 18:59:15 +08:00
谢磊
f9139c0f61
[HUDI-366] Refactor some module codes based on new ImportOrder code style rule ( #1055 )
...
[HUDI-366] Refactor hudi-hadoop-mr / hudi-timeline-service / hudi-spark / hudi-integ-test / hudi-utilities based on new ImportOrder code style rule
2019-11-27 21:32:43 +08:00
谢磊
b77fad39b5
[HUDI-364] Refactor hudi-hive based on new ImportOrder code style rule ( #1048 )
...
[HUDI-364] Refactor hudi-hive based on new ImportOrder code style rule
2019-11-27 16:30:37 +08:00
bschell
60fed21dc7
[HUDI-327] Add null/empty checks to key generators ( #1040 )
...
* Adds null and empty checks to all key generators.
* Also improves error messaging for key generator issues.
2019-11-26 02:37:16 -08:00
Pratyaksh Sharma
2a4cfb47c7
[HUDI-340]: made max events to read from kafka source configurable ( #1039 )
2019-11-26 18:34:02 +08:00
谢磊
804e348d0e
[HUDI-346] Set allowMultipleEmptyLines to false for EmptyLineSeparator rule ( #1025 )
2019-11-19 18:44:42 +08:00
Pratyaksh Sharma
5f1309407a
[HUDI-253]: added validations for schema provider class ( #995 )
2019-11-11 06:03:44 -08:00
Gurudatt Kulkarni
71ac2c0d5e
[HUDI-324] TimestampKeyGenerator should support milliseconds ( #993 )
2019-11-05 04:22:14 -08:00
Raymond Xu
91740635b2
[HUDI-321] Support bulkinsert in HDFSParquetImporter ( #987 )
...
- Add bulk insert feature
- Fix some minor issues
2019-11-02 23:12:44 -07:00
Balaji Varadarajan
a6390aefc4
[HUDI-312] Make docker hdfs cluster ephemeral. This is needed to fix flakiness in integration tests. Also fix DeltaStreamer hanging issue due to uncaught exception
2019-11-01 11:49:59 -07:00