77 Commits

Author SHA1 Message Date
hongdd
5af3dc6aed [HUDI-331]Fix java docs for all public apis in HoodieWriteClient (#1111) 2020-01-09 16:00:53 +08:00
Wenning Ding
aba83876e7 Update deprecated HBase API 2020-01-08 10:26:47 -08:00
vinoth chandar
9706f659db [HUDI-508] Standardizing on "Table" instead of "Dataset" across code (#1197)
- Docs were talking about storage types before, cWiki moved to "Table"
- Most of code already has HoodieTable, HoodieTableMetaClient - correct naming
- Replacing/renaming uses of "dataset" across code/comments
- A few usages in comments and uses of Spark SQL DataSet remain untouched
2020-01-07 12:52:32 -08:00
Balaji Varadarajan
8306f749a2 [HUDI-417] Refactor HoodieWriteClient so that commit logic can be shareable by both bootstrap and normal write operations (#1166) 2020-01-06 20:11:48 -08:00
hejinbiao123
b9fab0b933 Revert "[HUDI-455] Redo hudi-client log statements using SLF4J (#1145)" (#1181)
This reverts commit e637d9ed26.
2020-01-06 21:13:29 +08:00
Sivabalan Narayanan
7031445eb3 [HUDI-377] Adding Delete() support to DeltaStreamer (#1073)
- Provides ability to perform hard deletes by writing delete marker records into the source data
- if the record contains a special field _hoodie_delete_marker set to true, deletes are performed
2020-01-04 11:07:31 -08:00
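
The delete-marker behaviour described in this commit can be illustrated with a minimal Python sketch. It is not Hudi's implementation: only the field name `_hoodie_delete_marker` comes from the commit message, while the helper `split_upserts_and_deletes` and the dict-based record layout are hypothetical.

```python
DELETE_MARKER_FIELD = "_hoodie_delete_marker"  # field name taken from the commit message


def split_upserts_and_deletes(records):
    """Partition incoming records into upserts and hard deletes (hypothetical helper)."""
    upserts, deletes = [], []
    for record in records:
        # Assumption: a record counts as a delete only when the marker is exactly True;
        # a missing or False marker leaves it as a normal upsert.
        if record.get(DELETE_MARKER_FIELD) is True:
            deletes.append(record)
        else:
            upserts.append(record)
    return upserts, deletes
```

A source batch would be passed through such a split before writing, so that marked records are routed to the delete path and the rest are upserted.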
SteNicholas
726ae47ce2 [MINOR]Optimize hudi-client module (#1139) 2020-01-04 10:57:08 -08:00
Pratyaksh Sharma
dde21e7315 [HUDI-402]: code clean up in test cases 2019-12-31 11:10:49 -08:00
hejinbiao123
e637d9ed26 [HUDI-455] Redo hudi-client log statements using SLF4J (#1145)
2019-12-31 13:49:34 +08:00
lamber-ken
ab6ae5cebb [HUDI-482] Fix missing @Override annotation on methods (#1156)
2019-12-31 11:44:56 +08:00
dengziming
2a823f32ee [MINOR]: alter some wrong params which bring fatal exception 2019-12-30 16:50:12 -08:00
lamber-ken
8440482977 Fix empty content clean plan 2019-12-29 19:03:56 -08:00
lamber-ken
2f254163d4 Skip setting commit metadata 2019-12-29 19:03:56 -08:00
lamber-ken
179837e8ef Fix checkstyle 2019-12-29 19:03:56 -08:00
lamber-ken
58c5bed40a [HUDI-453] Fix throw failed to archive commits error when writing data to MOR/COW table 2019-12-29 19:03:56 -08:00
Mathieu
3c811ec29b [MINOR] fix typos 2019-12-25 20:26:16 +08:00
Sivabalan Narayanan
9c4217a3e1 [HUDI-389] Fixing Index look up to return right partitions for a given key along with fileId with Global Bloom (#1091)
* Fixing Index look up to return partitions for a given key along with fileId with Global Bloom
* Addressing some of the comments
* Fixing test in TestHoodieGlobalBloomIndex to test the fix
2019-12-24 20:56:30 -08:00
dengziming
94aec965f5 [minor] Fix few typos in the java docs (#1132) 2019-12-24 20:44:11 -08:00
Mathieu
41f36770e0 [MINOR] fix typo 2019-12-25 06:48:15 +08:00
Thinking Chen
8172197c35 Fix Error: java.lang.IllegalArgumentException: Can not create a Path from an empty string in HoodieCopyOnWrite#deleteFilesFunc (#1126)
Same issue as in https://github.com/apache/incubator-hudi/pull/771;
this time it occurs in the HoodieCopyOnWrite deleteFilesFunc method
2019-12-24 14:29:28 +08:00
Sivabalan Narayanan
14881e99e0 [HUDI-106] Adding support for DynamicBloomFilter (#976)
- Introduced configs for bloom filter type
- Implemented dynamic bloom filter with configurable max number of keys
- BloomFilterFactory abstractions; Defaults to current simple bloom filter
2019-12-17 19:06:24 -08:00
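
As a rough illustration of the idea behind this commit, the sketch below shows one way a bloom filter can grow dynamically up to a configured maximum number of keys. It is an assumption-laden toy, not the Hudi code: the class names, the SHA-256 double-hashing scheme, and the policy of "stop growing once `max_keys` is reached and keep filling the last filter" are all hypothetical choices made for the example.

```python
import hashlib
import math


class SimpleBloomFilter:
    """Fixed-size bloom filter sized with the standard capacity/error-rate formulas."""

    def __init__(self, capacity, error_rate):
        self.num_bits = max(1, math.ceil(-capacity * math.log(error_rate) / (math.log(2) ** 2)))
        self.num_hashes = max(1, round(self.num_bits / capacity * math.log(2)))
        self.bits = bytearray((self.num_bits + 7) // 8)

    def _positions(self, key):
        # Double hashing: derive all bit positions from two 64-bit halves of one digest.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(key))


class DynamicBloomFilter:
    """Chain of simple filters that grows until a configured key budget is used up."""

    def __init__(self, keys_per_filter, error_rate, max_keys):
        self.keys_per_filter = keys_per_filter
        self.error_rate = error_rate
        # Assumption: growth is capped at max_keys // keys_per_filter filters.
        self.max_filters = max(1, max_keys // keys_per_filter)
        self.filters = [SimpleBloomFilter(keys_per_filter, error_rate)]
        self.keys_in_current = 0

    def add(self, key):
        current_full = self.keys_in_current >= self.keys_per_filter
        if current_full and len(self.filters) < self.max_filters:
            self.filters.append(SimpleBloomFilter(self.keys_per_filter, self.error_rate))
            self.keys_in_current = 0
        # Past the cap, keys still go into the last filter; only the error rate degrades.
        self.filters[-1].add(key)
        self.keys_in_current += 1

    def might_contain(self, key):
        return any(f.might_contain(key) for f in self.filters)
```

A bloom filter never produces false negatives, so every key that was added is reported as possibly present regardless of how many times the structure has grown.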
Balaji Varadarajan
9a1f698eef [HUDI-308] Avoid Renames for tracking state transitions of all actions on dataset 2019-12-15 21:26:30 -08:00
lamber-ken
ba514cfea0 [MINOR] Remove redundant plus operator (#1097) 2019-12-12 05:42:05 +08:00
Pratyaksh Sharma
3790b75e05 [HUDI-368] code clean up in TestAsyncCompaction class (#1050) 2019-12-11 05:52:41 +08:00
lamber-ken
d447e2d751 [checkstyle] Unify LOG form (#1092) 2019-12-10 19:23:38 +08:00
Wenning Ding
e555aa516d [HUDI-353] Add hive style partitioning path 2019-12-09 12:29:53 -08:00
lamber-ken
2745b7552f [HUDI-379] Refactor the codes based on new JavadocStyle code style rule (#1079) 2019-12-06 12:59:28 +08:00
lamber-ken
c06d89b648 [HUDI-378] Refactor the rest codes based on new ImportOrder code style rule (#1078) 2019-12-05 17:25:03 +08:00
lamber-ken
b3e0ebbc4a [checkstyle] Add ConstantName java checkstyle rule (#1066)
* add SimplifyBooleanExpression java checkstyle rule
* collapse empty tags in scalastyle file
2019-12-04 18:59:15 +08:00
leesf
98ab33bb6e [HUDI-294] Delete Paths written in Cleaner plan needs to be relative to partition-path (#1062)
2019-12-03 10:11:03 -08:00
ForwardXu
0b52ae3ac2 [HUDI-209] Implement JMX metrics reporter (#1045) 2019-11-28 19:17:34 +08:00
lamber-ken
da8d1334ee [HUDI-373] Refactor hudi-client based on new ImportOrder code style rule (#1056) 2019-11-28 09:25:56 +08:00
Sivabalan Narayanan
c3355109b1 [HUDI-328] Adding delete api to HoodieWriteClient (#1004)
[HUDI-328]  Adding delete api to HoodieWriteClient and Spark DataSource
2019-11-22 15:05:25 -08:00
hongdd
7bc08cbfdc [HUDI-345] Fix used deprecated function (#1024)
- Schema.parse() with new Schema.Parser().parse
- FSDataOutputStream constructor
2019-11-22 03:32:09 -08:00
Pratyaksh Sharma
1e14390719 [HUDI-350]: updated default value of config.getCleanerCommitsRetained() in javadocs 2019-11-20 06:50:04 -08:00
谢磊
804e348d0e [HUDI-346] Set allowMultipleEmptyLines to false for EmptyLineSeparator rule (#1025) 2019-11-19 18:44:42 +08:00
Nishith Agarwal
f82e58994e - Ensure that rollback instant is always created before the next commit instant.
This especially affects IncrementalPull for MOR tables since we can end up pulling in
  log blocks for uncommitted data
- Ensure that generated commit instants are 1 second apart
2019-11-17 14:11:26 -08:00
lamber-ken
045fa87a3d [HUDI-330] add EmptyStatement java checkstyle rule 2019-11-13 14:11:11 -08:00
Balaji Varadarajan
8ff06ddb0f [HUDI-80] Leverage Commit metadata to figure out partitions to be cleaned for Cleaning by commits mode (#1008) 2019-11-12 06:12:44 -08:00
Balaji Varadarajan
1032fc3e54 [HUDI-137] Hudi cleaning state changes should be consistent with compaction actions
Before this change, Cleaner performs cleaning of old file versions and then stores the deleted files in .clean files.
With this setup, we will not be able to track file deletions if a cleaner fails after deleting files but before writing .clean metadata.
This is fine for regular file-system view generation but Incremental timeline syncing relies on clean/commit/compaction metadata to keep a consistent file-system view.

Cleaner state transitions are now similar to those of compaction.

1. Requested : HoodieWriteClient.scheduleClean() selects the list of files that need to be deleted and stores them in metadata
2. Inflight : HoodieWriteClient marks the state to be inflight before it starts deleting
3. Completed : HoodieWriteClient marks the state after completing the deletion according to the cleaner plan
2019-11-11 10:40:16 -08:00
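
The three cleaner states listed in this commit message can be sketched as a small state machine. This is only an illustration under stated assumptions: `CleanState`, `CleanInstant`, and `run_clean` are hypothetical names, and the in-memory plan stands in for the metadata that the real cleaner persists.

```python
from enum import Enum


class CleanState(Enum):
    REQUESTED = "requested"
    INFLIGHT = "inflight"
    COMPLETED = "completed"


# Only forward transitions are legal: requested -> inflight -> completed.
_NEXT_STATE = {
    CleanState.REQUESTED: CleanState.INFLIGHT,
    CleanState.INFLIGHT: CleanState.COMPLETED,
}


class CleanInstant:
    """Hypothetical stand-in for a clean action tracked on the timeline."""

    def __init__(self, files_to_delete):
        # The plan is fixed at scheduling time, before any file is touched.
        self.plan = list(files_to_delete)
        self.state = CleanState.REQUESTED
        self.deleted = []

    def transition(self, new_state):
        if _NEXT_STATE.get(self.state) is not new_state:
            raise ValueError("illegal transition %s -> %s" % (self.state, new_state))
        self.state = new_state


def run_clean(instant, delete_file):
    """Execute a scheduled clean: mark inflight, delete per plan, mark completed."""
    instant.transition(CleanState.INFLIGHT)
    for path in instant.plan:
        delete_file(path)
        instant.deleted.append(path)
    instant.transition(CleanState.COMPLETED)
```

Because the plan is recorded before deletion starts, a failure midway leaves an inflight instant whose plan still lists every file that may have been removed, which is the tracking gap the commit message describes.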
pratyakshsharma
0863b1cfd9 [HUDI-245]: replaced instances of getInstants() and reverse() with getReverseOrderedInstants() (#1000) 2019-11-07 08:42:48 -08:00
Balaji Varadarajan
c23da694cc [HUDI-169] Speed up rolling back of instants (#968) 2019-10-24 19:34:00 -07:00
Balaji Varadarajan
d8be818ac9 [HUDI-130] Paths written in compaction plan needs to be relative to base-path 2019-10-23 02:52:24 -07:00
vinoth chandar
e4c91ed13f [HUDI-290] Normalize test class name of all test classes (#951) 2019-10-22 20:19:11 -07:00
vinoth chandar
dfdc0e40e1 [HUDI-283] : Ensure a sane minimum for merge buffer memory (#964)
- Some environments, e.g. spark-shell, provide 0 for memory size
- This causes unnecessary performance degradation
2019-10-20 21:00:04 -07:00
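
The fix described here amounts to clamping a computed buffer size to a floor. The sketch below shows that idea only; the function name, the `fraction` parameter, and the 100 MiB floor are hypothetical values chosen for illustration, not the ones used in the commit.

```python
MIN_MERGE_BUFFER_BYTES = 100 * 1024 * 1024  # hypothetical floor, not taken from the commit


def effective_merge_buffer_bytes(reported_bytes, fraction, floor=MIN_MERGE_BUFFER_BYTES):
    """Return the merge buffer size, never smaller than the configured floor."""
    # When the environment reports 0 bytes, the computed share is 0 and the floor wins.
    return max(int(reported_bytes * fraction), floor)
```

With a reported size of 0, the result is the floor rather than an unusably small buffer.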
Balaji Varadarajan
77f4e73615 [HUDI-121] Fix licensing issues found during RC voting by general incubator group 2019-10-16 02:09:02 -07:00
Wenning Ding
1c09f5b055 [HUDI-301] fix path error when update a non-partition MOR table 2019-10-16 02:07:33 -07:00
Udit Mehrotra
12523c379f [HUDI-298] Fix issue with incorrect column mapping causing bad data, during on-the-fly merge of Real Time tables (#956)
2019-10-16 02:05:53 -07:00
leesf
b19bed442d [HUDI-296] Explore use of spotless to auto fix formatting errors (#945)
- Add spotless format fixing to project
- One time reformatting for conformity
- Build fails on formatting violations; mvn spotless:apply auto-fixes them
2019-10-10 05:19:40 -07:00
leesf
d050d98071 [HUDI-232] Implement sealing/unsealing for HoodieRecord class (#938) 2019-10-07 10:56:46 -07:00