
Update RELEASE_NOTES for 0.4.5

vinothchandar
2019-02-27 06:46:22 -08:00
committed by vinoth chandar
parent 75c7a2622b
commit 080b7d4d9b

@@ -1,3 +1,63 @@
Release 0.4.5
------------------------------------
### Highlights
* Dockerized demo with support for different Hive versions
* Smoother handling of append log on cloud stores
* Introduced a global bloom index that enforces a uniqueness constraint across partitions
* CLI commands to analyze workloads, manage compactions
* Migration guide for folks wanting to move datasets to Hudi
* Added Spark Structured Streaming support, with a Hudi sink
* In-built support for filtering duplicates in DeltaStreamer
* Support for plugging in custom transformation in DeltaStreamer
* Better support for non-partitioned Hive tables
* Support hard deletes for Merge on Read storage
* New Slack URL and site URLs
* Added Presto bundle for easier integration
* Tons of bug fixes, reliability improvements
### Full PR List
* **@bhasudha** - Create hoodie-presto bundle jar. fixes #567 #571
* **@bhasudha** - Close FSDataInputStream for meta file open in HoodiePartitionMetadata. Fixes issue #573 #574
* **@yaooqinn** - handle no such element exception in HoodieSparkSqlWriter #576
* **@vinothchandar** - Update site url in README
* **@yaooqinn** - typo: bundle jar with unrecognized variables #570
* **@bvaradar** - Table rollback for inflight compactions MUST not delete instant files at any time to avoid race conditions #565
* **@bvaradar** - Fix Hoodie Record Reader to work with non-partitioned dataset (ISSUE-561) #569
* **@bvaradar** - Hoodie Delta Streamer Features : Transformation and Hoodie Incremental Source with Hive integration #485
* **@vinothchandar** - Updating new slack signup link #566
* **@yaooqinn** - Using immutable map instead of mutables to generate parameters #559
* **@n3nash** - Fixing behavior of buffering in Create/Merge handles for invalid/wrong schema records #558
* **@n3nash** - cleaner should now use commit timeline and not include deltacommits #539
* **@n3nash** - Adding compaction to HoodieClient example #551
* **@n3nash** - Filtering partition paths before performing a list status on all partitions #541
* **@n3nash** - Passing a path filter to avoid including folders under .hoodie directory as partition paths #548
* **@n3nash** - Enabling hard deletes for MergeOnRead table type #538
* **@msridhar** - Add .m2 directory to Travis cache #534
* **@artem0** - General enhancements #520
* **@bvaradar** - Ensure Hoodie works for non-partitioned Hive table #515
* **@xubo245** - fix some spelling errors in Hudi #530
* **@leletan** - feat(SparkDataSource): add structured streaming sink #486
* **@n3nash** - Serializing the complete payload object instead of serializing just the GenericRecord in HoodieRecordConverter #495
* **@n3nash** - Returning empty Statuses for an empty spark partition caused by incorrect bin packing #510
* **@bvaradar** - Avoid WriteStatus collect() call when committing batch to prevent Driver side OOM errors #512
* **@vinothchandar** - Explicitly handle lack of append() support during LogWriting #511
* **@n3nash** - Fixing number of insert buckets to be generated by rounding off to the closest greater integer #500
* **@vinothchandar** - Enabling auto tuning of insert splits by default #496
* **@bvaradar** - Useful Hudi CLI commands to debug/analyze production workloads #477
* **@bvaradar** - Compaction validate, unschedule and repair #481
* **@shangxinli** - Fix addMetadataFields() to carry over 'props' #484
* **@n3nash** - Adding documentation for migration guide and COW vs MOR tradeoffs #470
* **@leletan** - Add additional feature to drop later arriving dups #468
* **@bvaradar** - Fix regression bug which broke HoodieInputFormat handling of non-hoodie datasets #482
* **@vinothchandar** - Add --filter-dupes to DeltaStreamer #478
* **@bvaradar** - A quickstart demo to showcase Hudi functionalities using docker along with support for integration-tests #455
* **@bvaradar** - Ensure Hoodie metadata folder and files are filtered out when constructing Parquet Data Source #473
* **@leletan** - Adds HoodieGlobalBloomIndex #438
Release 0.4.4
------------------------------------