---
title: Talks & Powered By
keywords: talks
sidebar: mydoc_sidebar
permalink: powered_by.html
toc: false
---
## Adoption

#### Uber
Hoodie was originally developed at Uber to achieve low-latency database ingestion with high efficiency. It has been in production since Aug 2016, powering ~100 highly business-critical tables on Hadoop worth 100s of TBs, including the top 10 tables such as trips, riders, and partners. It also powers several incremental Hive ETL pipelines and is currently being integrated into Uber's data dispersal system.
## Talks & Presentations
- "Hoodie: Incremental processing on Hadoop at Uber" - By Vinoth Chandar & Prasanna Rajaperumal, Mar 2017, Strata + Hadoop World, San Jose, CA
- "Hoodie: An Open Source Incremental Processing Framework From Uber" - By Vinoth Chandar, Apr 2017, DataEngConf, San Francisco, CA. Slides Video
- "Incremental Processing on Large Analytical Datasets" - By Prasanna Rajaperumal, June 2017, Spark Summit 2017, San Francisco, CA. Slides Video
- "Hudi: Unifying storage and serving for batch and near-real-time analytics" - By Nishith Agarwal & Balaji Vardarajan, September 2018, Strata Data Conference, New York, NY
- ["Hudi: Large-Scale, Near Real-Time Pipelines at Uber"](https://databricks.com/session/hudi-near-real-time-spark-pipelines-at-petabyte-scale) - By Vinoth Chandar & Nishith Agarwal, October 2018, Spark+AI Summit Europe, London, UK
## Articles
- "The Case for incremental processing on Hadoop" - O'Reilly Ideas article by Vinoth Chandar
- "Hoodie: Uber Engineering's Incremental Processing Framework on Hadoop" - Uber Engineering Blog article by Prasanna Rajaperumal