49 Commits

Author SHA1 Message Date
vinothchandar
8f1d362015 Fixing deps & serialization for RTView
- hoodie-hadoop-mr now needs objectsize bundled
- Also updated docs with additional tuning tips
2018-06-10 19:16:44 -07:00
Vinoth Chandar
85dd265b7b Improving out of box experience for data source
- Fixes #246
- Bump up default parallelism to 1500, to handle large upserts
- Add docs on s3 configuration & tuning tips with tested spark knobs
- Fix bug to not duplicate hoodie metadata fields when input dataframe is another hoodie dataset
- Improve speed of ROTablePathFilter by removing directory check
- Move to spark-avro 4.0 to handle issue with nested fields with same name
- Keep AvroConversionUtils in sync with spark-avro 4.0
2018-06-10 19:16:44 -07:00
Balaji Varadarajan
c66004d79a Add Support for ordering and limiting results in CLI show commands 2018-04-26 09:30:05 -07:00
vinoth chandar
fa73a911cc Update Gemfile.lock 2018-04-19 14:20:50 -07:00
Balaji Varadarajan
788e4f2d2e CodeStyle formatting to conform to basic Checkstyle rules.
The code-style rules follow google style with some changes:

1. Increase line length from 100 to 120
2. Disable JavaDoc related checkstyles as this needs more manual work.

Both source and test code are checked for code-style
2018-03-30 11:09:40 -07:00
Nishith Agarwal
987f5d6b96 Making ExternalSpillableMap generic for any datatype
- Introduced concept of converters to be able to serde generic datatype for SpillableMap
- Fixed/Added configs to Hoodie Configs
- Changed HoodieMergeHandle to start using SpillableMap
2018-03-28 07:56:07 -07:00
Nishith Agarwal
5405a6287b Introducing HoodieLogFormat V2 with versioning support
- HoodieLogFormat V2 has support for LogFormat evolution through versioning
  - LogVersion is associated with a LogBlock not a LogFile
  - Based on a version for a LogBlock, appropriate code path is executed
- Implemented LazyReading of Hoodie Log Blocks with Memory / IO tradeoff
- Implemented Reverse pointer to be able to traverse the log in reverse
- Introduce new MAGIC for backwards compatibility with logs without versions
2018-03-06 21:14:11 -08:00
Nishith Agarwal
d495484399 Write smaller sized multiple blocks to log file instead of a large one
- Use SizeEstimator to size number of records to write
- Configurable block size
- Configurable log file size
2018-02-23 07:31:39 -08:00
Vinoth Chandar
85d32930cd Update Gemfile.lock 2018-01-18 00:07:23 -08:00
Vinoth Chandar
5a62480a92 Update docs on code style setup 2017-11-12 23:19:02 -08:00
Vinoth Chandar
274aaf49fe Incorporating code review feedback for DataSource 2017-10-02 20:44:53 -07:00
Vinoth Chandar
64e0573aca Adding hoodie-spark to support Spark Datasource for Hoodie
- Write with COW/MOR paths work fully
- Read with RO view works on both storages*
- Incremental view supported on COW
- Refactored out HoodieReadClient methods, to just contain key based access
- HoodieDataSourceHelpers class can be now used to construct inputs to datasource
- Tests in hoodie-client using new helpers and mechanisms
- Basic tests around save modes & insert/upserts (more to follow)
- Bumped up scala to 2.11, since 2.10 is deprecated & complains with scalatest
- Updated documentation to describe usage
- New sample app written using the DataSource API
2017-10-02 20:44:53 -07:00
Vinoth Chandar
86209640f7 Adding range based pruning to bloom index
- keys compared lexicographically using String::compareTo
- Range metadata additionally written into parquet file footers
- Trim fat & few optimizations to speed up indexing
- Add param to control whether input shall be cached, to speed up lookup
- Add param to turn on/off range pruning
- Auto compute of parallelism now simply factors in amount of comparisons done
- More accurate parallelism computation when range pruning is on
- tests added & hardened, docs updated
2017-08-04 13:22:13 -07:00
Vinoth Chandar
cf1dde0323 Add recent talks/presentations to documentation 2017-07-08 22:47:15 -07:00
Vinoth Chandar
e8b3ddd7cb Add note on community engagement to committership guidelines 2017-07-08 22:47:15 -07:00
Prasanna Rajaperumal
e44f9b889b Added CHANGELOG.md and updated community contributions guideline 2017-06-16 10:48:37 -07:00
Prasanna Rajaperumal
4b26be9f61 Fixes to RealtimeInputFormat and RealtimeRecordReader and update documentation for HiveSyncTool 2017-06-15 18:21:07 -07:00
Zeeshan Qureshi
43a55b09fd Add GCS to supported filesystems 2017-05-18 10:30:34 -07:00
vinoth chandar
1b0a027942 Update community.md with committership guidelines 2017-05-04 17:25:54 -07:00
Vinoth Chandar
b4e787ce1d Update docs 2017-05-01 21:48:27 -07:00
Vinoth Chandar
848814bece Adding docs for deltastreamer, hivesync tool usage 2017-04-03 21:27:49 -07:00
Vinoth Chandar
2b6322318c CR feedback 2017-04-03 18:28:01 -07:00
Vinoth Chandar
e0fc4ec38e Documentation update + helper method for WriteConfig builder 2017-04-03 18:28:01 -07:00
Yash Sharma
e3b273e9fd formatting for docs 2017-03-28 05:08:54 -07:00
Yash Sharma
bca7e7dae4 Improve documentation 2017-03-28 05:08:54 -07:00
Yash Sharma
d6f94b998d Hoodie operability with S3 2017-03-28 05:08:54 -07:00
prazanna
a7cd021f26 Update incremental pull query documentation 2017-03-23 16:20:54 -07:00
ovj
b02910c588 few fixes to quick start document (#112) 2017-03-22 18:25:26 -07:00
Vinoth Chandar
c3257b9680 Fix ugly favicon 2017-03-12 20:30:42 -07:00
Vinoth Chandar
b252633fab Add analytics to site 2017-03-12 20:30:42 -07:00
Prasanna Rajaperumal
41e08018fc Fixing minor documentation fixes 2017-03-02 11:42:04 -08:00
Prasanna Rajaperumal
d84aea3512 Fixing minor documentation fixes 2017-03-02 11:39:40 -08:00
prazanna
8a2a9ae764 Making minor documentation fixes 2017-03-02 11:35:09 -08:00
Vinoth Chandar
33a85900f8 Adding admin guide, guide for sql queries and incr processing 2017-02-19 20:33:21 -08:00
Vinoth Chandar
dcc15d5d6f Adding docs for running sql queries on hoodie datasets 2017-02-19 20:33:21 -08:00
vinoth chandar
66e272e9eb Docs for performance section (#80)
* Adding performance section

* minor edit to perf section
2017-02-17 18:30:56 -08:00
vinoth chandar
c7a8e15c78 Docs for impl & comparison (#79)
* Initial version of comparison, implementation

* Finished doc for comparison to other systems
2017-02-17 08:25:17 -08:00
Prasanna Rajaperumal
a725382549 Add Configurations to Documentation 2017-02-06 14:12:18 -08:00
vinoth chandar
186d70713f Adding documentation around Hoodie concepts (#71) 2017-01-29 15:22:57 -08:00
Vinoth Chandar
40a63fcab4 Shorten README and point to site 2017-01-09 11:30:46 -08:00
vinoth chandar
1559a3826f Minor fixes to use_cases.md 2017-01-06 00:06:15 -08:00
Vinoth Chandar
5a7d408c3c Fixing the eyesore red font color over sidebar for docs 2017-01-05 23:55:53 -08:00
vinoth chandar
64858239d1 Update use_cases.md 2017-01-04 23:58:09 -08:00
Vinoth Chandar
958f7ceda6 Adding Documentation for Getting Started Section
- Overview, Use Cases, Powered By are very detailed
- Cleaned up QuickStart
- Redistribute the content from README to correct pages to be improved upon
- Switch to blue theme
2017-01-04 20:50:44 -08:00
prazanna
040c7aa766 Delete CNAME 2017-01-03 18:24:36 -08:00
prazanna
7da5e9a03a Create CNAME 2017-01-03 17:26:28 -08:00
vinoth chandar
d525687512 Delete CNAME 2016-12-30 11:26:44 -08:00
vinoth chandar
e73686be69 Create CNAME 2016-12-30 11:24:24 -08:00
Vinoth Chandar
2bf0db14c6 Adding docs folder, with skeleton jekyll based site
- Uses https://github.com/tomjohnson1492/documentation-theme-jekyll
- Have filler pages
2016-12-30 11:05:22 -08:00