1
0
Commit Graph

77 Commits

Author SHA1 Message Date
Balaji Varadarajan
9b66ea41fd [HUDI-121] Remove leftover notice file and replace com.uber.hoodie with org.apache.hudi in log4j properties 2019-10-04 09:18:57 -07:00
leesf
7dd9c74b1b [HUDI-285] Implement HoodieStorageWriter based on actual file type (#936) 2019-10-04 07:45:16 -07:00
leesf
3dedc7e5fd [HUDI-265] Failed to delete tmp dirs created in unit tests (#928) 2019-10-03 09:48:13 -07:00
Balaji Varadarajan
6da2f9ac7c [HUDI-287] Address comments during review of release candidate
1. Remove LICENSE and NOTICE files in hoodie child modules.
  2. Remove developers and contributor section from pom
  3. Also ensure any failures in validation script is reported appropriately
  4. Make hoodie parent pom consistent with that of its parent apache-21 (https://github.com/apache/maven-apache-parent/blob/apache-21/pom.xml)
2019-10-03 09:00:07 -07:00
Balaji Varadarajan
6e8a28bcae HUDI-121 : Address comments during RC2 voting
1. Remove dnl utils jar from git
2. Add LICENSE Headers in missing files
3. Fix NOTICE and LICENSE in all HUDI packages and in top-level
4. Fix License wording in certain HUDI source files
5. Include non java/scala code in RAT licensing check
6. Use whitelist to include dependencies as part of timeline-server bundling
2019-09-30 15:42:15 -07:00
vinoyang
01e803b00e [HUDI-247] Unify the re-initialization of HoodieTableMetaClient in test for hoodie-client module (#930) 2019-09-30 05:38:52 -07:00
Balaji Varadarajan
2ea8b0c3f1 [HUDI-279] Fix regression in Schema Evolution due to PR-755 2019-09-25 22:53:43 -07:00
vinoyang
f020d029c4 HUDI-267 Refactor bad method name HoodieTestUtils#initTableType and HoodieTableMetaClient#initializePathAsHoodieDataset (#916) 2019-09-21 09:05:02 -07:00
Balaji Varadarajan
2c6da09d9d [HUDI-257] Fix Bloom Index unit-test failures 2019-09-17 09:41:15 -07:00
Taher Koitawala
c0f42afa35 [HUDI-62] Index Lookup Timer added to HoodieWriteClient 2019-09-16 11:12:52 -07:00
Balaji Varadarajan
7190c022bb [HUDI-249] Updating Notice files 2019-09-13 13:50:58 -07:00
Nishith Agarwal
0b032b2761 Fix requested eompaction rollback during restore command 2019-09-13 12:40:13 -07:00
Balaji Varadarajan
58623631d4 [HUDI-249] Update Release-notes. Add sign-artifacts to POM and release related scripts. Add missing license headers 2019-09-13 08:41:29 -07:00
yanghua
895d732a14 refactor code 2019-09-12 05:15:07 -07:00
yanghua
5f04241fce refactor code: add docs and init/cleanup resource group for hoodie client test base 2019-09-12 05:15:07 -07:00
yanghua
80c27f2351 Optimize hoodie client after implementat auto closeable interface 2019-09-12 05:15:07 -07:00
yanghua
90bfb900aa revert setting jsc spark configuration 2019-09-12 05:15:07 -07:00
yanghua
6f2b166005 [HUDI-217] Provide a unified resource management class to standardize the resource allocation and release for hudi client test cases 2019-09-12 05:15:07 -07:00
Bhavani Sudha Saktheeswaran
64df98fc4a [HUDI-164] Fixes incorrect averageBytesPerRecord
When number of records written is zero, averageBytesPerRecord results in a huge size (division by zero and ceiled to Long.MAX_VALUE) causing OOM. This commit fixes this issue by reverse traversing the commits until a more reasonable average record size can be computed and if that is not possible returns the default configured record size.
2019-09-11 15:20:25 -07:00
Balaji Varadarajan
93bc5e2153 HUDI-243 Rename HoodieInputFormat and HoodieRealtimeInputFormat to HoodieParquetInputFormat and HoodieParquetRealtimeInputFormat 2019-09-11 14:03:01 -07:00
vinoth chandar
7a973a6944 [HUDI-159] Redesigning bundles for lighter-weight integrations
- Documented principles applied for redesign at packaging/README.md
 - No longer depends on incl commons-codec, commons-io, commons-pool, commons-dbcp, commons-lang, commons-logging, avro-mapred
 - Introduce new FileIOUtils & added checkstyle rule for illegal import of above
 - Parquet, Avro dependencies moved to provided scope to enable being picked up from Hive/Spark/Presto instead
 - Pickup jackson jars for Hive sync tool from HIVE_HOME & unbundling jackson everywhere
 - Remove hive-jdbc standalone jar from being bundled in Spark/Hive/Utilities bundles
 - 6.5x reduced number of classes across bundles
2019-09-11 11:08:27 -07:00
Balaji Varadarajan
a6908ef44d HUDI-170 Updating hoodie record before inserting it into ExternalSpillableMap (#866) 2019-08-30 09:03:37 -07:00
Balaji Varadarajan
5f9fa82f47 HUDI-124 : Exclude jdk.tools from hadoop-common and update Notice files (#858) 2019-08-28 16:20:47 -07:00
vinoth chandar
cd090871a1 [HUDI-159]: Pom cleanup and removal of com.twitter.parquet
- Redo all classes based on org.parquet only
 - remove unuused dependencies like parquet-hadoop, common-configuration2
 - timeline-service does not build a fat jar anymore
 - Fix utilities and hadoop-mr bundles based on above
2019-08-25 16:01:14 -07:00
leesf
1b79ef7672 HUDI-212: Specify Charset to UTF-8 for IOUtils.toString (#837) 2019-08-16 08:27:19 -07:00
Balaji Varadarajan
4787076c6d HUDI-204 : Make MOR rollback idempotent and disable using rolling stats for small file selection (#833) 2019-08-13 17:13:30 -07:00
Balaji Varadarajan
a4f9d7575f HUDI-123 Rename code packages/constants to org.apache.hudi (#830)
- Rename com.uber.hoodie to org.apache.hudi
- Flag to pass com.uber.hoodie Input formats for hoodie-sync
- Works with HUDI demo. 
- Also tested for backwards compatibility with datasets built by com.uber.hoodie packages
- Migration guide : https://cwiki.apache.org/confluence/display/HUDI/Migration+Guide+From+com.uber.hoodie+to+org.apache.hudi
2019-08-11 17:48:17 -07:00