From 52ae952574f5d641a398dd185e09e5a79318c8a9 Mon Sep 17 00:00:00 2001
From: Cheng Lian
Date: Mon, 17 Aug 2015 17:25:14 -0700
Subject: [SPARK-9974] [BUILD] [SQL] Makes sure com.twitter:parquet-hadoop-bundle:1.6.0 is in SBT assembly jar

PR #7967 enables Spark SQL to persist Parquet tables in Hive compatible format when possible. One consequence is that we have to set the input/output format classes to `MapredParquetInputFormat`/`MapredParquetOutputFormat`, which rely on com.twitter:parquet-hadoop:1.6.0 bundled with Hive 1.2.1.

When loading such a table in Spark SQL, `o.a.h.h.ql.metadata.Table` first loads these input/output format classes, and thus classes in com.twitter:parquet-hadoop:1.6.0. However, the scope of this dependency is defined as "runtime", so it is not packaged into the Spark assembly jar. This results in a `ClassNotFoundException`.

This issue can be worked around by asking users to add parquet-hadoop 1.6.0 via the `--driver-class-path` option. However, considering that the Maven build is immune to this problem, I feel this can be confusing and inconvenient for users.

So this PR fixes the issue by changing the scope of parquet-hadoop 1.6.0 to "compile".

Author: Cheng Lian

Closes #8198 from liancheng/spark-9974/bundle-parquet-1.6.0.
---
 pom.xml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/pom.xml b/pom.xml
index cfd7d32563..9bfca1c417 100644
--- a/pom.xml
+++ b/pom.xml
@@ -1598,7 +1598,7 @@
         <groupId>com.twitter</groupId>
         <artifactId>parquet-hadoop-bundle</artifactId>
         <version>${hive.parquet.version}</version>
-        <scope>runtime</scope>
+        <scope>compile</scope>
       </dependency>
       <dependency>
         <groupId>org.apache.flume</groupId>
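
To check whether a given assembly jar is affected by the problem described in the commit message, one can look for a Parquet class entry inside the jar. The following is a minimal Python sketch and is not part of this patch: the helper name `jar_contains_class` is hypothetical, and `parquet.hadoop.ParquetInputFormat` is assumed (not confirmed by the commit message) to be one of the classes shipped in parquet-hadoop-bundle 1.6.0.

```python
import zipfile


def jar_contains_class(jar_path: str, class_name: str) -> bool:
    """Return True if the jar has a .class entry for the dotted class name.

    Hypothetical helper for illustration only. A jar is a zip archive, so
    the class `a.b.C` is stored under the entry `a/b/C.class`.
    """
    entry = class_name.replace(".", "/") + ".class"
    with zipfile.ZipFile(jar_path) as jar:
        try:
            jar.getinfo(entry)  # raises KeyError when the entry is absent
            return True
        except KeyError:
            return False
```

Calling `jar_contains_class` with the path to an SBT-built assembly jar and the class name `parquet.hadoop.ParquetInputFormat` (an assumed class name, see above) should return `False` for a build affected by this issue and `True` once the dependency is packaged.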