SPARK-1314: Use SPARK_HIVE to determine if we include Hive in packaging

Previously, we decided whether to include the datanucleus jars based on the existence of a spark-hive-assembly jar, which was incidentally built whenever "sbt assembly" was run. This meant that a typical and previously supported pathway would start using Hive jars.

This patch has the following features/bug fixes:

- Use of SPARK_HIVE (default false) to determine if we should include Hive in the assembly jar.
- Analogous feature in Maven with -Phive (previously, there was no support for adding Hive to any of our jars produced by Maven)
- assemble-deps fixed since we no longer use a different ASSEMBLY_DIR
- avoid adding log message in compute-classpath.sh to the classpath :)

Still TODO before mergeable:

- We need to download the datanucleus jars outside of sbt. Perhaps we can have spark-class download them if SPARK_HIVE is set, similar to how sbt downloads itself.
- Spark SQL documentation updates.

Author: Aaron Davidson <aaron@databricks.com>

Closes #237 from aarondav/master and squashes the following commits:

5dc4329 [Aaron Davidson] Typo fixes
dd4f298 [Aaron Davidson] Doc update
dd1a365 [Aaron Davidson] Eliminate need for SPARK_HIVE at runtime by d/ling datanucleus from Maven
a9269b5 [Aaron Davidson] [WIP] Use SPARK_HIVE to determine if we include Hive in packaging
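
The two build paths named in this message (SPARK_HIVE for sbt, -Phive for Maven) can be sketched as a small shell helper that only prints the command it would run. This is a rough illustration, not part of the patch: the function name `hive_build_cmd` is hypothetical, and the Maven arguments `-DskipTests clean package` are an assumed typical invocation rather than something the commit specifies.

```shell
#!/bin/sh
# Hypothetical helper (not part of the patch): print the build command for a
# given build tool, honoring SPARK_HIVE the way the commit message describes.
hive_build_cmd() {
  tool="$1"
  # SPARK_HIVE defaults to false, matching the default stated in the patch.
  with_hive="${SPARK_HIVE:-false}"
  case "$tool" in
    sbt)
      if [ "$with_hive" = "true" ]; then
        # Command taken from the updated sql-programming-guide.md.
        echo "SPARK_HIVE=true sbt/sbt assembly/assembly"
      else
        echo "sbt/sbt assembly/assembly"
      fi
      ;;
    maven)
      if [ "$with_hive" = "true" ]; then
        # -Phive comes from the patch; the remaining arguments are assumed.
        echo "mvn -Phive -DskipTests clean package"
      else
        echo "mvn -DskipTests clean package"
      fi
      ;;
    *)
      echo "unknown build tool: $tool" >&2
      return 1
      ;;
  esac
}
```

Calling `SPARK_HIVE=true hive_build_cmd sbt` prints the sbt command from the updated guide; leaving SPARK_HIVE unset prints the default build without Hive.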
author: Aaron Davidson <aaron@databricks.com> 2014-04-06 17:48:41 -0700
committer: Patrick Wendell <pwendell@gmail.com> 2014-04-06 17:48:41 -0700
commit: 4106558435889261243d186f5f0b51c5f9e98d56 (patch)
tree: 6735046be9dbc5048867a619a951c39d884f3d1f /docs/sql-programming-guide.md
parent: 7ce52c4a7a07b0db5e7c1312b1920efb1165ce6a (diff)
download: spark-4106558435889261243d186f5f0b51c5f9e98d56.tar.gz
spark-4106558435889261243d186f5f0b51c5f9e98d56.tar.bz2
spark-4106558435889261243d186f5f0b51c5f9e98d56.zip
1 files changed, 2 insertions, 2 deletions
diff --git a/docs/sql-programming-guide.md b/docs/sql-programming-guide.md
index f849716f7a..a59393e142 100644
--- a/docs/sql-programming-guide.md
+++ b/docs/sql-programming-guide.md
@@ -264,8 +264,8 @@ evaluated by the SQL execution engine.  A full list of the functions supported c
 
 Spark SQL also supports reading and writing data stored in [Apache Hive](http://hive.apache.org/).
 However, since Hive has a large number of dependencies, it is not included in the default Spark assembly.
-In order to use Hive you must first run '`SPARK_HIVE=true sbt/sbt assembly/assembly`'.  This command builds a new assembly
-jar that includes Hive. Note that this Hive assembly jar must also be present
+In order to use Hive you must first run '`SPARK_HIVE=true sbt/sbt assembly/assembly`' (or use `-Phive` for maven).
+This command builds a new assembly jar that includes Hive. Note that this Hive assembly jar must also be present
 on all of the worker nodes, as they will need access to the Hive serialization and deserialization libraries
 (SerDes) in order to acccess data stored in Hive.