author    Matei Zaharia <matei@eecs.berkeley.edu>  2013-08-30 15:04:43 -0700
committer Matei Zaharia <matei@eecs.berkeley.edu>  2013-08-30 15:04:43 -0700
commit    4293533032bd5c354bb011f8d508b99615c6e0f0 (patch)
tree      e82fd2cc72c90ed98f5b0f1f4a74593cf3e6c54b /docs/scala-programming-guide.md
parent    f3a964848dd2ba65491f3eea8a54439069aa1b29 (diff)
Update docs about HDFS versions
Diffstat (limited to 'docs/scala-programming-guide.md')
-rw-r--r--  docs/scala-programming-guide.md  14
1 file changed, 11 insertions, 3 deletions
diff --git a/docs/scala-programming-guide.md b/docs/scala-programming-guide.md
index db584d2096..e321b8f5b8 100644
--- a/docs/scala-programming-guide.md
+++ b/docs/scala-programming-guide.md
@@ -17,15 +17,23 @@ This guide shows each of these features and walks through some samples. It assum
# Linking with Spark
-To write a Spark application, you will need to add both Spark and its dependencies to your CLASSPATH. If you use sbt or Maven, Spark is available through Maven Central at:
+Spark {{site.SPARK_VERSION}} uses Scala {{site.SCALA_VERSION}}. If you write applications in Scala, you'll need to use this same version of Scala in your program -- newer major versions may not work.
+
+To write a Spark application, you need to add a dependency on Spark. If you use SBT or Maven, Spark is available through Maven Central at:
groupId = org.spark-project
artifactId = spark-core_{{site.SCALA_VERSION}}
version = {{site.SPARK_VERSION}}
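As an illustrative sketch (not part of this patch), the Maven coordinates above could be expressed in an sbt build definition roughly as follows; the angle-bracketed versions are placeholders for the {{site.SCALA_VERSION}} and {{site.SPARK_VERSION}} values shown in the guide:

    // build.sbt -- hypothetical fragment; substitute real versions
    scalaVersion := "<scala-version>"

    // %% appends the Scala version suffix, matching spark-core_<scala-version>
    libraryDependencies += "org.spark-project" %% "spark-core" % "<spark-version>"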
-For other build systems or environments, you can run `sbt/sbt assembly` to build both Spark and its dependencies into one JAR (`core/target/spark-core-assembly-0.6.0.jar`), then add this to your CLASSPATH.
+In addition, if you wish to access an HDFS cluster, you need to add a dependency on `hadoop-client` for your version of HDFS:
+
+ groupId = org.apache.hadoop
+ artifactId = hadoop-client
+ version = <your-hdfs-version>
+
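For illustration only (again, not part of the patch), the hadoop-client coordinate above might look like this in sbt, with the HDFS version left as a placeholder:

    // hypothetical build.sbt fragment; "<your-hdfs-version>" must match your cluster
    libraryDependencies += "org.apache.hadoop" % "hadoop-client" % "<your-hdfs-version>"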
+For other build systems, you can run `sbt/sbt assembly` to pack Spark and its dependencies into one JAR (`assembly/target/scala-{{site.SCALA_VERSION}}/spark-assembly-{{site.SPARK_VERSION}}-hadoop*.jar`), then add this to your CLASSPATH. Set the HDFS version as described [here](index.html#a-note-about-hadoop-versions).
-In addition, you'll need to import some Spark classes and implicit conversions. Add the following lines at the top of your program:
+Finally, you need to import some Spark classes and implicit conversions into your program. Add the following lines:
{% highlight scala %}
import spark.SparkContext