From 36c7db72bc172961b66cfa0b9741ac860cc03bb4 Mon Sep 17 00:00:00 2001
From: Matei Zaharia
Date: Sat, 17 Mar 2012 13:49:55 -0700
Subject: Documentation

---
 README.md | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/README.md b/README.md
index 5f9cd26df3..cde7b8a440 100644
--- a/README.md
+++ b/README.md
@@ -37,6 +37,15 @@ to connect to. This can be a Mesos URL, or "local" to run locally with one
 thread, or "local[N]" to run locally with N threads.
 
+## A Note About Hadoop
+
+Spark uses the Hadoop core library to talk to HDFS and other Hadoop-supported
+storage systems. Because the HDFS API has changed in different versions of
+Hadoop, you must build Spark against the same version that your cluster runs.
+You can change the version by setting the `HADOOP_VERSION` variable at the top
+of `project/SparkBuild.scala`, then rebuilding Spark.
+
+
 ## Configuration
 
 Spark can be configured through two files: `conf/java-opts` and
@@ -58,5 +67,8 @@ several Spark-specific variables you can set:
 
 - `SPARK_JAVA_OPTS`: Extra options to pass to JVM.
 
+- `MESOS_NATIVE_LIBRARY`: Your Mesos library, if you want to run on a Mesos
+  cluster. For example, this might be /usr/local/lib/libmesos.so on Linux.
+
 Note that `spark-env.sh` must be a shell script (it must be executable and
 start with a `#!` header to specify the shell to use).
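
A note for anyone applying this change: the `HADOOP_VERSION` variable the new
README section refers to sits near the top of `project/SparkBuild.scala`.
Below is a minimal sketch of what that looks like in an sbt build definition
of this era; the version string, project layout, and dependency line are
illustrative assumptions, not contents of this commit.

```scala
import sbt._
import Keys._

object SparkBuild extends Build {
  // Set this to the Hadoop version your cluster runs, then rebuild Spark.
  // The value below is an illustrative 0.20.x release, not the commit's.
  val HADOOP_VERSION = "0.20.205.0"

  lazy val core = Project("core", file("core")).settings(
    // The build resolves the Hadoop core artifact matching that version,
    // so Spark's HDFS client API matches what the cluster speaks.
    libraryDependencies += "org.apache.hadoop" % "hadoop-core" % HADOOP_VERSION
  )
}
```

After changing the version, rebuild Spark with your usual sbt target so the
matching Hadoop client classes are picked up.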
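The closing note about `conf/spark-env.sh` is easy to get wrong, so here is a
minimal example consistent with what the README describes: a shell script
that is executable, starts with a `#!` header, and sets the two documented
variables. The specific option and path values are placeholders, and the use
of `export` is an assumption about how the launch scripts read the file.

```sh
#!/usr/bin/env bash
# conf/spark-env.sh -- per the README: must be a shell script, executable
# (chmod +x conf/spark-env.sh), and start with a #! header.

# Extra options to pass to the JVM (placeholder value).
export SPARK_JAVA_OPTS="-Dspark.local.dir=/mnt/spark"

# Path to the Mesos native library, if running on a Mesos cluster.
# /usr/local/lib/libmesos.so is the Linux example path from the README;
# adjust it for your installation.
export MESOS_NATIVE_LIBRARY=/usr/local/lib/libmesos.so
```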