path: root/README.md
author    Olivier Grisel <olivier.grisel@ensta.org>    2011-06-23 02:24:04 +0200
committer Olivier Grisel <olivier.grisel@ensta.org>    2011-06-23 02:24:04 +0200
commit    236bcd0d9b41987b771500b0643e1b73a87a3ca4 (patch)
tree      3625a76249a15037778bf1a0664f1aa86a6c6372 /README.md
parent    214250016a789aa1a7da9a1ccaa4efc9c5587926 (diff)
Markdown rendering for the toplevel README.md to improve readability on github
Diffstat (limited to 'README.md')
-rw-r--r--    README.md    59
1 file changed, 59 insertions, 0 deletions
diff --git a/README.md b/README.md
new file mode 100644
index 0000000000..1cef2365b1
--- /dev/null
+++ b/README.md
@@ -0,0 +1,59 @@
+# Spark
+
+Lightning-Fast Cluster Computing - <http://www.spark-project.org/>
+
+
+## Online Documentation
+
+You can find the latest Spark documentation, including a programming
+guide, on the project wiki at <http://github.com/mesos/spark/wiki>. This
+file only contains basic setup instructions.
+
+
+## Building
+
+Spark requires Scala 2.8. This version has been tested with 2.8.1.final.
+Experimental support for Scala 2.9 is available in the `scala-2.9` branch.
+
+The project is built using Simple Build Tool (SBT), which is packaged with it.
+To build Spark and its example programs, run:
+
+ sbt/sbt update compile
+
+To run Spark, you will need to have Scala's `bin` directory on your
+`$PATH`, or set the `SCALA_HOME` environment variable to point to your
+Scala installation. Scala must be accessible through one of these
+methods on the Mesos slave nodes as well as on the master.
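+For example, if Scala were installed under a path such as
+`/usr/local/scala-2.8.1.final` (an illustrative location, not a requirement),
+you could add the following to your shell profile:
+
+    export SCALA_HOME=/usr/local/scala-2.8.1.final
+    export PATH=$SCALA_HOME/bin:$PATH
+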
+
+To run one of the examples, use `./run <class> <params>`. For example:
+
+ ./run spark.examples.SparkLR local[2]
+
+will run the Logistic Regression example locally on 2 CPUs.
+
+Each of the example programs prints usage help if no params are given.
+
+All of the Spark examples take a `<host>` parameter that is the Mesos master
+to connect to. This can be a Mesos master URL, `local` to run locally with
+one thread, or `local[N]` to run locally with N threads.
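+For instance, using the `./run` script described above, the same example can
+be launched with different local parallelism:
+
+    ./run spark.examples.SparkLR local      # run locally with one thread
+    ./run spark.examples.SparkLR local[8]   # run locally with eight threads
+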
+
+
+## Configuration
+
+Spark can be configured through two files: `conf/java-opts` and
+`conf/spark-env.sh`.
+
+In `java-opts`, you can add flags to be passed to the JVM when running Spark.
+
+In `spark-env.sh`, you can set any environment variables you wish to be available
+when running Spark programs, such as `PATH`, `SCALA_HOME`, etc. There are also
+several Spark-specific variables you can set:
+- `SPARK_CLASSPATH`: Extra entries to be added to the classpath, separated by ":".
+- `SPARK_MEM`: Memory for Spark to use, in the format used by Java's `-Xmx`
+  option (for example, `-Xmx200m` means 200 MB, `-Xmx1g` means 1 GB, etc.).
+- `SPARK_LIBRARY_PATH`: Extra entries to add to `java.library.path` for locating
+ shared libraries.
+- `SPARK_JAVA_OPTS`: Extra options to pass to the JVM.
+
+Note that `spark-env.sh` must be a shell script (it must be executable and start
+with a `#!` header to specify the shell to use).
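+For illustration, a minimal `conf/spark-env.sh` might look like this (all
+paths and values are examples, not recommendations):
+
+    #!/usr/bin/env bash
+    export SCALA_HOME=/usr/local/scala-2.8.1.final
+    export SPARK_MEM=2g
+    export SPARK_CLASSPATH=/opt/myapp/extra.jar
+
+Remember to make the file executable, e.g. with `chmod +x conf/spark-env.sh`.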