author | Andrew Or <andrewor14@gmail.com> | 2014-05-12 19:44:14 -0700 |
---|---|---|
committer | Patrick Wendell <pwendell@gmail.com> | 2014-05-12 19:44:32 -0700 |
commit | b9e41f4b8b52754ea059c3334da10dbfb4f41c17 | |
tree | ac6f5ef4546183079840a2c47d380f6e8cc70e59 /docs/index.md | |
parent | 5ef24a0c561d9bc58d1a35e471c38892fc6d3dff | |
[SPARK-1753 / 1773 / 1814] Update outdated docs for spark-submit, YARN, standalone etc.
YARN
- SparkPi was updated to not take in master as an argument; we should update the docs to reflect that.
- The default YARN build guide should be in maven, not sbt.
- This PR also adds a paragraph on steps to debug a YARN application.
Standalone
- Emphasize spark-submit more. Right now it's one small paragraph preceding the legacy way of launching through `org.apache.spark.deploy.Client`.
- The way we set configurations / environment variables according to the old docs is outdated. This needs to reflect changes introduced by the Spark configuration changes we made.
In general, this PR also adds a little more documentation on the new spark-shell, spark-submit, spark-defaults.conf etc here and there.
Author: Andrew Or <andrewor14@gmail.com>
Closes #701 from andrewor14/yarn-docs and squashes the following commits:
e2c2312 [Andrew Or] Merge in changes in #752 (SPARK-1814)
25cfe7b [Andrew Or] Merge in the warning from SPARK-1753
a8c39c5 [Andrew Or] Minor changes
336bbd9 [Andrew Or] Tabs -> spaces
4d9d8f7 [Andrew Or] Merge branch 'master' of github.com:apache/spark into yarn-docs
041017a [Andrew Or] Abstract Spark submit documentation to cluster-overview.html
3cc0649 [Andrew Or] Detail how to set configurations + remove legacy instructions
5b7140a [Andrew Or] Merge branch 'master' of github.com:apache/spark into yarn-docs
85a51fc [Andrew Or] Update run-example, spark-shell, configuration etc.
c10e8c7 [Andrew Or] Merge branch 'master' of github.com:apache/spark into yarn-docs
381fe32 [Andrew Or] Update docs for standalone mode
757c184 [Andrew Or] Add a note about the requirements for the debugging trick
f8ca990 [Andrew Or] Merge branch 'master' of github.com:apache/spark into yarn-docs
924f04c [Andrew Or] Revert addition of --deploy-mode
d5fe17b [Andrew Or] Update the YARN docs
(cherry picked from commit 2ffd1eafd28635dcecc0ac738d4a62c05d740925)
Signed-off-by: Patrick Wendell <pwendell@gmail.com>
Diffstat (limited to 'docs/index.md')
-rw-r--r-- | docs/index.md | 34 |
1 files changed, 22 insertions, 12 deletions
diff --git a/docs/index.md b/docs/index.md
index a2f1a84371..48182a27d2 100644
--- a/docs/index.md
+++ b/docs/index.md
@@ -24,21 +24,31 @@ right version of Scala from [scala-lang.org](http://www.scala-lang.org/download/

 # Running the Examples and Shell

-Spark comes with several sample programs. Scala, Java and Python examples are in the `examples/src/main` directory.
-To run one of the Java or Scala sample programs, use `./bin/run-example <class> <params>` in the top-level Spark directory
-(the `bin/run-example` script sets up the appropriate paths and launches that program).
-For example, try `./bin/run-example org.apache.spark.examples.SparkPi local`.
-To run a Python sample program, use `./bin/pyspark <sample-program> <params>`. For example, try `./bin/pyspark ./examples/src/main/python/pi.py local`.
+Spark comes with several sample programs. Scala, Java and Python examples are in the
+`examples/src/main` directory. To run one of the Java or Scala sample programs, use
+`bin/run-example <class> [params]` in the top-level Spark directory. (Behind the scenes, this
+invokes the more general
+[Spark submit script](cluster-overview.html#launching-applications-with-spark-submit) for
+launching applications). For example,

-Each example prints usage help when run with no parameters.
+    ./bin/run-example SparkPi 10

-Note that all of the sample programs take a `<master>` parameter specifying the cluster URL
-to connect to. This can be a [URL for a distributed cluster](scala-programming-guide.html#master-urls),
-or `local` to run locally with one thread, or `local[N]` to run locally with N threads. You should start by using
-`local` for testing.
+You can also run Spark interactively through modified versions of the Scala shell. This is a
+great way to learn the framework.

-Finally, you can run Spark interactively through modified versions of the Scala shell (`./bin/spark-shell`) or
-Python interpreter (`./bin/pyspark`). These are a great way to learn the framework.
+    ./bin/spark-shell --master local[2]
+
+The `--master` option specifies the
+[master URL for a distributed cluster](scala-programming-guide.html#master-urls), or `local` to run
+locally with one thread, or `local[N]` to run locally with N threads. You should start by using
+`local` for testing. For a full list of options, run Spark shell with the `--help` option.
+
+Spark also provides a Python interface. To run an example Spark application written in Python, use
+`bin/pyspark <program> [params]`. For example,
+
+    ./bin/pyspark examples/src/main/python/pi.py local[2] 10
+
+or simply `bin/pyspark` without any arguments to run Spark interactively in a python interpreter.

 # Launching on a Cluster

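
The `--master` values described in the diff above (`local` for one thread, `local[N]` for N threads, anything else a cluster URL) can be illustrated with a small sketch. This is not Spark's own parsing code: `local_thread_count` is a hypothetical helper written only for this note, it covers just the forms the new docs text names, and `spark://host:7077` is a made-up cluster URL.

```python
import re

# Hypothetical helper, not part of Spark. It only mirrors the master URL
# forms named in the docs text: `local` and `local[N]`.
_LOCAL_N = re.compile(r"^local\[(\d+)\]$")


def local_thread_count(master):
    """Return the thread count for a local master URL, or None otherwise."""
    if master == "local":
        # Plain `local` means run locally with one thread.
        return 1
    match = _LOCAL_N.match(master)
    if match:
        # `local[N]` means run locally with N threads.
        return int(match.group(1))
    # Anything else is assumed here to be a URL for a distributed cluster.
    return None


if __name__ == "__main__":
    # `spark://host:7077` is an illustrative placeholder, not a real cluster.
    for url in ("local", "local[2]", "spark://host:7077"):
        print(url, "->", local_thread_count(url))
```

Under these assumptions, the `local[2]` passed to `spark-shell` and `pi.py` in the new docs text corresponds to two local threads.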