[SPARK-1753 / 1773 / 1814] Update outdated docs for spark-submit, YARN, standalone etc.

YARN
- SparkPi was updated to not take in master as an argument; we should update the docs to reflect that.
- The default YARN build guide should be in maven, not sbt.
- This PR also adds a paragraph on steps to debug a YARN application.

Standalone
- Emphasize spark-submit more. Right now it's one small paragraph preceding the legacy way of launching through `org.apache.spark.deploy.Client`.
- The way we set configurations / environment variables according to the old docs is outdated. This needs to reflect changes introduced by the Spark configuration changes we made.

In general, this PR also adds a little more documentation on the new spark-shell, spark-submit, spark-defaults.conf etc here and there.

Author: Andrew Or <andrewor14@gmail.com>

Closes #701 from andrewor14/yarn-docs and squashes the following commits:

e2c2312 [Andrew Or] Merge in changes in #752 (SPARK-1814)
25cfe7b [Andrew Or] Merge in the warning from SPARK-1753
a8c39c5 [Andrew Or] Minor changes
336bbd9 [Andrew Or] Tabs -> spaces
4d9d8f7 [Andrew Or] Merge branch 'master' of github.com:apache/spark into yarn-docs
041017a [Andrew Or] Abstract Spark submit documentation to cluster-overview.html
3cc0649 [Andrew Or] Detail how to set configurations + remove legacy instructions
5b7140a [Andrew Or] Merge branch 'master' of github.com:apache/spark into yarn-docs
85a51fc [Andrew Or] Update run-example, spark-shell, configuration etc.
c10e8c7 [Andrew Or] Merge branch 'master' of github.com:apache/spark into yarn-docs
381fe32 [Andrew Or] Update docs for standalone mode
757c184 [Andrew Or] Add a note about the requirements for the debugging trick
f8ca990 [Andrew Or] Merge branch 'master' of github.com:apache/spark into yarn-docs
924f04c [Andrew Or] Revert addition of --deploy-mode
d5fe17b [Andrew Or] Update the YARN docs

(cherry picked from commit 2ffd1eafd28635dcecc0ac738d4a62c05d740925)
Signed-off-by: Patrick Wendell <pwendell@gmail.com>
author: Andrew Or <andrewor14@gmail.com> 2014-05-12 19:44:14 -0700
committer: Patrick Wendell <pwendell@gmail.com> 2014-05-12 19:44:32 -0700
commit: b9e41f4b8b52754ea059c3334da10dbfb4f41c17 (patch)
tree: ac6f5ef4546183079840a2c47d380f6e8cc70e59 /docs/index.md
parent: 5ef24a0c561d9bc58d1a35e471c38892fc6d3dff (diff)
1 files changed, 22 insertions, 12 deletions
diff --git a/docs/index.md b/docs/index.md
index a2f1a84371..48182a27d2 100644
--- a/docs/index.md
+++ b/docs/index.md
@@ -24,21 +24,31 @@ right version of Scala from [scala-lang.org](http://www.scala-lang.org/download/
 
 # Running the Examples and Shell
 
-Spark comes with several sample programs.  Scala, Java and Python examples are in the `examples/src/main` directory.
-To run one of the Java or Scala sample programs, use `./bin/run-example <class> <params>` in the top-level Spark directory
-(the `bin/run-example` script sets up the appropriate paths and launches that program).
-For example, try `./bin/run-example org.apache.spark.examples.SparkPi local`.
-To run a Python sample program, use `./bin/pyspark <sample-program> <params>`.  For example, try `./bin/pyspark ./examples/src/main/python/pi.py local`.
+Spark comes with several sample programs.  Scala, Java and Python examples are in the
+`examples/src/main` directory. To run one of the Java or Scala sample programs, use
+`bin/run-example <class> [params]` in the top-level Spark directory. (Behind the scenes, this
+invokes the more general
+[Spark submit script](cluster-overview.html#launching-applications-with-spark-submit) for
+launching applications). For example,
 
-Each example prints usage help when run with no parameters.
+    ./bin/run-example SparkPi 10
 
-Note that all of the sample programs take a `<master>` parameter specifying the cluster URL
-to connect to. This can be a [URL for a distributed cluster](scala-programming-guide.html#master-urls),
-or `local` to run locally with one thread, or `local[N]` to run locally with N threads. You should start by using
-`local` for testing.
+You can also run Spark interactively through modified versions of the Scala shell. This is a
+great way to learn the framework.
 
-Finally, you can run Spark interactively through modified versions of the Scala shell (`./bin/spark-shell`) or
-Python interpreter (`./bin/pyspark`). These are a great way to learn the framework.
+    ./bin/spark-shell --master local[2]
+
+The `--master` option specifies the
+[master URL for a distributed cluster](scala-programming-guide.html#master-urls), or `local` to run
+locally with one thread, or `local[N]` to run locally with N threads. You should start by using
+`local` for testing. For a full list of options, run Spark shell with the `--help` option.
+
+Spark also provides a Python interface. To run an example Spark application written in Python, use
+`bin/pyspark <program> [params]`. For example,
+
+    ./bin/pyspark examples/src/main/python/pi.py local[2] 10
+
+or simply `bin/pyspark` without any arguments to run Spark interactively in a python interpreter.
 
 # Launching on a Cluster
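
Reviewer note: the added `--master` paragraph distinguishes `local` (one thread) from `local[N]` (N threads) and from a cluster master URL. The sketch below only illustrates that reading of the text; it is not Spark's own parsing code. The function name `local_thread_count` is hypothetical, and `spark://host:7077` is a placeholder cluster URL used purely as an example of a non-local value.

```python
import re


def local_thread_count(master):
    """Return the thread count implied by a local master value.

    Hypothetical helper for illustration only; not part of Spark.
    Returns None when the value is not one of the two local forms
    described in the docs text (`local` and `local[N]`).
    """
    if master == "local":
        # Plain `local` means run locally with one thread.
        return 1
    match = re.fullmatch(r"local\[(\d+)\]", master)
    if match:
        # `local[N]` means run locally with N threads.
        return int(match.group(1))
    # Anything else is treated here as a cluster URL (assumption).
    return None


if __name__ == "__main__":
    for value in ["local", "local[2]", "spark://host:7077"]:
        print(value, "->", local_thread_count(value))
```

Under this reading, `./bin/spark-shell --master local[2]` from the diff corresponds to a local run with two threads.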