aboutsummaryrefslogtreecommitdiff
path: root/docs/running-on-yarn.md
Commit message (Collapse)AuthorAgeFilesLines
...
* Fix unclosed HTML tag in Yarn docs.Josh Rosen2014-08-261-1/+1
|
* SPARK-1680: use configs for specifying environment variables on YARNThomas Graves2014-08-051-5/+17
| | | | | | | | | | | | | | Note that this also documents spark.executorEnv.* which to me means its public. If we don't want that please speak up. Author: Thomas Graves <tgraves@apache.org> Closes #1512 from tgravescs/SPARK-1680 and squashes the following commits: 11525df [Thomas Graves] more doc changes 553bad0 [Thomas Graves] fix documentation 152bf7c [Thomas Graves] fix docs 5382326 [Thomas Graves] try fix docs 32f86a4 [Thomas Graves] use configs for specifying environment variables on YARN
* SPARK-1528 - spark on yarn, add support for accessing remote HDFSThomas Graves2014-08-051-0/+7
| | | | | | | | | | | Add a config (spark.yarn.access.namenodes) to allow applications running on yarn to access other secure HDFS cluster. User just specifies the namenodes of the other clusters and we get Tokens for those and ship them with the spark application. Author: Thomas Graves <tgraves@apache.org> Closes #1159 from tgravescs/spark-1528 and squashes the following commits: ddbcd16 [Thomas Graves] review comments 0ac8501 [Thomas Graves] SPARK-1528 - add support for accessing remote HDFS
* SPARK-2400 : fix spark.yarn.max.executor.failures explainationCrazyJvm2014-07-081-1/+1
| | | | | | | | | | | | | | | | | | | | | According to ```scala private val maxNumExecutorFailures = sparkConf.getInt("spark.yarn.max.executor.failures", sparkConf.getInt("spark.yarn.max.worker.failures", math.max(args.numExecutors * 2, 3))) ``` default value should be numExecutors * 2, with minimum of 3, and it's same to the config `spark.yarn.max.worker.failures` Author: CrazyJvm <crazyjvm@gmail.com> Closes #1282 from CrazyJvm/yarn-doc and squashes the following commits: 1a5f25b [CrazyJvm] remove deprecated config c438aec [CrazyJvm] fix style 86effa6 [CrazyJvm] change expression 211f130 [CrazyJvm] fix html tag 2900d23 [CrazyJvm] fix style a4b2e27 [CrazyJvm] fix configuration spark.yarn.max.executor.failures
* Fixed small running on YARN docs typoVlad2014-06-231-1/+1
| | | | | | | | | | The backslash is needed for multiline command Author: Vlad <frolvlad@gmail.com> Closes #1158 from frol/patch-1 and squashes the following commits: e258044 [Vlad] Fixed small running on YARN docs typo
* [SPARK-1395] Fix "local:" URI support in Yarn mode (again).Marcelo Vanzin2014-06-231-3/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Recent changes ignored the fact that path may be defined with "local:" URIs, which means they need to be explicitly added to the classpath everywhere a remote process is started. This change fixes that by: - Using the correct methods to add paths to the classpath - Creating SparkConf settings for the Spark jar itself and for the user's jar - Propagating those two settings to the remote processes where needed This ensures that both in client and in cluster mode, the driver has the necessary info to build the executor's classpath and have things still work when they contain "local:" references. The change also fixes some confusion in ClientBase about whether to use SparkConf or system properties to propagate config options to the driver and executors, by standardizing on using data held by SparkConf. On the cleanup front, I removed the hacky way that log4j configuration was being propagated to handle the "local:" case. It's much more cleanly (and generically) handled by using spark-submit arguments (--files to upload a config file, or setting spark.executor.extraJavaOptions to pass JVM arguments and use a local file). Author: Marcelo Vanzin <vanzin@cloudera.com> Closes #560 from vanzin/yarn-local-2 and squashes the following commits: 4e7f066 [Marcelo Vanzin] Correctly propagate SPARK_JAVA_OPTS to driver/executor. 6a454ea [Marcelo Vanzin] Use constants for PWD in test. 6dd5943 [Marcelo Vanzin] Fix propagation of config options to driver / executor. b2e377f [Marcelo Vanzin] Review feedback. 93c3f85 [Marcelo Vanzin] Fix ClassCastException in test. e5c682d [Marcelo Vanzin] Fix cluster mode, restore SPARK_LOG4J_CONF. 1dfbb40 [Marcelo Vanzin] Add documentation for spark.yarn.jar. bbdce05 [Marcelo Vanzin] [SPARK-1395] Fix "local:" URI support in Yarn mode (again).
* [SPARK-2051]In yarn.ClientBase spark.yarn.dist.* do not workwitgo2014-06-191-3/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | Author: witgo <witgo@qq.com> Closes #969 from witgo/yarn_ClientBase and squashes the following commits: 8117765 [witgo] review commit 3bdbc52 [witgo] Merge branch 'master' of https://github.com/apache/spark into yarn_ClientBase 5261b6c [witgo] fix sys.props.get("SPARK_YARN_DIST_FILES") e3c1107 [witgo] update docs b6a9aa1 [witgo] merge master c8b4554 [witgo] review commit 2f48789 [witgo] Merge branch 'master' of https://github.com/apache/spark into yarn_ClientBase 8d7b82f [witgo] Merge branch 'master' of https://github.com/apache/spark into yarn_ClientBase 1048549 [witgo] remove Utils.resolveURIs 871f1db [witgo] add spark.yarn.dist.* documentation 41bce59 [witgo] review commit 35d6fa0 [witgo] move to ClientArguments 55d72fc [witgo] Merge branch 'master' of https://github.com/apache/spark into yarn_ClientBase 9cdff16 [witgo] review commit 8bc2f4b [witgo] review commit 20e667c [witgo] Merge branch 'master' into yarn_ClientBase 0961151 [witgo] merge master ce609fc [witgo] Merge branch 'master' into yarn_ClientBase 8362489 [witgo] yarn.ClientBase spark.yarn.dist.* do not work
* [SPARK-1930] The Container is running beyond physical memory limits, so as ↵witgo2014-06-161-0/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | to be killed Author: witgo <witgo@qq.com> Closes #894 from witgo/SPARK-1930 and squashes the following commits: 564307e [witgo] Update the running-on-yarn.md 3747515 [witgo] Merge branch 'master' of https://github.com/apache/spark into SPARK-1930 172647b [witgo] add memoryOverhead docs a0ff545 [witgo] leaving only two configs a17bda2 [witgo] Merge branch 'master' of https://github.com/apache/spark into SPARK-1930 478ca15 [witgo] Merge branch 'master' into SPARK-1930 d1244a1 [witgo] Merge branch 'master' into SPARK-1930 8b967ae [witgo] Merge branch 'master' into SPARK-1930 655a820 [witgo] review commit 71859a7 [witgo] Merge branch 'master' of https://github.com/apache/spark into SPARK-1930 e3c531d [witgo] review commit e16f190 [witgo] different memoryOverhead ffa7569 [witgo] review commit 5c9581f [witgo] Merge branch 'master' into SPARK-1930 9a6bcf2 [witgo] review commit 8fae45a [witgo] fix NullPointerException e0dcc16 [witgo] Adding configuration items b6a989c [witgo] Fix container memory beyond limit, were killed
* [SPARK-1566] consolidate programming guide, and general doc updatesMatei Zaharia2014-05-301-26/+65
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a fairly large PR to clean up and update the docs for 1.0. The major changes are: * A unified programming guide for all languages replaces language-specific ones and shows language-specific info in tabs * New programming guide sections on key-value pairs, unit testing, input formats beyond text, migrating from 0.9, and passing functions to Spark * Spark-submit guide moved to a separate page and expanded slightly * Various cleanups of the menu system, security docs, and others * Updated look of title bar to differentiate the docs from previous Spark versions You can find the updated docs at http://people.apache.org/~matei/1.0-docs/_site/ and in particular http://people.apache.org/~matei/1.0-docs/_site/programming-guide.html. Author: Matei Zaharia <matei@databricks.com> Closes #896 from mateiz/1.0-docs and squashes the following commits: 03e6853 [Matei Zaharia] Some tweaks to configuration and YARN docs 0779508 [Matei Zaharia] tweak ef671d4 [Matei Zaharia] Keep frames in JavaDoc links, and other small tweaks 1bf4112 [Matei Zaharia] Review comments 4414f88 [Matei Zaharia] tweaks d04e979 [Matei Zaharia] Fix some old links to Java guide a34ed33 [Matei Zaharia] tweak 541bb3b [Matei Zaharia] miscellaneous changes fcefdec [Matei Zaharia] Moved submitting apps to separate doc 61d72b4 [Matei Zaharia] stuff 181f217 [Matei Zaharia] migration guide, remove old language guides e11a0da [Matei Zaharia] Add more API functions 6a030a9 [Matei Zaharia] tweaks 8db0ae3 [Matei Zaharia] Added key-value pairs section 318d2c9 [Matei Zaharia] tweaks 1c81477 [Matei Zaharia] New section on basics and function syntax e38f559 [Matei Zaharia] Actually added programming guide to Git a33d6fe [Matei Zaharia] First pass at updating programming guide to support all languages, plus other tweaks throughout 3b6a876 [Matei Zaharia] More CSS tweaks 01ec8bf [Matei Zaharia] More CSS tweaks e6d252e [Matei Zaharia] Change color of doc title bar to differentiate from 0.9.0
* [SPARK-1753 / 1773 / 1814] Update outdated docs for spark-submit, YARN, ↵Andrew Or2014-05-121-4/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | standalone etc. YARN - SparkPi was updated to not take in master as an argument; we should update the docs to reflect that. - The default YARN build guide should be in maven, not sbt. - This PR also adds a paragraph on steps to debug a YARN application. Standalone - Emphasize spark-submit more. Right now it's one small paragraph preceding the legacy way of launching through `org.apache.spark.deploy.Client`. - The way we set configurations / environment variables according to the old docs is outdated. This needs to reflect changes introduced by the Spark configuration changes we made. In general, this PR also adds a little more documentation on the new spark-shell, spark-submit, spark-defaults.conf etc here and there. Author: Andrew Or <andrewor14@gmail.com> Closes #701 from andrewor14/yarn-docs and squashes the following commits: e2c2312 [Andrew Or] Merge in changes in #752 (SPARK-1814) 25cfe7b [Andrew Or] Merge in the warning from SPARK-1753 a8c39c5 [Andrew Or] Minor changes 336bbd9 [Andrew Or] Tabs -> spaces 4d9d8f7 [Andrew Or] Merge branch 'master' of github.com:apache/spark into yarn-docs 041017a [Andrew Or] Abstract Spark submit documentation to cluster-overview.html 3cc0649 [Andrew Or] Detail how to set configurations + remove legacy instructions 5b7140a [Andrew Or] Merge branch 'master' of github.com:apache/spark into yarn-docs 85a51fc [Andrew Or] Update run-example, spark-shell, configuration etc. c10e8c7 [Andrew Or] Merge branch 'master' of github.com:apache/spark into yarn-docs 381fe32 [Andrew Or] Update docs for standalone mode 757c184 [Andrew Or] Add a note about the requirements for the debugging trick f8ca990 [Andrew Or] Merge branch 'master' of github.com:apache/spark into yarn-docs 924f04c [Andrew Or] Revert addition of --deploy-mode d5fe17b [Andrew Or] Update the YARN docs
* SPARK-1565 (Addendum): Replace `run-example` with `spark-submit`.Patrick Wendell2014-05-081-1/+1
| | | | | | | | | | | | | Gives a nicely formatted message to the user when `run-example` is run to tell them to use `spark-submit`. Author: Patrick Wendell <pwendell@gmail.com> Closes #704 from pwendell/examples and squashes the following commits: 1996ee8 [Patrick Wendell] Feedback form Andrew 3eb7803 [Patrick Wendell] Suggestions from TD 2474668 [Patrick Wendell] SPARK-1565 (Addendum): Replace `run-example` with `spark-submit`.
* SPARK-1492. Update Spark YARN docs to use spark-submitSandy Ryza2014-05-021-87/+30
| | | | | | | | | | Author: Sandy Ryza <sandy@cloudera.com> Closes #601 from sryza/sandy-spark-1492 and squashes the following commits: 5df1634 [Sandy Ryza] Address additional comments from Patrick. be46d1f [Sandy Ryza] Address feedback from Marcelo and Patrick 867a3ea [Sandy Ryza] SPARK-1492. Update Spark YARN docs to use spark-submit
* SPARK-1408 Modify Spark on Yarn to point to the history server when app ...Thomas Graves2014-04-171-0/+1
| | | | | | | | | | | | | | | | ...finishes Note this is dependent on https://github.com/apache/spark/pull/204 to have a working history server, but there are no code dependencies. This also fixes SPARK-1288 yarn stable finishApplicationMaster incomplete. Since I was in there I made the diagnostic message be passed properly. Author: Thomas Graves <tgraves@apache.org> Closes #362 from tgravescs/SPARK-1408 and squashes the following commits: ec89705 [Thomas Graves] Fix typo. 446122d [Thomas Graves] Make config yarn specific f5d5373 [Thomas Graves] SPARK-1408 Modify Spark on Yarn to point to the history server when app finishes
* SPARK-1376. In the yarn-cluster submitter, rename "args" option to "arg"Sandy Ryza2014-04-011-3/+4
| | | | | | | | Author: Sandy Ryza <sandy@cloudera.com> Closes #279 from sryza/sandy-spark-1376 and squashes the following commits: d8aebfa [Sandy Ryza] SPARK-1376. In the yarn-cluster submitter, rename "args" option to "arg"
* SPARK-1126. spark-app preliminarySandy Ryza2014-03-291-2/+4
| | | | | | | | | | | | | | | | | This is a starting version of the spark-app script for running compiled binaries against Spark. It still needs tests and some polish. The only testing I've done so far has been using it to launch jobs in yarn-standalone mode against a pseudo-distributed cluster. This leaves out the changes required for launching python scripts. I think it might be best to save those for another JIRA/PR (while keeping to the design so that they won't require backwards-incompatible changes). Author: Sandy Ryza <sandy@cloudera.com> Closes #86 from sryza/sandy-spark-1126 and squashes the following commits: d428d85 [Sandy Ryza] Commenting, doc, and import fixes from Patrick's comments e7315c6 [Sandy Ryza] Fix failing tests 34de899 [Sandy Ryza] Change --more-jars to --jars and fix docs 299ddca [Sandy Ryza] Fix scalastyle a94c627 [Sandy Ryza] Add newline at end of SparkSubmit 04bc4e2 [Sandy Ryza] SPARK-1126. spark-submit script
* SPARK-1183. Don't use "worker" to mean executorSandy Ryza2014-03-131-15/+14
| | | | | | | | | | | | Author: Sandy Ryza <sandy@cloudera.com> Closes #120 from sryza/sandy-spark-1183 and squashes the following commits: 5066a4a [Sandy Ryza] Remove "worker" in a couple comments 0bd1e46 [Sandy Ryza] Remove --am-class from usage bfc8fe0 [Sandy Ryza] Remove am-class from doc and fix yarn-alpha 607539f [Sandy Ryza] Address review comments 74d087a [Sandy Ryza] SPARK-1183. Don't use "worker" to mean executor
* SPARK-1197. Change yarn-standalone to yarn-cluster and fix up running on ↵Sandy Ryza2014-03-061-29/+36
| | | | | | | | | | | | | YARN docs This patch changes "yarn-standalone" to "yarn-cluster" (but still supports the former). It also cleans up the Running on YARN docs and adds a section on how to view logs. Author: Sandy Ryza <sandy@cloudera.com> Closes #95 from sryza/sandy-spark-1197 and squashes the following commits: 563ef3a [Sandy Ryza] Review feedback 6ad06d4 [Sandy Ryza] Change yarn-standalone to yarn-cluster and fix up running on YARN docs
* SPARK-1053. Don't require SPARK_YARN_APP_JARSandy Ryza2014-02-261-4/+2
| | | | | | | | | | | | It looks this just requires taking out the checks. I verified that, with the patch, I was able to run spark-shell through yarn without setting the environment variable. Author: Sandy Ryza <sandy@cloudera.com> Closes #553 from sryza/sandy-spark-1053 and squashes the following commits: b037676 [Sandy Ryza] SPARK-1053. Don't require SPARK_YARN_APP_JAR
* [SPARK-1105] fix site scala version error in docsCodingCat2014-02-191-8/+8
| | | | | | | | | | | | | https://spark-project.atlassian.net/browse/SPARK-1105 fix site scala version error Author: CodingCat <zhunansjtu@gmail.com> Closes #618 from CodingCat/doc_version and squashes the following commits: 39bb8aa [CodingCat] more fixes 65bedb0 [CodingCat] fix site scala version error in doc
* Incorporate Tom's comments - update doc and code to reflect that core ↵Sandy Ryza2014-01-211-1/+1
| | | | requests may not always be honored
* SPARK-1033. Ask for cores in Yarn container requestsSandy Ryza2014-01-201-1/+1
|
* yarn-client addJar fix and misc otherThomas Graves2014-01-091-2/+13
|
* Merge pull request #345 from colorant/yarnThomas Graves2014-01-081-0/+2
|\ | | | | | | | | | | support distributing extra files to worker for yarn client mode So that user doesn't need to package all dependency into one assemble jar as spark app jar
| * Export --file for YarnClient mode to support sending extra files to worker ↵Raymond Liu2014-01-071-0/+2
| | | | | | | | on yarn cluster
* | Code review feedbackHolden Karau2014-01-051-3/+3
|/
* Merge remote-tracking branch 'apache-github/master' into remove-binariesPatrick Wendell2014-01-031-7/+4
|\ | | | | | | | | | | Conflicts: core/src/test/scala/org/apache/spark/DriverSuite.scala docs/python-programming-guide.md
| * Merge pull request #317 from ScrapCodes/spark-915-segregate-scriptsPatrick Wendell2014-01-031-4/+4
| |\ | | | | | | | | | Spark-915 segregate scripts
| | * sbin/spark-class* -> bin/spark-class*Prashant Sharma2014-01-031-2/+2
| | |
| | * run-example -> bin/run-examplePrashant Sharma2014-01-021-1/+1
| | |
| | * spark-shell -> bin/spark-shellPrashant Sharma2014-01-021-1/+1
| | |
| | * Merge branch 'scripts-reorg' of github.com:shane-huang/incubator-spark into ↵Prashant Sharma2014-01-021-2/+2
| | |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | spark-915-segregate-scripts Conflicts: bin/spark-shell core/pom.xml core/src/main/scala/org/apache/spark/SparkContext.scala core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/CoarseMesosSchedulerBackend.scala core/src/main/scala/org/apache/spark/ui/UIWorkloadGenerator.scala core/src/test/scala/org/apache/spark/DriverSuite.scala python/run-tests sbin/compute-classpath.sh sbin/spark-class sbin/stop-slaves.sh
| | | * added spark-class and spark-executor to sbinshane-huang2013-09-231-2/+2
| | | | | | | | | | | | | | | | Signed-off-by: shane-huang <shengsheng.huang@intel.com>
| * | | fix docs for yarnRaymond Liu2014-01-031-3/+0
| | | |
| * | | Update maven build documentationRaymond Liu2014-01-031-1/+1
| | | |
| * | | Fix yarn/README.md and update docs/running-on-yarn.mdRaymond Liu2014-01-031-1/+1
| |/ /
* | | fixed review commentsPrashant Sharma2014-01-031-2/+2
| | |
* | | Removed sbt folder and changed docs accordinglyPrashant Sharma2014-01-021-3/+3
|/ /
* | Fix list rendering in YARN markdown docs.Patrick Wendell2013-12-101-5/+7
| |
* | Various broken links in documentationPatrick Wendell2013-12-071-2/+2
| |
* | more docsAli Ghodsi2013-12-061-1/+1
| |
* | Updated documentation about the YARN v2.2 build processAli Ghodsi2013-12-061-0/+8
| |
* | Add YarnClientClusterScheduler and Backend.Raymond Liu2013-11-221-2/+25
| | | | | | | | | | | | | | With this scheduler, the user application is launched locally, While the executor will be launched by YARN on remote nodes. This enables spark-shell to run upon YARN.
* | Impove Spark on Yarn Error handlingtgravescs2013-11-191-0/+2
| |
* | Allow spark on yarn to be run from HDFS. Allows the spark.jar, app.jar, and ↵tgravescs2013-11-041-0/+1
| | | | | | | | log4j.properties to be put into hdfs.
* | Merge remote-tracking branch 'tgravescs/sparkYarnDistCache'Matei Zaharia2013-10-101-1/+8
|\ \ | | | | | | | | | | | | | | | | | | | | | Closes #11 Conflicts: docs/running-on-yarn.md yarn/src/main/scala/org/apache/spark/deploy/yarn/ClientArguments.scala
| * | Adding in the --addJars option to make SparkContext.addJar work on yarn and ↵tgravescs2013-10-031-0/+2
| | | | | | | | | | | | | | | | | | cleanup the classpaths
| * | Support distributed cache files and archives on spark on yarn and attempt to ↵Y.CORP.YAHOO.COM\tgraves2013-09-231-1/+6
| |/ | | | | | | cleanup the staging directory on exit
* / Allow users to set the application name for Spark on Yarntgravescs2013-10-021-0/+1
|/
* Clarify YARN exampleJey Kottalam2013-09-061-9/+22
|
* Review comment changes and update to org.apache packagingY.CORP.YAHOO.COM\tgraves2013-09-031-3/+3
|