spark - Mirror of Apache Spark

	Commit message (Collapse)	Author	Age	Files	Lines
*	[SPARK-1753 / 1773 / 1814] Update outdated docs for spark-submit, YARN, standalone etc.	Andrew Or	2014-05-12	1	-2/+2
	YARN
	- SparkPi was updated to not take in master as an argument; we should update the docs to reflect that.
	- The default YARN build guide should be in maven, not sbt.
	- This PR also adds a paragraph on steps to debug a YARN application.
	Standalone
	- Emphasize spark-submit more. Right now it's one small paragraph preceding the legacy way of launching through `org.apache.spark.deploy.Client`.
	- The way we set configurations / environment variables according to the old docs is outdated. This needs to reflect changes introduced by the Spark configuration changes we made.
	In general, this PR also adds a little more documentation on the new spark-shell, spark-submit, spark-defaults.conf etc here and there.
	Author: Andrew Or <andrewor14@gmail.com>
	Closes #701 from andrewor14/yarn-docs and squashes the following commits:
	e2c2312 [Andrew Or] Merge in changes in #752 (SPARK-1814)
	25cfe7b [Andrew Or] Merge in the warning from SPARK-1753
	a8c39c5 [Andrew Or] Minor changes
	336bbd9 [Andrew Or] Tabs -> spaces
	4d9d8f7 [Andrew Or] Merge branch 'master' of github.com:apache/spark into yarn-docs
	041017a [Andrew Or] Abstract Spark submit documentation to cluster-overview.html
	3cc0649 [Andrew Or] Detail how to set configurations + remove legacy instructions
	5b7140a [Andrew Or] Merge branch 'master' of github.com:apache/spark into yarn-docs
	85a51fc [Andrew Or] Update run-example, spark-shell, configuration etc.
	c10e8c7 [Andrew Or] Merge branch 'master' of github.com:apache/spark into yarn-docs
	381fe32 [Andrew Or] Update docs for standalone mode
	757c184 [Andrew Or] Add a note about the requirements for the debugging trick
	f8ca990 [Andrew Or] Merge branch 'master' of github.com:apache/spark into yarn-docs
	924f04c [Andrew Or] Revert addition of --deploy-mode
	d5fe17b [Andrew Or] Update the YARN docs
*	[SPARK-1780] Non-existent SPARK_DAEMON_OPTS is lurking around	Andrew Or	2014-05-12	1	-1/+1
	What they really mean is SPARK_DAEMON_*JAVA*_OPTS
	Author: Andrew Or <andrewor14@gmail.com>
	Closes #751 from andrewor14/spark-daemon-opts and squashes the following commits:
	70c41f9 [Andrew Or] SPARK_DAEMON_OPTS -> SPARK_DAEMON_JAVA_OPTS
*	Assorted clean-up for Spark-on-YARN.	Patrick Wendell	2014-04-22	1	-0/+2
	In particular, when HADOOP_CONF_DIR is not specified.
	Author: Patrick Wendell <pwendell@gmail.com>
	Closes #488 from pwendell/hadoop-cleanup and squashes the following commits:
	fe95f13 [Patrick Wendell] Changes based on Andrew's feedback
	18d09c1 [Patrick Wendell] Review comments from Andrew
	17929cc [Patrick Wendell] Assorted clean-up for Spark-on-YARN.
*	Clean up and simplify Spark configuration	Patrick Wendell	2014-04-21	1	-12/+31
	Over time, as we've added more deployment modes, the user-facing configuration options in Spark have gotten a bit unwieldy. Going forward we'll advise all users to run `spark-submit` to launch applications. This is a WIP patch but it makes the following improvements:
	1. Improved `spark-env.sh.template`, which was missing a lot of things users now set in that file.
	2. Removes the shipping of SPARK_CLASSPATH, SPARK_JAVA_OPTS, and SPARK_LIBRARY_PATH to the executors on the cluster. This was an ugly hack. Instead it introduces config variables spark.executor.extraJavaOptions, spark.executor.extraLibraryPath, and spark.executor.extraClassPath.
	3. Adds the ability to set these same variables for the driver using `spark-submit`.
	4. Allows you to load system properties from a `spark-defaults.conf` file when running `spark-submit`. This will allow setting both SparkConf options and other system properties utilized by `spark-submit`.
	5. Made `SPARK_LOCAL_IP` an environment variable rather than a SparkConf property. This is more consistent with it being set on each node.
	Author: Patrick Wendell <pwendell@gmail.com>
	Closes #299 from pwendell/config-cleanup and squashes the following commits:
	127f301 [Patrick Wendell] Improvements to testing
	a006464 [Patrick Wendell] Moving properties file template.
	b4b496c [Patrick Wendell] spark-defaults.properties -> spark-defaults.conf
	0086939 [Patrick Wendell] Minor style fixes
	af09e3e [Patrick Wendell] Mention config file in docs and clean-up docs
	b16e6a2 [Patrick Wendell] Cleanup of spark-submit script and Scala quick start guide
	af0adf7 [Patrick Wendell] Automatically add user jar
	a56b125 [Patrick Wendell] Responses to Tom's review
	d50c388 [Patrick Wendell] Merge remote-tracking branch 'apache/master' into config-cleanup
	a762901 [Patrick Wendell] Fixing test failures
	ffa00fe [Patrick Wendell] Review feedback
	fda0301 [Patrick Wendell] Note
	308f1f6 [Patrick Wendell] Properly escape quotes and other clean-up for YARN
	e83cd8f [Patrick Wendell] Changes to allow re-use of test applications
	be42f35 [Patrick Wendell] Handle case where SPARK_HOME is not set
	c2a2909 [Patrick Wendell] Test compile fixes
	4ee6f9d [Patrick Wendell] Making YARN doc changes consistent
	afc9ed8 [Patrick Wendell] Cleaning up line limits and two compile errors.
	b08893b [Patrick Wendell] Additional improvements.
	ace4ead [Patrick Wendell] Responses to review feedback.
	b72d183 [Patrick Wendell] Review feedback for spark env file
	46555c1 [Patrick Wendell] Review feedback and import clean-ups
	437aed1 [Patrick Wendell] Small fix
	761ebcd [Patrick Wendell] Library path and classpath for drivers
	7cc70e4 [Patrick Wendell] Clean up terminology inside of spark-env script
	5b0ba8e [Patrick Wendell] Don't ship executor envs
	84cc5e5 [Patrick Wendell] Small clean-up
	1f75238 [Patrick Wendell] SPARK_JAVA_OPTS --> SPARK_MASTER_OPTS for master settings
	4982331 [Patrick Wendell] Remove SPARK_LIBRARY_PATH
	6eaf7d0 [Patrick Wendell] executorJavaOpts
	0faa3b6 [Patrick Wendell] Stash of adding config options in submit script and YARN
	ac2d65e [Patrick Wendell] Change spark.local.dir -> SPARK_LOCAL_DIRS
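To make the `spark-defaults.conf` idea in the commit above more concrete, here is a minimal Python sketch of reading such a file into a dictionary. It is not how `spark-submit` itself loads the file; it only assumes the common layout of one property per line, key and value separated by whitespace, with `#` starting a comment line. The function name `parse_spark_defaults` and the sample paths are hypothetical and exist only for illustration.

```python
def parse_spark_defaults(text):
    """Parse 'key value' lines into a dict, skipping blank lines and # comments.

    Illustrative only: assumes whitespace-separated pairs, one per line.
    """
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first run of whitespace; the rest of the line is the value.
        parts = line.split(None, 1)
        key = parts[0]
        value = parts[1].strip() if len(parts) > 1 else ""
        props[key] = value
    return props


# Hypothetical file contents; the paths are placeholders, not recommendations.
SAMPLE = """\
# Example defaults (illustration only)
spark.executor.extraClassPath    /opt/libs/extra.jar
spark.executor.extraLibraryPath  /opt/native
"""

if __name__ == "__main__":
    for key, value in sorted(parse_spark_defaults(SAMPLE).items()):
        print(key, "=", value)
```

Keeping these settings in a file rather than in environment variables such as SPARK_CLASSPATH matches the intent stated in the commit message: the properties are read when the application is submitted instead of being shipped implicitly to every executor.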
*	Revert "[SPARK-1150] fix repo location in create script"	Patrick Wendell	2014-03-01	1	-1/+1
	This reverts commit 9aa095711858ce8670e51488f66a3d7c1a821c30.
*	[SPARK-1150] fix repo location in create script	Mark Grover	2014-03-01	1	-1/+1
	https://spark-project.atlassian.net/browse/SPARK-1150
	fix the repo location in create_release script
	Author: Mark Grover <mark@apache.org>
	Closes #48 from CodingCat/script_fixes and squashes the following commits:
	01f4bf7 [Mark Grover] Fixing some nitpicks
	d2244d4 [Mark Grover] SPARK-676: Abbreviation in SPARK_MEM but not in SPARK_WORKER_MEMORY
*	[SPARK-1041] remove dead code in start script, remind user to set that in spark-env.sh	CodingCat	2014-02-22	1	-0/+1
	The lines in start-master.sh and start-slave.sh no longer work in EC2. The host name has changed, e.g.
	ubuntu@ip-172-31-36-93:~$ hostname
	ip-172-31-36-93
	Also, the URL to fetch the public DNS name has changed, e.g.
	ubuntu@ip-172-31-36-93:~$ wget -q -O - http://instance-data.ec2.internal/latest/meta-data/public-hostname
	ubuntu@ip-172-31-36-93:~$ (returns nothing)
	Since we have the spark-ec2 project, we don't need such EC2-specific lines here; instead, the user only needs to set it in spark-env.sh.
	Author: CodingCat <zhunansjtu@gmail.com>
	Closes #588 from CodingCat/deadcode_in_sbin and squashes the following commits:
	e4236e0 [CodingCat] remove dead code in start script, remind user set that in spark-env.sh
*	add the comments about SPARK_WORKER_DIR	CodingCat	2014-01-07	1	-1/+1
	this env variable seems to be forgotten …
*	Another fix suggested by Patrick	Matei Zaharia	2013-08-31	1	-1/+1
*	Fixes suggested by Patrick	Matei Zaharia	2013-08-31	1	-1/+1
*	More updates, describing changes to recommended use of environment vars	Matei Zaharia	2013-08-31	1	-13/+10
	and new Python stuff
*	add comment in spark-env.sh.template for SPARK_JAVA_OPTS	shane-huang	2013-08-09	1	-0/+5
	Signed-off-by: shane-huang <shengsheng.huang@intel.com>
*	Update docs on SCALA_LIBRARY_PATH	Matei Zaharia	2013-06-30	1	-12/+6
*	added SPARK_WORKER_INSTANCES: allows spawning multiple worker instances/processes on every slave machine	kalpit	2013-03-26	1	-0/+1
*	Document how to configure SPARK_MEM & co on a per-job basis	Matei Zaharia	2012-10-13	1	-13/+16
*	Settings variables and bugfix for stop script.	Denny	2012-08-02	1	-1/+9
*	Spark standalone mode cluster scripts.	Denny	2012-08-01	1	-1/+1
	Heavily inspired by Hadoop cluster scripts ;-)
*	Further fixes to how Mesos is found and used	Matei Zaharia	2012-03-17	1	-1/+1
*	Undid some changes that Mosharaf inadvertently committed to master.	Matei Zaharia	2010-10-19	1	-1/+1
*	Merge branch 'master' of git@github.com:mesos/spark	Mosharaf Chowdhury	2010-10-18	1	-1/+1
	Conflicts: src/scala/spark/SparkContext.scala
	Using the latest one from Matei.
*	Changed the config files that were included in git to templates which	Matei Zaharia	2010-10-16	1	-0/+13
	are used to create an initial copy of each config file if the user does not have one. This way, users won't accidentally commit their changes to config files to git.
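
The template mechanism described in this commit can be sketched in a few lines of Python. This is an assumption-laden illustration, not the script used in the repository: the function name `ensure_config_files` and the `.template` suffix handling are hypothetical, and the only behaviour taken from the commit message is that a template seeds a config file when the user does not already have one.

```python
import shutil
from pathlib import Path

# Suffix assumed for template files, e.g. "spark-env.sh.template".
TEMPLATE_SUFFIX = ".template"


def ensure_config_files(conf_dir):
    """Copy each *.template file to its real name unless that file exists.

    Returns the names of the files that were created, in sorted order.
    Existing config files are never overwritten, so local edits survive.
    """
    created = []
    for template in sorted(Path(conf_dir).glob("*" + TEMPLATE_SUFFIX)):
        # Strip the suffix: "spark-env.sh.template" -> "spark-env.sh".
        target = template.with_name(template.name[: -len(TEMPLATE_SUFFIX)])
        if not target.exists():
            shutil.copyfile(template, target)
            created.append(target.name)
    return created
```

Because only the `.template` files are tracked in git, the generated copies can be edited freely without showing up as changes to commit, which is the goal the commit message states.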