spark - Mirror of Apache Spark

	Commit message	Author	Age	Files	Lines
*	SPARK-929: Fully deprecate usage of SPARK_MEM	Aaron Davidson	2014-03-09	3	-46/+77
\|	(Continued from old repo, prior discussion at https://github.com/apache/incubator-spark/pull/615) This patch cements our deprecation of the SPARK_MEM environment variable by replacing it with three more specialized variables: SPARK_DAEMON_MEMORY, SPARK_EXECUTOR_MEMORY, and SPARK_DRIVER_MEMORY. The creation of the latter two variables means that we can safely set driver/job memory without accidentally setting the executor memory. Neither is public. SPARK_EXECUTOR_MEMORY is only used by the Mesos scheduler (and set within SparkContext). The proper way of configuring executor memory is through the "spark.executor.memory" property. SPARK_DRIVER_MEMORY is the new way of specifying the amount of memory used by jobs launched by spark-class, without possibly affecting executor memory. Other memory considerations: - The repl's memory can be set through the "--drivermem" command-line option, which really just sets SPARK_DRIVER_MEMORY. - run-example doesn't use spark-class, so the only way to modify examples' memory is actually an unusual use of SPARK_JAVA_OPTS (which is normally overridden in all cases by spark-class). This patch also fixes a lurking bug where spark-shell misused spark-class (the first argument is supposed to be the main class name, not java options), as well as a bug in the Windows spark-class2.cmd. I have not yet tested this patch on either Windows or Mesos, however. Author: Aaron Davidson <aaron@databricks.com> Closes #99 from aarondav/sparkmem and squashes the following commits: 9df4c68 [Aaron Davidson] SPARK-929: Fully deprecate usage of SPARK_MEM
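The split into role-specific variables described in the message above can be illustrated with a small shell sketch. This is not the code in bin/spark-class: the function name `memory_for_role`, the role labels, and the 512m fallback are assumptions made for illustration. Only the three variable names come from the commit message.

```bash
# Hypothetical sketch: pick a heap size by role without ever reading SPARK_MEM.
# The role labels and the 512m fallback are illustrative assumptions.
memory_for_role() (
  default_mem="512m"
  case "$1" in
    daemon)   echo "${SPARK_DAEMON_MEMORY:-$default_mem}" ;;
    executor) echo "${SPARK_EXECUTOR_MEMORY:-$default_mem}" ;;
    driver)   echo "${SPARK_DRIVER_MEMORY:-$default_mem}" ;;
    *)        echo "unknown role: $1" >&2; exit 1 ;;
  esac
)
```

A launcher written along these lines could pass the result to the JVM, for example as `-Xmx$(memory_for_role driver)`, so that setting the driver's memory never changes the executor's.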
*	[SPARK-1090] improvement on spark_shell (help information, configure memory)	CodingCat	2014-02-17	1	-6/+42
\|	https://spark-project.atlassian.net/browse/SPARK-1090 spark-shell should print help information about its parameters and should allow the user to configure executor memory. There is no documentation about how to set --cores/-c in spark-shell, and users should also be able to set executor memory through command-line options. In this PR I also check the format of the options passed by the user. Author: CodingCat <zhunansjtu@gmail.com> Closes #599 from CodingCat/spark_shell_improve and squashes the following commits: de5aa38 [CodingCat] add parameter to set driver memory 915cbf8 [CodingCat] improvement on spark_shell (help information, configure memory)
*	Merge pull request #534 from sslavic/patch-1. Closes #534.	Stevo Slavić	2014-02-04	1	-1/+1
\|	Fixed wrong path to compute-classpath.cmd. compute-classpath.cmd is in bin, not in the sbin directory. Author: Stevo Slavić <sslavic@gmail.com> == Merge branch commits == commit 23deca32b69e9429b33ad31d35b7e1bfc9459f59 Author: Stevo Slavić <sslavic@gmail.com> Date: Tue Feb 4 15:01:47 2014 +0100 Fixed wrong path to compute-classpath.cmd. compute-classpath.cmd is in bin, not in the sbin directory.
*	Merge pull request #484 from tdas/run-example-fix	Patrick Wendell	2014-01-20	1	-2/+11
\|\	Made run-example respect SPARK_JAVA_OPTS and SPARK_MEM. The bin/run-example script was not passing Java properties set through SPARK_JAVA_OPTS to the example. This is important for examples like Twitter**, as the Twitter authentication information must be set through Java properties. Hence added the same JAVA_OPTS code to run-example as is in the bin/spark-class script. Also added SPARK_MEM, in case someone wants to run the example with different amounts of memory. This can be removed if it is not in tune with the intended semantics of the run-example scripts. @matei Please check this soon; I want this to go in 0.9-rc4.
\| *	Removed SPARK_MEM from run-examples.	Tathagata Das	2014-01-20	1	-5/+0
\| \|
\| *	Made run-example respect SPARK_JAVA_OPTS and SPARK_MEM.	Tathagata Das	2014-01-20	1	-2/+16
\| \|
* \|	Merge pull request #449 from CrazyJvm/master	Reynold Xin	2014-01-20	1	-3/+8
\|\ \ \| \|/ \|/\|	SPARK-1028 : fix "set MASTER automatically fails" bug. spark-shell intends to set MASTER automatically if we do not provide the option when we start the shell, but there's a problem. The condition is "if [[ "x" != "x$SPARK_MASTER_IP" && "y" != "y$SPARK_MASTER_PORT" ]];". We will certainly set SPARK_MASTER_IP explicitly; SPARK_MASTER_PORT, however, we will probably leave unset and just use Spark's default port 7077. So if we do not set SPARK_MASTER_PORT, the condition will never be true. I think we should just use the default port if users do not set the port explicitly.
\| *	fix some format problem.	CrazyJvm	2014-01-16	1	-2/+2
\| \|
\| *	fix "set MASTER automatically fails" bug.	CrazyJvm	2014-01-16	1	-3/+8
\| \|	spark-shell intends to set MASTER automatically if we do not provide the option when we start the shell, but there's a problem. The condition is "if [[ "x" != "x$SPARK_MASTER_IP" && "y" != "y$SPARK_MASTER_PORT" ]];". We will certainly set SPARK_MASTER_IP explicitly; SPARK_MASTER_PORT, however, we will probably leave unset and just use Spark's default port 7077. So if we do not set SPARK_MASTER_PORT, the condition will never be true. I think we should just use the default port if users do not set the port explicitly.
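The behaviour this commit message asks for, falling back to the default port when SPARK_MASTER_PORT is unset, might look roughly like the sketch below. It is an illustration only, not the actual change to bin/spark-shell: the function name `resolve_master` is hypothetical, and the assumption that the script builds a `spark://host:port` URL is ours. The variable names and the default port 7077 come from the message.

```bash
# Hypothetical helper: derive a master URL, defaulting the port to 7077.
resolve_master() (
  ip="${SPARK_MASTER_IP:-}"
  port="${SPARK_MASTER_PORT:-7077}"   # default port named in the commit message
  if [ -n "${MASTER:-}" ]; then
    # An explicitly provided MASTER always wins.
    echo "$MASTER"
  elif [ -n "$ip" ]; then
    # Only the IP is required; the port falls back to the default.
    echo "spark://${ip}:${port}"
  fi
)
```

Unlike the original condition, this version only requires SPARK_MASTER_IP to be set, so leaving the port unset no longer disables the automatic behaviour.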
* \|	Fixed Window spark shell launch script error.	Qiuzhuang Lian	2014-01-16	2	-3/+3
\|/ \| \| \|	JIRA SPARK-1029: https://spark-project.atlassian.net/browse/SPARK-1029
*	Merge branch 'master' into graphx	Reynold Xin	2014-01-13	1	-6/+1
\|\
\| *	Merge pull request #353 from pwendell/ipython-simplify	Patrick Wendell	2014-01-09	1	-6/+1
\| \|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Simplify and fix pyspark script. This patch removes compatibility for IPython < 1.0 but fixes the launch script and makes it much simpler. I tested this using the three commands in the PySpark documentation page: 1. IPYTHON=1 ./pyspark 2. IPYTHON_OPTS="notebook" ./pyspark 3. IPYTHON_OPTS="notebook --pylab inline" ./pyspark There are two changes: - We rely on PYTHONSTARTUP env var to start PySpark - Removed the quotes around $IPYTHON_OPTS... having quotes gloms them together as a single argument passed to `exec` which seemed to cause ipython to fail (it instead expects them as multiple arguments).
\| \| *	Small fix suggested by josh	Patrick Wendell	2014-01-09	1	-0/+1
\| \| \|
\| \| *	Simplify and fix pyspark script.	Patrick Wendell	2014-01-07	1	-7/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch removes compatibility for IPython < 1.0 but fixes the launch script and makes it much simpler. I tested this using the three commands in the PySpark documentation page: 1. IPYTHON=1 ./pyspark 2. IPYTHON_OPTS="notebook" ./pyspark 3. IPYTHON_OPTS="notebook --pylab inline" ./pyspark There are two changes: - We rely on PYTHONSTARTUP env var to start PySpark - Removed the quotes around $IPYTHON_OPTS... having quotes gloms them together as a single argument passed to `exec` which seemed to cause ipython to fail (it instead expects them as multiple arguments).
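The quoting issue mentioned in this message can be reproduced in isolation. The sketch below is not taken from the pyspark script: `count_args` is a hypothetical helper that stands in for the program being launched, and it only shows how quoting changes the number of arguments the shell passes.

```bash
# Hypothetical helper that reports how many arguments it received.
count_args() {
  echo "$#"
}

IPYTHON_OPTS="notebook --pylab inline"

# Quoted: the whole string is passed as a single argument.
quoted=$(count_args "$IPYTHON_OPTS")

# Unquoted: the shell splits the value on whitespace into separate arguments.
unquoted=$(count_args $IPYTHON_OPTS)

echo "quoted=$quoted unquoted=$unquoted"
```

With the default field separator, the quoted call receives one argument and the unquoted call receives three, which matches the message's observation that the launched program expects the options as multiple arguments.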
* \| \|	Finish d1d2b6d9b6b5f9cc45047507368a816903722d9e	Ankur Dave	2014-01-10	1	-1/+0
\| \| \|
* \| \|	graph -> graphx in bin/compute-classpath.sh	Ankur Dave	2014-01-09	1	-2/+2
\| \| \|
* \| \|	Merge remote-tracking branch 'spark-upstream/master' into HEAD	Ankur Dave	2014-01-08	24	-612/+709
\|\\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Conflicts: README.md core/src/main/scala/org/apache/spark/util/collection/OpenHashMap.scala core/src/main/scala/org/apache/spark/util/collection/OpenHashSet.scala core/src/main/scala/org/apache/spark/util/collection/PrimitiveKeyOpenHashMap.scala pom.xml project/SparkBuild.scala repl/src/main/scala/org/apache/spark/repl/SparkILoop.scala
\| * \|	Merge pull request #313 from tdas/project-refactor	Patrick Wendell	2014-01-07	1	-6/+1
\| \|\ \ \| \| \|/ \| \|/\|	Refactored the streaming project to separate external libraries like Twitter, Kafka, Flume, etc. At a high level, the changes are as follows. 1. All the external code was put in `SPARK_HOME/external/` as separate SBT projects and Maven modules. Their artifact names are `spark-streaming-twitter`, `spark-streaming-kafka`, etc. Both SparkBuild.scala and pom.xml files have been updated. References to external libraries and repositories have been removed from the settings of root and streaming projects/modules. 2. To use the external functionality (say, creating a Twitter stream), the developer has to `import org.apache.spark.streaming.twitter._`. For the Scala API, the developer has to call `TwitterUtils.createStream(streamingContext, ...)`. For the Java API, the developer has to call `TwitterUtils.createStream(javaStreamingContext, ...)`. 3. Each external project has its own Scala and Java unit tests. Note that the unit tests of each external library use classes of the streaming unit tests (`TestSuiteBase`, `LocalJavaStreamingContext`, etc.). To enable this code sharing among test classes, `dependsOn(streaming % "compile->compile,test->test")` was used in SparkBuild.scala. In streaming/pom.xml, an additional `maven-jar-plugin` was necessary to capture this dependency (see the comment inside the pom.xml for more information). 4. Jars of the external projects have been added to the examples project but not to the assembly project. 5. In some files, imports have been rearranged to conform to the Spark coding guidelines.
\| \| *	Fixed examples/pom.xml and run-example based on Patrick's suggestions.	Tathagata Das	2014-01-07	1	-6/+1
\| \| \|
\| * \|	CR feedback (sbt -> sbt/sbt and correct JAR path in script) :)	Holden Karau	2014-01-05	1	-1/+1
\| \| \|
\| * \|	Finish documentation changes	Holden Karau	2014-01-05	2	-2/+2
\| \|/
\| *	Merge remote-tracking branch 'apache-github/master' into remove-binaries	Patrick Wendell	2014-01-03	24	-610/+712
\| \|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Conflicts: core/src/test/scala/org/apache/spark/DriverSuite.scala docs/python-programming-guide.md
\| \| *	sbin/compute-classpath* bin/compute-classpath*	Prashant Sharma	2014-01-03	4	-2/+146
\| \| \|
\| \| *	sbin/spark-class* -> bin/spark-class*	Prashant Sharma	2014-01-03	6	-4/+266
\| \| \|
\| \| *	run-example -> bin/run-example	Prashant Sharma	2014-01-02	2	-2/+2
\| \| \|
\| \| *	Merge branch 'scripts-reorg' of github.com:shane-huang/incubator-spark into spark-915-segregate-scripts	Prashant Sharma	2014-01-02	21	-752/+448
\| \|/\|	Conflicts: bin/spark-shell core/pom.xml core/src/main/scala/org/apache/spark/SparkContext.scala core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/CoarseMesosSchedulerBackend.scala core/src/main/scala/org/apache/spark/ui/UIWorkloadGenerator.scala core/src/test/scala/org/apache/spark/DriverSuite.scala python/run-tests sbin/compute-classpath.sh sbin/spark-class sbin/stop-slaves.sh
\| \| *	deprecate "spark" script and SPAKR_CLASSPATH environment variable	Andrew xia	2013-10-12	1	-92/+0
\| \| \|
\| \| *	refactor $FWD variable	Andrew xia	2013-09-29	3	-4/+4
\| \| \|
\| \| *	rm bin/spark.cmd as we don't have windows test environment. Will added it later if needed	shane-huang	2013-09-26	1	-27/+0
\| \| \|	Signed-off-by: shane-huang <shengsheng.huang@intel.com>
\| \| *	fix paths and change spark to use APP_MEM as application driver memory instead of SPARK_MEM, user should add application jars to SPARK_CLASSPATH	shane-huang	2013-09-26	1	-33/+8
\| \| \|	Signed-off-by: shane-huang <shengsheng.huang@intel.com>
\| \| *	add scripts in bin	shane-huang	2013-09-23	8	-10/+155
\| \| \| \| \| \| \| \| \| \| \| \|	Signed-off-by: shane-huang <shengsheng.huang@intel.com>
\| \| *	moved user scripts to bin folder	shane-huang	2013-09-23	8	-0/+418
\| \| \| \| \| \| \| \| \| \| \| \|	Signed-off-by: shane-huang <shengsheng.huang@intel.com>
\| \| *	add admin scripts to sbin	shane-huang	2013-09-23	13	-704/+0
\| \| \| \| \| \| \| \| \| \| \| \|	Signed-off-by: shane-huang <shengsheng.huang@intel.com>
\| \| *	added spark-class and spark-executor to sbin	shane-huang	2013-09-23	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \|	Signed-off-by: shane-huang <shengsheng.huang@intel.com>
\| * \|	Merge branch 'master' into scala-2.10	Raymond Liu	2013-11-13	5	-9/+57
\| \|\ \
\| * \| \|	version changed 2.9.3 -> 2.10 in shell script.	Prashant Sharma	2013-09-15	1	-1/+1
\| \| \| \|
\| * \| \|	Merged with master	Prashant Sharma	2013-09-06	13	-121/+251
\| \|\ \ \ \| \| \| \|/ \| \| \|/\|
\| * \| \|	Merge branch 'master' of github.com:mesos/spark into scala-2.10	Prashant Sharma	2013-07-15	2	-49/+65
\| \|\ \ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Conflicts: core/src/main/scala/spark/Utils.scala core/src/test/scala/spark/ui/UISuite.scala project/SparkBuild.scala run
\| * \ \ \	Merge branch 'master' into master-merge	Prashant Sharma	2013-07-12	2	-0/+4
\| \|\ \ \ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Conflicts: README.md core/pom.xml core/src/main/scala/spark/deploy/JsonProtocol.scala core/src/main/scala/spark/deploy/LocalSparkCluster.scala core/src/main/scala/spark/deploy/master/Master.scala core/src/main/scala/spark/deploy/master/MasterWebUI.scala core/src/main/scala/spark/deploy/worker/Worker.scala core/src/main/scala/spark/deploy/worker/WorkerWebUI.scala core/src/main/scala/spark/storage/BlockManagerUI.scala core/src/main/scala/spark/util/AkkaUtils.scala pom.xml project/SparkBuild.scala streaming/src/main/scala/spark/streaming/receivers/ActorReceiver.scala
\| * \| \| \| \|	Removed some unnecessary code and fixed dependencies	Prashant Sharma	2013-07-11	1	-1/+1
\| \| \| \| \| \|
* \| \| \| \| \|	Added GraphX to classpath.	Reynold Xin	2013-11-07	1	-0/+1
\| \| \| \| \| \|
* \| \| \| \| \|	Merge remote-tracking branch 'spark-upstream/master'	Ankur Dave	2013-10-30	1	-4/+18
\|\ \ \ \ \ \ \| \| \|_\|_\|_\|/ \| \|/\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Conflicts: project/SparkBuild.scala
\| * \| \| \| \|	Merge pull request #66 from shivaram/sbt-assembly-deps	Matei Zaharia	2013-10-18	1	-4/+18
\| \|\ \ \ \ \	Add SBT target to assemble dependencies. This pull request is an attempt to address the long assembly build times during development. Instead of rebuilding the assembly jar for every Spark change, this pull request adds a new SBT target `spark` that packages all the Spark modules and builds an assembly of the dependencies. So the workflow that should work now would be something like:
```
./sbt/sbt spark # Doing this once should suffice
## Make changes
./sbt/sbt compile
./sbt/sbt test or ./spark-shell
```
\| \| * \| \| \| \|	Exclude assembly jar from classpath if using deps	Shivaram Venkataraman	2013-10-16	1	-10/+18
\| \| \| \| \| \| \|
\| \| * \| \| \| \|	Merge branch 'master' of https://github.com/apache/incubator-spark into sbt-assembly-deps	Shivaram Venkataraman	2013-10-15	1	-2/+0
\| \| \|\ \ \ \ \
\| \| * \| \| \| \| \|	Add new SBT target for dependency assembly	Shivaram Venkataraman	2013-10-09	1	-0/+6
\| \| \| \|_\|_\|_\|/ \| \| \|/\| \| \| \|
* \| \| \| \| \| \|	Merge branch 'master' of https://github.com/apache/incubator-spark into indexedrdd_graphx	Joseph E. Gonzalez	2013-10-18	3	-3/+39
\|\\| \| \| \| \| \|
\| * \| \| \| \| \|	SPARK-627 , Implementing --config arguments in the scripts	KarthikTunga	2013-10-16	1	-1/+1
\| \| \| \| \| \| \|
\| * \| \| \| \| \|	SPARK-627 , Implementing --config arguments in the scripts	KarthikTunga	2013-10-16	2	-2/+2
\| \| \| \| \| \| \|
\| * \| \| \| \| \|	Implementing --config argument in the scripts	KarthikTunga	2013-10-16	2	-7/+10
\| \| \| \| \| \| \|