spark - Mirror of Apache Spark

	Commit message	Author	Age	Files	Lines
*	Adding unit tests and some refactoring to promote testability.	Patrick Wendell	2014-01-07	10	-35/+264
\|
*	Some doc fixes	Patrick Wendell	2014-01-06	1	-3/+2
\|
*	Fixes after merge	Patrick Wendell	2014-01-06	3	-6/+8
\|
*	Merge remote-tracking branch 'apache-github/master' into standalone-driver	Patrick Wendell	2014-01-06	316	-3469/+4904
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Conflicts: core/src/main/scala/org/apache/spark/deploy/client/AppClient.scala core/src/main/scala/org/apache/spark/deploy/client/TestClient.scala core/src/main/scala/org/apache/spark/deploy/master/Master.scala core/src/main/scala/org/apache/spark/deploy/worker/Worker.scala core/src/main/scala/org/apache/spark/scheduler/cluster/SparkDeploySchedulerBackend.scala
\| *	Merge pull request #343 from pwendell/build-fix	Patrick Wendell	2014-01-06	1	-1/+1
\| \|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fix test breaking downstream builds This wasn't detected in the pull-request-builder because it manually sets SPARK_HOME. I'm going to change that (it shouldn't do this) to make it like the other builds.
\| \| *	Fix test breaking downstream builds	Patrick Wendell	2014-01-06	1	-1/+1
\| \|/
\| *	Merge pull request #340 from ScrapCodes/sbt-fixes	Patrick Wendell	2014-01-06	1	-5/+3
\| \|\ \| \| \| \| \| \| \| \| \|	Made java options to be applied during tests so that they become self explanatory.
\| \| *	Made java options to be applied during tests so that they become self ↵	Prashant Sharma	2014-01-06	1	-5/+3
\| \| \| \| \| \| \| \| \| \| \| \|	explanatory.
\| * \|	Merge pull request #338 from ScrapCodes/ning-upgrade	Patrick Wendell	2014-01-06	2	-2/+2
\| \|\ \ \| \| \| \| \| \| \| \| \| \| \| \|	SPARK-1005 Ning upgrade
\| \| * \|	SPARK-1005 Ning upgrade	Prashant Sharma	2014-01-06	2	-2/+2
\| \| \|/
\| * \|	Merge pull request #341 from ash211/patch-5	Patrick Wendell	2014-01-06	1	-1/+2
\| \|\ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Clarify spark.cores.max in docs It controls the count of cores across the cluster, not on a per-machine basis.
\| \| * \|	Clarify spark.cores.max	Andrew Ash	2014-01-06	1	-1/+2
\| \| \|/ \| \| \| \| \| \|	It controls the count of cores across the cluster, not on a per-machine basis.
\| * \|	Merge pull request #342 from tgravescs/fix_maven_protobuf	Patrick Wendell	2014-01-06	1	-1/+0
\| \|\ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Change protobuf version for yarn alpha back to 2.4.1 The maven build for yarn-alpha uses the wrong protobuf version and hence the generated assembly jar doesn't work with Hadoop 0.23. Removing the setting for the yarn-alpha profile since the default protobuf version is 2.4.1 at the top of the pom file.
\| \| * \|	Change protobuf version for yarn alpha back to 2.4.1	Thomas Graves	2014-01-06	1	-1/+0
\| \| \|/
\| * \|	Merge pull request #330 from tgravescs/fix_addjars_null_handling	Patrick Wendell	2014-01-06	1	-2/+3
\| \|\ \ \| \| \|/ \| \|/\| \| \| \| \| \| \| \| \| \|	Fix handling of empty SPARK_EXAMPLES_JAR Currently if SPARK_EXAMPLES_JAR is left unset you get a null pointer exception when running the examples (at least on Spark on YARN). The null now gets turned into the string "null" when it's put into the SparkConf, so addJar no longer properly ignores it. This fixes that so that it can be left unset.
\| \| *	Add warning to null setJars check	Thomas Graves	2014-01-06	1	-1/+2
\| \| \|
\| \| *	Fix handling of empty SPARK_EXAMPLES_JAR	Thomas Graves	2014-01-04	1	-1/+1
\| \| \|
\| * \|	Merge pull request #333 from pwendell/logging-silence	Patrick Wendell	2014-01-05	2	-3/+25
\| \|\ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Quiet ERROR-level Akka Logs This fixes an issue I've seen where akka logs a bunch of things at ERROR level when connecting to a standalone cluster, even in the normal case. I noticed that even when lifecycle logging was disabled, the netty code inside of akka still logged away via akka's EndpointWriter class. There are also some other log streams that I think are new in akka 2.2.1 that I've disabled. Finally, I added some better logging to the standalone client. This makes it more clear when a connection failure occurs what is going on. Previously it never explicitly said if a connection attempt had failed. The commit messages here have some more detail.
\| \| * \|	Responding to Aaron's review	Patrick Wendell	2014-01-05	1	-0/+2
\| \| \| \|
\| \| * \|	Provide logging when attempts to connect to the master fail.	Patrick Wendell	2014-01-05	1	-1/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Without these it's a bit less clear what's going on for the user. One thing I realize when doing this is that akka itself actually retries the initial association. So the retry we currently have is redundant with akka's.
\| \| * \|	Quiet akka when remote lifecycle logging is disabled.	Patrick Wendell	2014-01-05	1	-2/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	I noticed when connecting to a standalone cluster Spark gives a bunch of Akka ERROR logs that make it seem like something is failing. This patch does two things: 1. Akka dead letter logging is turned on/off according to the existing lifecycle spark property. 2. We explicitly silence akka's EndpointWriter log in log4j. This is necessary because for some reason that log doesn't pick up on the lifecycle logging settings. After a few hours of debugging this was the only solution I found that worked.
\| * \| \|	Merge pull request #334 from pwendell/examples-fix	Reynold Xin	2014-01-05	47	-54/+75
\| \|\ \ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Removing SPARK_EXAMPLES_JAR in the code This re-writes all of the examples to use the `SparkContext.jarOfClass` mechanism for loading the examples jar. This is necessary for environments like YARN and the Standalone mode where example programs will be submitted from inside the cluster rather than at the client using `./spark-example`. This still leaves SPARK_EXAMPLES_JAR in place in the shell scripts for setting up the classpath if `./spark-example` is run.
\| \| * \| \|	Removing SPARK_EXAMPLES_JAR in the code	Patrick Wendell	2014-01-05	47	-54/+75
\| \| \|/ /
\| * \| \|	Merge pull request #335 from rxin/ser	Reynold Xin	2014-01-05	2	-2/+16
\| \|\ \ \ \| \| \|/ / \| \|/\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fall back to zero-arg constructor for Serializer initialization if there is no constructor that accepts SparkConf. This maintains backward compatibility with older serializers implemented by users.
\| \| * \|	Fall back to zero-arg constructor for Serializer initialization if there is ↵	Reynold Xin	2014-01-05	2	-2/+16
\| \|/ / \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	no constructor that accepts SparkConf. This maintains backward compatibility with older serializers implemented by users.
\| * \|	Merge pull request #292 from soulmachine/naive-bayes	Reynold Xin	2014-01-04	2	-0/+227
\| \|\ \ \| \| \|/ \| \|/\| \| \| \| \| \| \| \| \| \|	standard Naive Bayes classifier Implements the standard Naive Bayes classifier. This is an updated version of #288, which was closed because of a mistaken operation.
\| \| *	Aggregated all sample points to driver without any shuffle	Lian, Cheng	2014-01-02	2	-53/+31
\| \| \|
\| \| *	Response to comments from Reynold, Ameet and Evan	Lian, Cheng	2013-12-30	2	-62/+90
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Arguments renamed according to Ameet's suggestion * Using DoubleMatrix instead of Array[Double] in computation * Removed arguments C (kinds of label) and D (dimension of feature vector) from NaiveBayes.train() * Replaced reduceByKey with foldByKey to avoid modifying original input data
\| \| *	Response to Reynold's comments	Lian, Cheng	2013-12-29	1	-10/+16
\| \| \|
\| \| *	Added Apache license header to NaiveBayesSuite	Lian, Cheng	2013-12-27	1	-0/+17
\| \| \|
\| \| *	Reformatted some lines commented by Matei	Lian, Cheng	2013-12-27	1	-2/+3
\| \| \|
\| \| *	Let reduceByKey to take care of local combine	Lian, Cheng	2013-12-25	1	-27/+16
\| \| \| \| \| \| \| \| \| \| \| \|	Also refactored some heavy FP code to improve readability and reduce memory footprint.
\| \| *	Refactored NaiveBayes	Lian, Cheng	2013-12-25	2	-28/+41
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Minimized shuffle output with mapPartitions. * Reduced RDD actions from 3 to 1.
\| \| *	standard Naive Bayes classifier	Frank Dai	2013-12-25	2	-0/+195
\| \| \|
\| * \|	Merge pull request #329 from pwendell/remove-binaries	Patrick Wendell	2014-01-03	33	-210/+128
\| \|\ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	SPARK-1002: Remove Binaries from Spark Source This adds a few changes on top of the work by @scrapcodes.
\| \| * \	Merge remote-tracking branch 'apache-github/master' into remove-binaries	Patrick Wendell	2014-01-03	98	-1453/+436
\| \| \|\ \ \| \| \|/ / \| \|/\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Conflicts: core/src/test/scala/org/apache/spark/DriverSuite.scala docs/python-programming-guide.md
\| * \| \|	Merge pull request #325 from witgo/master	Patrick Wendell	2014-01-03	13	-66/+92
\| \|\ \ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Modify spark on yarn to create SparkConf process
\| \| * \ \	merge upstream/master	liguoqiang	2014-01-03	36	-1236/+198
\| \| \|\ \ \
\| \| * \| \| \|	Modify spark on yarn to create SparkConf process	liguoqiang	2014-01-03	7	-48/+65
\| \| \| \| \| \|
\| \| * \| \| \|	Modify spark on yarn to create SparkConf process	liguoqiang	2014-01-03	15	-46/+56
\| \| \| \| \| \|
\| * \| \| \| \|	Merge pull request #317 from ScrapCodes/spark-915-segregate-scripts	Patrick Wendell	2014-01-03	62	-161/+155
\| \|\ \ \ \ \ \| \| \|_\|/ / / \| \|/\| \| \| \| \| \| \| \| \| \|	Spark-915 segregate scripts
\| \| * \| \| \|	sbin/compute-classpath* -> bin/compute-classpath*	Prashant Sharma	2014-01-03	5	-3/+3
\| \| \| \| \| \|
\| \| * \| \| \|	sbin/spark-class* -> bin/spark-class*	Prashant Sharma	2014-01-03	14	-15/+15
\| \| \| \| \| \|
\| \| * \| \| \|	a few left over document change	Prashant Sharma	2014-01-02	3	-4/+4
\| \| \| \| \| \|
\| \| * \| \| \|	pyspark -> bin/pyspark	Prashant Sharma	2014-01-02	5	-19/+19
\| \| \| \| \| \|
\| \| * \| \| \|	run-example -> bin/run-example	Prashant Sharma	2014-01-02	19	-31/+31
\| \| \| \| \| \|
\| \| * \| \| \|	spark-shell -> bin/spark-shell	Prashant Sharma	2014-01-02	9	-15/+15
\| \| \| \| \| \|
\| \| * \| \| \|	Merge branch 'scripts-reorg' of github.com:shane-huang/incubator-spark into ↵	Prashant Sharma	2014-01-02	41	-96/+90
\| \| \|\ \ \ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	spark-915-segregate-scripts Conflicts: bin/spark-shell core/pom.xml core/src/main/scala/org/apache/spark/SparkContext.scala core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/CoarseMesosSchedulerBackend.scala core/src/main/scala/org/apache/spark/ui/UIWorkloadGenerator.scala core/src/test/scala/org/apache/spark/DriverSuite.scala python/run-tests sbin/compute-classpath.sh sbin/spark-class sbin/stop-slaves.sh
\| \| \| * \| \| \|	deprecate "spark" script and SPARK_CLASSPATH environment variable	Andrew xia	2013-10-12	7	-99/+4
\| \| \| \| \| \| \|
\| \| \| * \| \| \|	refactor $FWD variable	Andrew xia	2013-09-29	5	-7/+7
\| \| \| \| \| \| \|