spark - Mirror of Apache Spark

	Commit message	Author	Age	Files	Lines
...
\| * \| \| \| \| \| \|	Merge branch 'master' into blockmanager_info	Imran Rashid	2013-01-29	23	-192/+207
\| \|\ \ \ \ \ \ \
\| * \| \| \| \| \| \| \|	better formatting for RDDInfo	Imran Rashid	2013-01-28	1	-3/+9
\| \| \| \| \| \| \| \| \|
\| * \| \| \| \| \| \| \|	expose RDD & storage info directly via SparkContext	Imran Rashid	2013-01-28	4	-28/+41
\| \| \| \| \| \| \| \| \|
* \| \| \| \| \| \| \| \|	Merge pull request #436 from stephenh/removeextraloop	Matei Zaharia	2013-02-02	1	-13/+10
\|\ \ \ \ \ \ \ \ \	Once we find a split with no block, we don't have to look for more.
\| * \| \| \| \| \| \| \| \|	Further simplify checking for Nil.	Stephen Haberman	2013-02-02	1	-3/+1
\| \| \| \| \| \| \| \| \| \|
\| * \| \| \| \| \| \| \| \|	Once we find a split with no block, we don't have to look for more.	Stephen Haberman	2013-01-31	1	-12/+11
\| \| \|_\|/ / / / / /
\| \|/\| \| \| \| \| \| \|
* \| \| \| \| \| \| \| \|	Merge pull request #442 from stephenh/fixsystemnames	Matei Zaharia	2013-02-02	6	-73/+68
\|\ \ \ \ \ \ \ \ \	Fix createActorSystem not actually using the systemName parameter.
\| * \| \| \| \| \| \| \| \|	Fix createActorSystem not actually using the systemName parameter.	Stephen Haberman	2013-02-02	6	-73/+68
\| \| \| \| \| \| \| \| \| \|	This meant all system names were "spark", which worked, but didn't lead to the most intuitive log output. This fixes createActorSystem to use the passed system name, and refactors Master/Worker to encapsulate their system/actor names instead of having the clients guess at them. Note that the driver system name, "spark", is left as is, and is still repeated a few times, but that seems like a separate issue.
* \| \| \| \| \| \| \| \| \|	Formatting	Matei Zaharia	2013-02-02	1	-1/+2
\| \| \| \| \| \| \| \| \| \|
* \| \| \| \| \| \| \| \| \|	Formatting	Matei Zaharia	2013-02-02	1	-6/+9
\| \| \| \| \| \| \| \| \| \|
* \| \| \| \| \| \| \| \| \|	Merge pull request #427 from woggling/dag-sched-tests	Matei Zaharia	2013-02-02	6	-73/+802
\|\ \ \ \ \ \ \ \ \ \	Tests for DAGScheduler
\| \|_\|_\|_\|_\|_\|/ / / /
\|/\| \| \| \| \| \| \| \| \|
\| * \| \| \| \| \| \| \| \|	Merge remote-tracking branch 'base/master' into dag-sched-tests	Charles Reiss	2013-02-02	73	-511/+584
\| \|\ \ \ \ \ \ \ \ \	Conflicts: core/src/main/scala/spark/scheduler/DAGScheduler.scala
\| \|/ / / / / / / / /
\|/\| \| \| \| \| \| \| \| \|
\| * \| \| \| \| \| \| \| \|	Code review changes: add sc.stop; style of multiline comments; parens on procedure calls.	Charles Reiss	2013-02-01	1	-22/+47
\| \| \| \| \| \| \| \| \| \|
\| * \| \| \| \| \| \| \| \|	Comment at top of DAGSchedulerSuite	Charles Reiss	2013-01-30	1	-1/+14
\| \| \| \| \| \| \| \| \| \|
\| * \| \| \| \| \| \| \| \|	Change DAGSchedulerSuite to run DAGScheduler in the same Thread.	Charles Reiss	2013-01-30	1	-249/+319
\| \| \| \| \| \| \| \| \| \|
\| * \| \| \| \| \| \| \| \|	Refactor DAGScheduler more to allow testing without a separate thread.	Charles Reiss	2013-01-30	1	-65/+111
\| \| \| \| \| \| \| \| \| \|
\| * \| \| \| \| \| \| \| \|	Clear spark.master.port to cleanup for other tests	Charles Reiss	2013-01-29	1	-0/+1
\| \| \| \| \| \| \| \| \| \|
\| * \| \| \| \| \| \| \| \|	Add DAGScheduler tests.	Charles Reiss	2013-01-29	1	-0/+540
\| \| \| \| \| \| \| \| \| \|
\| * \| \| \| \| \| \| \| \|	Refactoring to DAGScheduler to aid testing	Charles Reiss	2013-01-29	2	-12/+18
\| \| \| \| \| \| \| \| \| \|
\| * \| \| \| \| \| \| \| \|	Add easymock to SBT configuration.	Charles Reiss	2013-01-29	1	-1/+2
\| \| \| \| \| \| \| \| \| \|
\| * \| \| \| \| \| \| \| \|	Add easymock to POMs	Charles Reiss	2013-01-29	2	-0/+11
\| \| \|_\|_\|/ / / / /
\| \|/\| \| \| \| \| \| \|
\| \| \| \| \| \| * \| \|	Handle Terminated to avoid endless DeathPactExceptions.	Stephen Haberman	2013-02-05	2	-19/+13
\| \|_\|_\|_\|_\|/ / /	Credit to Roland Kuhn, Akka's tech lead, for pointing out this very obvious fix: StandaloneExecutorBackend.preStart's catch block would never (ever) get hit, because all of the operations in preStart are async. So the System.exit in the catch block was skipped, and instead Akka was sending Terminated messages which, since we didn't handle them, turned into DeathPactExceptions, which started a postRestart/preStart infinite loop.
\|/\| \| \| \| \| \| \|
* \| \| \| \| \| \| \|	Add back test for distinct without parens	Matei Zaharia	2013-02-01	1	-1/+2
\| \| \| \| \| \| \| \|
* \| \| \| \| \| \| \|	Merge pull request #441 from stephenh/lessnoisyakka	Matei Zaharia	2013-02-01	1	-0/+1
\|\ \ \ \ \ \ \ \	Reduce the amount of duplicate logging Akka does to stdout.
\| \|_\|/ / / / / /
\|/\| \| \| \| \| \| \|
\| * \| \| \| \| \| \|	Reduce the amount of duplicate logging Akka does to stdout.	Stephen Haberman	2013-02-01	1	-0/+1
\|/ / / / / / /	Given we have Akka logging go through SLF4J to log4j, we don't need all the extra noise of Akka's stdout logger, which is supposedly only used during Akka init time but seems to continue logging lots of noisy network events that we either don't care about or are in the log4j logs anyway.
\| \| \| \| \| \| \|	See: http://doc.akka.io/docs/akka/2.0/general/configuration.html
\| \| \| \| \| \| \|	# Log level for the very basic logger activated during AkkaApplication startup
\| \| \| \| \| \| \|	# Options: ERROR, WARNING, INFO, DEBUG
\| \| \| \| \| \| \|	# stdout-loglevel = "WARNING"
* \| \| \| \| \| \|	Reduced the memory usage of reduce and similar operations	Matei Zaharia	2013-02-01	9	-46/+107
\| \| \| \| \| \| \|	These operations used to wait for all the results to be available in an array on the driver program before merging them. They now merge values incrementally as they arrive.
* \| \| \| \| \| \|	Merge branch 'master' of github.com:mesos/spark	Matei Zaharia	2013-02-01	8	-59/+49
\|\ \ \ \ \ \ \
\| * \ \ \ \ \ \	Merge pull request #432 from stephenh/moreprivacy	Matei Zaharia	2013-02-01	8	-59/+49
\| \|\ \ \ \ \ \ \	Add more private declarations.
\| \| * \| \| \| \| \| \|	Add more private declarations.	Stephen Haberman	2013-01-31	8	-59/+49
\| \| \| \|/ / / / /
\| \| \|/\| \| \| \| \|
* \| / \| \| \| \| \|	formatting	Matei Zaharia	2013-02-01	2	-3/+3
\|/ / / / / / /
* \| \| \| \| \| \|	Merge pull request #437 from stephenh/cancelmetacleaner	Matei Zaharia	2013-02-01	1	-0/+1
\|\ \ \ \ \ \ \	Stop BlockManagers metadataCleaner.
\| * \| \| \| \| \| \|	Stop BlockManagers metadataCleaner.	Stephen Haberman	2013-02-01	1	-0/+1
\| \|/ / / / / /
* \| \| \| \| \| \|	Merge pull request #439 from JoshRosen/spark-580	Matei Zaharia	2013-02-01	2	-10/+9
\|\ \ \ \ \ \ \	Use spark.local.dir for PySpark temp files (SPARK-580).
\| * \| \| \| \| \| \|	Use spark.local.dir for PySpark temp files (SPARK-580).	Josh Rosen	2013-02-01	2	-10/+9
\|/ / / / / / /
* \| \| \| \| \| \|	Merge pull request #438 from JoshRosen/spark-674	Matei Zaharia	2013-02-01	4	-18/+25
\|\ \ \ \ \ \ \	Do not launch JavaGateways on workers (SPARK-674).
\| * \| \| \| \| \| \|	Do not launch JavaGateways on workers (SPARK-674).	Josh Rosen	2013-02-01	4	-18/+25
\|/ / / / / / /	The problem was that the gateway was being initialized whenever the pyspark.context module was loaded. The fix uses lazy initialization that occurs only when SparkContext instances are actually constructed. I also made the gateway and jvm variables private. This change results in ~3-4x performance improvement when running the PySpark unit tests.
* \| \| \| \| \| \|	Merge pull request #433 from rxin/master	Matei Zaharia	2013-02-01	2	-20/+24
\|\ \ \ \ \ \ \	Changed PartitionPruningRDD's split to make sure it returns the correct split index.
\| * \| \| \| \| \| \|	Moved PruneDependency into PartitionPruningRDD.scala.	Reynold Xin	2013-02-01	2	-26/+22
\| \| \| \| \| \| \| \|
\| * \| \| \| \| \| \|	Removed the TODO comment from PartitionPruningRDD.	Reynold Xin	2013-01-31	1	-2/+0
\| \| \| \| \| \| \| \|
\| * \| \| \| \| \| \|	Changed PartitionPruningRDD's split to make sure it returns the correct split index.	Reynold Xin	2013-01-31	2	-1/+11
\| \|/ / / / / /
* \| \| \| \| \| \|	Merge pull request #435 from JoshRosen/pyspark_stdout_fix	Matei Zaharia	2013-02-01	2	-2/+12
\|\ \ \ \ \ \ \	Fix stdout redirection in PySpark.
\| * \| \| \| \| \| \|	Fix stdout redirection in PySpark.	Josh Rosen	2013-02-01	2	-2/+12
\|/ / / / / / /
* \| \| \| \| \| \|	Merge pull request #434 from pwendell/python-exceptions	Matei Zaharia	2013-01-31	2	-17/+32
\|\ \ \ \ \ \ \	SPARK-673: Capture and re-throw Python exceptions
\| * \| \| \| \| \| \|	Small fix from last commit	Patrick Wendell	2013-01-31	1	-1/+1
\| \| \| \| \| \| \| \|
\| * \| \| \| \| \| \|	Some style cleanup	Patrick Wendell	2013-01-31	1	-7/+4
\| \| \| \| \| \| \| \|
\| * \| \| \| \| \| \|	SPARK-673: Capture and re-throw Python exceptions	Patrick Wendell	2013-01-31	2	-16/+34
\| \|/ / / / / /	This patch alters the Python <-> executor protocol to pass on exception data when exceptions occur in user Python code.
* \| \| \| \| \| \|	Merge pull request #431 from mbautin/revert_default_profile	Matei Zaharia	2013-01-31	7	-77/+0
\|\ \ \ \ \ \ \	Remove activation of profiles by default
\| \|/ / / / / /
\|/\| \| \| \| \| \|
\| * \| \| \| \| \|	Remove activation of profiles by default	Mikhail Bautin	2013-01-31	7	-77/+0
\|/ / / / / /	See the discussion at https://github.com/mesos/spark/pull/355 for why default profile activation is a problem.
* \| \| \| \| \|	Merge pull request #430 from pwendell/pyspark-guide	Matei Zaharia	2013-01-30	2	-2/+10
\|\ \ \ \ \ \	Minor improvements to PySpark docs
\| * \| \| \| \| \|	Make module help available in python shell.	Patrick Wendell	2013-01-30	2	-0/+2
\| \| \| \| \| \| \|	Also adds a line to the docs explaining how to use it.
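
The commit "Reduced the memory usage of reduce and similar operations" above describes a switch from buffering every result on the driver to merging values as they arrive. The following is a minimal Python sketch of that idea only, not Spark's actual code: the names `merge_after_collecting` and `IncrementalMerger` are hypothetical and do not appear in the commit. It assumes the merge function is associative and commutative, since results may arrive in any order.

```python
from typing import Callable, Generic, Iterable, Optional, TypeVar

T = TypeVar("T")


def merge_after_collecting(partials: Iterable[T], op: Callable[[T, T], T]) -> T:
    """Buffer-then-merge style: every partial result is held in memory first."""
    buffered = list(partials)  # memory grows with the number of partial results
    if not buffered:
        raise ValueError("no partial results to merge")
    merged = buffered[0]
    for value in buffered[1:]:
        merged = op(merged, value)
    return merged


class IncrementalMerger(Generic[T]):
    """Hypothetical helper: folds each partial result in as soon as it arrives."""

    def __init__(self, op: Callable[[T, T], T]) -> None:
        self._op = op
        self._has_value = False
        self._merged: Optional[T] = None

    def on_result(self, value: T) -> None:
        # Only the running result is kept; the incoming value can be
        # discarded right after this call.
        if self._has_value:
            self._merged = self._op(self._merged, value)
        else:
            self._merged = value
            self._has_value = True

    def result(self) -> T:
        if not self._has_value:
            raise ValueError("no partial results to merge")
        return self._merged


if __name__ == "__main__":
    partials = [3, 1, 4, 1, 5]
    merger = IncrementalMerger(lambda a, b: a + b)
    for partial in partials:
        merger.on_result(partial)
    print(merger.result())
    print(merge_after_collecting(partials, lambda a, b: a + b))
```

Both functions produce the same value for an associative, commutative operation; the incremental version holds one running result instead of a full array of partial results.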