spark - Mirror of Apache Spark

	Commit message	Author	Age	Files	Lines
*	Fixing SPARK-602: PythonPartitioner	Andre Schumacher	2013-10-04	6	-10/+44
\|	Currently PythonPartitioner determines partition ID by hashing a byte-array representation of PySpark's key. This PR lets PythonPartitioner use the actual partition ID, which is required e.g. for sorting via PySpark.
*	Merge pull request #26 from Du-Li/master	Matei Zaharia	2013-10-03	2	-1/+4
\|\	fixed a wildcard bug in make-distribution.sh; ask sbt to check local maven repo in project/SparkBuild.scala (1) fixed a wildcard bug in make-distribution.sh: with the wildcard * in quotes, this cp command failed. It worked after moving the wildcard out of the quotes. (2) ask sbt to check local maven repo in SparkBuild.scala: To build Spark (0.9.0-SNAPSHOT) with the HEAD of mesos (0.15.0), I must do "make maven-install" under mesos/build, which publishes the java .jar file under ~/.m2. However, when building Spark (after pointing mesos to version 0.15.0), sbt uses ivy, which by default only checks ~/.ivy2. This change tells sbt to also check ~/.m2.
\| *	ask ivy/sbt to check local maven repo under ~/.m2	Du Li	2013-10-01	1	-0/+3
\| \|
\| *	fixed a bug of using wildcard in quotes	Du Li	2013-10-01	1	-1/+1
\| \|
* \|	Merge pull request #25 from CruncherBigData/master	Matei Zaharia	2013-10-03	1	-1/+1
\|\ \	Update README: updated the link
\| * \|	Update README	CruncherBigData	2013-10-01	1	-1/+1
\| \|/
* \|	Merge pull request #28 from tgravescs/sparYarnAppName	Matei Zaharia	2013-10-03	3	-1/+8
\|\ \	Allow users to set the application name for Spark on Yarn
\| * \|	Add default value to usage statement	tgravescs	2013-10-03	1	-1/+1
\| \| \|
\| * \|	Allow users to set the application name for Spark on Yarn	tgravescs	2013-10-02	3	-1/+8
\| \|/
* \|	Merge pull request #10 from kayousterhout/results_through-bm	Matei Zaharia	2013-10-02	19	-167/+496
\|\ \	Send Task results through the block manager when larger than Akka frame size (fixes SPARK-669). This change requires adding an extra failure mode: tasks can complete successfully, but the result gets lost or flushed from the block manager before it's been fetched. This change also moves the deserialization of tasks into a separate thread, so it's no longer part of the DAG scheduler's tight loop. This should improve scheduler throughput, particularly when tasks are sending back large results. Thanks Josh for writing the original version of this patch! This is duplicated from the mesos/spark repo: https://github.com/mesos/spark/pull/835
\| *	Added additional unit test for repeated task failures	Kay Ousterhout	2013-09-30	1	-1/+28
\| \|
\| *	Fixed compilation errors and broken test.	Kay Ousterhout	2013-09-30	4	-13/+11
\| \|
\| *	Merge remote-tracking branch 'upstream/master' into results_through-bm	Kay Ousterhout	2013-09-30	60	-195/+368
\| \|\	Conflicts: core/src/main/scala/org/apache/spark/scheduler/cluster/ClusterScheduler.scala core/src/main/scala/org/apache/spark/scheduler/cluster/ClusterTaskSetManager.scala core/src/main/scala/org/apache/spark/scheduler/local/LocalTaskSetManager.scala
* \|	Merge pull request #17 from rxin/optimize	Reynold Xin	2013-09-26	2	-2/+1
\|\ \	Remove -optimize flag
\| * \|	Removed scala -optimize flag.	Reynold Xin	2013-09-26	2	-2/+1
\| \| \|
* \| \|	Merge pull request #16 from pwendell/master	Reynold Xin	2013-09-26	1	-1/+1
\|\ \ \	Bug fix in master build
\| * \| \|	Bug fix in master build	Patrick Wendell	2013-09-26	1	-1/+1
\| \| \| \|
* \| \| \|	Merge pull request #14 from kayousterhout/untangle_scheduler	Reynold Xin	2013-09-26	34	-71/+62
\|\ \ \ \	Improved organization of scheduling packages. This commit does not change any code -- only file organization. Please let me know if there was some masterminded strategy behind the existing organization that I failed to understand! There are two components of this change: (1) Moving files out of the cluster package, and down a level to the scheduling package. These files are all used by the local scheduler in addition to the cluster scheduler(s), so should not be in the cluster package. As a result of this change, none of the files in the local package reference files in the cluster package. (2) Moving the mesos package to within the cluster package. The mesos scheduling code is for a cluster, and represents a specific case of cluster scheduling (the Mesos-related classes often subclass cluster scheduling classes). Thus, the most logical place for it seems to be within the cluster package. The one thing about the scheduling code that seems a little funny to me is the naming of the SchedulerBackends. The StandaloneSchedulerBackend is not just for Standalone mode, but instead is used by Mesos coarse grained mode and Yarn, and the backend that is just for Standalone mode is instead called SparkDeploySchedulerBackend. I didn't change this because I wasn't sure if there was a reason for this naming that I'm just not aware of.
\| * \| \| \|	Improved organization of scheduling packages.	Kay Ousterhout	2013-09-25	34	-71/+62
\| \| \| \| \|	This commit does not change any code -- only file organization. There are two components of this change: (1) Moving files out of the cluster package, and down a level to the scheduling package. These files are all used by the local scheduler in addition to the cluster scheduler(s), so should not be in the cluster package. As a result of this change, none of the files in the local package reference files in the cluster package. (2) Moving the mesos package to within the cluster package. The mesos scheduling code is for a cluster, and represents a specific case of cluster scheduling (the Mesos-related classes often subclass cluster scheduling classes). Thus, the most logical place for it is within the cluster package.
* \| \| \| \|	Merge pull request #670 from jey/ec2-ssh-improvements	Reynold Xin	2013-09-26	1	-26/+80
\|\ \ \ \ \	EC2 SSH improvements
\| * \| \| \|	Clarify error messages on SSH failure	Jey Kottalam	2013-09-11	1	-6/+21
\| \| \| \| \|
\| * \| \| \|	Generate new SSH key for the cluster, make "--identity-file" optional	Jey Kottalam	2013-09-06	1	-21/+37
\| \| \| \| \|
\| * \| \| \|	Construct shell commands as sequences for safety and composability	Jey Kottalam	2013-09-06	1	-11/+34
\| \| \| \| \|
* \| \| \| \|	Merge pull request #930 from holdenk/master	Reynold Xin	2013-09-26	2	-1/+11
\|\ \ \ \ \	Add mapPartitionsWithIndex
\| * \| \| \| \|	Fix formatting :)	Holden Karau	2013-09-23	1	-4/+5
\| \| \| \| \| \|
\| * \| \| \| \|	Switch indent from 2 to 4 spaces	Holden Karau	2013-09-22	1	-2/+2
\| \| \| \| \| \|
\| * \| \| \| \|	Fix build on ubuntu	Holden Karau	2013-09-14	1	-1/+1
\| \| \| \| \| \|
\| * \| \| \| \|	Merge branch 'master' of https://github.com/mesos/spark	Holden Karau	2013-09-14	2	-5/+12
\| \|\ \ \ \ \
\| * \| \| \| \| \|	Make mapPartitionsWithIndex work with JavaRDD's	Holden Karau	2013-09-14	1	-2/+3
\| \| \| \| \| \| \|
\| * \| \| \| \| \|	Start of working on SPARK-615	Holden Karau	2013-09-11	1	-0/+8
\| \| \| \| \| \| \|
* \| \| \| \| \| \|	Merge pull request #7 from wannabeast/memorystore-fixes	Reynold Xin	2013-09-26	1	-6/+8
\|\ \ \ \ \ \ \	some minor fixes to MemoryStore. This is a repeat of #5, moved to its own branch in my repo. This makes all updates to "currentMemory" synchronize on "entries"; it skips synchronizing the reads where it can get away with it.
\| * \| \| \| \| \|	Synchronize on "entries" the remaining update to "currentMemory".	Mike	2013-09-19	1	-3/+5
\| \| \| \| \| \| \|	Make "currentMemory" @volatile, so that its reads in ensureFreeSpace() are atomic and up-to-date, i.e., currentMemory can't increase while putLock is held (though it could decrease, which would only help ensureFreeSpace()).
\| * \| \| \| \| \|	Set currentMemory to 0 in clear().	Mike	2013-09-11	1	-2/+2
\| \| \| \| \| \| \|	Remove unnecessary entries.get() call.
\| * \| \| \| \| \|	Remove MemoryStore$Entry.dropPending, unused as of 42e0a68082.	Mike	2013-09-10	1	-1/+1
\| \| \| \| \| \| \|
* \| \| \| \| \| \|	Merge pull request #9 from rxin/limit	Patrick Wendell	2013-09-26	2	-10/+66
\|\ \ \ \ \ \ \	Smarter take/limit implementation.
\| * \| \| \| \| \| \|	Smarter take/limit implementation.	Reynold Xin	2013-09-20	2	-10/+66
\| \| \| \| \| \| \| \|
* \| \| \| \| \| \| \|	Merge remote-tracking branch 'apache-github/pr/13' into HEAD	Patrick Wendell	2013-09-24	14	-15/+15
\|\ \ \ \ \ \ \ \ \| \|_\|_\|_\|_\|_\|/ / \|/\| \| \| \| \| \| \|
\| * \| \| \| \| \| \|	Update build version in master	Patrick Wendell	2013-09-24	14	-15/+15
\|/ / / / / / /
* \| \| \| \| \| \|	Merge remote-tracking branch 'pr/12'	Reynold Xin	2013-09-23	2	-4/+6
\|\ \ \ \ \ \ \	Fix spacing so java.io.tmpdir doesn't run on with SPARK_JAVA_OPTS
\| * \| \| \| \| \| \|	Fix spacing so that the java.io.tmpdir doesn't run on with SPARK_JAVA_OPTS	Y.CORP.YAHOO.COM\tgraves	2013-09-23	2	-4/+6
\| \|/ / / / / /
* \| \| \| \| \| \|	Merge branch 'master' of https://git-wip-us.apache.org/repos/asf/incubator-spark	Reynold Xin	2013-09-23	0	-0/+0
\|\\| \| \| \| \| \|
* \| \| \| \| \| \|	Merge branch 'master' of github.com:markhamstra/incubator-spark	Reynold Xin	2013-09-23	1	-1/+0
\|\ \ \ \ \ \ \
\| * \| \| \| \| \| \|	Removed repetitive import; fixes hidden definition compiler warning.	Mark Hamstra	2013-09-03	1	-1/+0
\| \| \|/ / / / / \| \|/\| \| \| \| \|
* \| \| \| \| \| \|	Merge branch 'master' of github.com:mesos/spark	Reynold Xin	2013-09-23	7	-64/+123
\|\ \ \ \ \ \ \
\| * \ \ \ \ \ \	Merge pull request #928 from jerryshao/fairscheduler-refactor	Reynold Xin	2013-09-22	1	-43/+56
\| \|\ \ \ \ \ \ \	Refactor FairSchedulableBuilder
\| \| * \| \| \| \| \| \|	Change Exception to NoSuchElementException and minor style fix	jerryshao	2013-09-22	1	-6/+7
\| \| \| \| \| \| \| \| \|
\| \| * \| \| \| \| \| \|	Remove infix style and others	jerryshao	2013-09-22	1	-10/+8
\| \| \| \| \| \| \| \| \|
\| \| * \| \| \| \| \| \|	Refactor FairSchedulableBuilder:	jerryshao	2013-09-22	1	-39/+53
\| \|/ / / / / / /	1. Configuration can be read from classpath if not set explicitly. 2. Add missing close handler.
\| * \| \| \| \| \| \|	Merge pull request #937 from jerryshao/localProperties-fix	Reynold Xin	2013-09-21	2	-2/+50
\| \|\ \ \ \ \ \ \	Fix PR926 local properties issues in Spark Streaming-like scenarios
\| \| * \| \| \| \| \| \|	Add barrier for local properties unit test and fix some styles	jerryshao	2013-09-22	2	-3/+11
\| \| \| \| \| \| \| \| \|