| Commit message | Author | Age | Files | Lines |
|---|---|---|---|---|
| Merge pull request #516 from squito/fix_local_metrics: Fix local metrics | Matei Zaharia | 2013-03-15 | 8 | -12/+104 |
| · increase sleep time | Imran Rashid | 2013-03-10 | 1 | -1/+1 |
| · add a small wait to one task to make sure some task runtime really is non-zero | Imran Rashid | 2013-03-10 | 1 | -4/+10 |
| · enable task metrics in local mode, add tests | Imran Rashid | 2013-03-09 | 2 | -2/+88 |
| · rename remoteFetchWaitTime to fetchWaitTime, since it also includes time from local fetches | Imran Rashid | 2013-03-09 | 6 | -10/+10 |
| Add a log4j compile dependency to fix build in IntelliJ. Also rename parent project to spark-parent (otherwise it shows up as "parent" in IntelliJ, which is very confusing) | Mikhail Bautin | 2013-03-15 | 1 | -1/+5 |
| Merge pull request #521 from stephenh/earlyclose: Close the reader in HadoopRDD as soon as iteration ends | Matei Zaharia | 2013-03-13 | 4 | -51/+153 |
| · Add a test for NextIterator | Stephen Haberman | 2013-03-13 | 1 | -0/+68 |
| · Add NextIterator.closeIfNeeded | Stephen Haberman | 2013-03-13 | 2 | -2/+16 |
| · Remove NextIterator.close default implementation | Stephen Haberman | 2013-03-12 | 2 | -4/+7 |
| · More quickly call close in HadoopRDD. This also refactors the common "gotNext" iterator pattern out into a shared utility class | Stephen Haberman | 2013-03-11 | 3 | -52/+69 |
| Send block sizes as longs | Charles Reiss | 2013-03-11 | 1 | -4/+4 |
| Merge remote-tracking branch 'woggling/dag-sched-driver-port' (conflicts: core/src/test/scala/spark/scheduler/DAGSchedulerSuite.scala) | Matei Zaharia | 2013-03-10 | 1 | -5/+4 |
| · Prevent DAGSchedulerSuite from corrupting driver.port. Use the LocalSparkContext abstraction to properly manage clearing spark.driver.port | Charles Reiss | 2013-03-09 | 1 | -4/+5 |
| Merge pull request #512 from patelh/fix-kryo-serializer | Matei Zaharia | 2013-03-10 | 2 | -11/+19 |
| · Fix reference bug in Kryo serializer, add test, update version | Hiral Patel | 2013-03-07 | 2 | -11/+19 |
| Merge pull request #515 from woggling/deploy-app-death | Matei Zaharia | 2013-03-10 | 3 | -7/+17 |
| · Notify standalone deploy client of application death. Usually this isn't necessary, since the application will be removed as a result of the deploy client disconnecting, but occasionally the standalone deploy master removes an application otherwise. Also mark applications as FAILED instead of FINISHED when they are killed because their executors failed too many times | Charles Reiss | 2013-03-09 | 3 | -7/+17 |
| Merge remote-tracking branch 'stephenh/nomocks' (conflicts: core/src/main/scala/spark/storage/BlockManagerMaster.scala, core/src/test/scala/spark/scheduler/DAGSchedulerSuite.scala) | Matei Zaharia | 2013-03-10 | 9 | -536/+275 |
| · Fix MapOutputTrackerSuite | Stephen Haberman | 2013-02-26 | 1 | -2/+4 |
| · Override DAGScheduler.runLocally so we can remove the Thread.sleep | Stephen Haberman | 2013-02-25 | 2 | -19/+27 |
| · Merge branch 'master' into nomocks (conflicts: core/src/test/scala/spark/scheduler/DAGSchedulerSuite.scala) | Stephen Haberman | 2013-02-25 | 95 | -890/+1411 |
| · Use stubs instead of mocks for DAGSchedulerSuite | Stephen Haberman | 2013-02-09 | 8 | -527/+253 |
| Fix TaskMetrics not being serializable | Matei Zaharia | 2013-03-04 | 1 | -6/+13 |
| Merge pull request #506 from rxin/spark-706 | Matei Zaharia | 2013-03-03 | 2 | -51/+120 |
| · Fixed SPARK-706: Failures in block manager put lead to read task hanging | Reynold Xin | 2013-02-28 | 2 | -51/+120 |
| minor cleanup based on feedback in review request | Imran Rashid | 2013-03-03 | 4 | -24/+24 |
| change CleanupIterator to CompletionIterator | Imran Rashid | 2013-03-03 | 3 | -27/+27 |
| refactoring of TaskMetrics | Imran Rashid | 2013-03-03 | 6 | -59/+110 |
| Merge branch 'master' into stageInfo | Imran Rashid | 2013-03-03 | 29 | -107/+474 |
| · Merge pull request #504 from mosharaf/master: Worker address was getting removed when removing an app | Matei Zaharia | 2013-03-02 | 2 | -2/+2 |
| · · Fixed master data structure updates after removing an application; and a typo | Mosharaf Chowdhury | 2013-02-27 | 2 | -2/+2 |
| · bump version to 0.7.1-SNAPSHOT in the subproject poms to keep the maven build building | Mark Hamstra | 2013-02-28 | 1 | -1/+1 |
| · Fix a problem with no hosts being counted as alive in the first job | Matei Zaharia | 2013-02-26 | 1 | -3/+3 |
| · Fix overly large thread names in PySpark | Matei Zaharia | 2013-02-26 | 1 | -2/+2 |
| · Fixed replication bug in BlockManager | Tathagata Das | 2013-02-25 | 2 | -3/+17 |
| · Allow passing sparkHome and JARs to StreamingContext constructor. Also warns if spark.cleaner.ttl is not set in the version where you pass your own SparkContext | Matei Zaharia | 2013-02-25 | 2 | -3/+3 |
| · Set spark.deploy.spreadOut to true by default in 0.7 (improves locality) | Matei Zaharia | 2013-02-25 | 1 | -1/+1 |
| · Add a config property for Akka lifecycle event logging | Matei Zaharia | 2013-02-25 | 1 | -2/+4 |
| · Merge pull request #498 from pwendell/shutup-akka | Matei Zaharia | 2013-02-25 | 1 | -1/+1 |
| · · Disable remote lifecycle logging from Akka. This changes the default setting to `off` for remote lifecycle events. When it is on, it is very chatty at the INFO level, and it also prints several ERROR messages sometimes when sc.stop() is called | Patrick Wendell | 2013-02-25 | 1 | -1/+1 |
| · Get spark.default.parallelism on each call to defaultPartitioner, instead of only once, in case the user changes it across Spark uses | Matei Zaharia | 2013-02-25 | 1 | -4/+1 |
| · Merge pull request #459 from stephenh/bettersplits: Change defaultPartitioner to use upstream split size | Matei Zaharia | 2013-02-25 | 8 | -38/+90 |
| · · Use default parallelism if it's set | Stephen Haberman | 2013-02-24 | 2 | -6/+19 |
| · · Merge branch 'master' into bettersplits (conflicts: core/src/main/scala/spark/RDD.scala, core/src/main/scala/spark/scheduler/cluster/StandaloneSchedulerBackend.scala, core/src/test/scala/spark/ShuffleSuite.scala) | Stephen Haberman | 2013-02-24 | 92 | -793/+1260 |
| · · Update more javadocs | Stephen Haberman | 2013-02-16 | 2 | -15/+17 |
| · · Tweak test names | Stephen Haberman | 2013-02-16 | 1 | -2/+2 |
| · · Remove fileServerSuite.txt | Stephen Haberman | 2013-02-16 | 1 | -1/+0 |
| · · Update default.parallelism docs, have StandaloneSchedulerBackend use it. Only brand-new RDDs (e.g. parallelize and makeRDD) now use default parallelism; everything else uses its largest parent's partitioner or partition size | Stephen Haberman | 2013-02-16 | 7 | -24/+39 |
| · · Change defaultPartitioner to use upstream split size. Previously it used SparkContext.defaultParallelism, which occasionally ended up being a very bad guess; looking at upstream RDDs seems to make better use of the context. Also sorted the upstream RDDs by partition size first, since with a hugely partitioned RDD and a tiny-partitioned RDD it is unlikely we want the resulting RDD to be tiny-partitioned | Stephen Haberman | 2013-02-10 | 3 | -6/+29 |
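Several of the commits above (NextIterator.closeIfNeeded, the CleanupIterator-to-CompletionIterator rename, the early close in HadoopRDD) revolve around one pattern: an iterator that owns a resource and releases it the moment iteration ends, rather than waiting for GC or task teardown. A minimal sketch of that pattern in Scala; the class and member names below are illustrative placeholders, not Spark's actual API:

```scala
// Sketch of the "gotNext" iterator pattern the log describes: compute the
// next element once, cache it, and close the underlying resource as soon
// as the source is exhausted.
abstract class NextIteratorSketch[T] extends Iterator[T] {
  private var gotNext = false      // has the next element been computed?
  private var nextValue: T = _     // cached next element
  private var closed = false       // has close() already run?
  protected var finished = false   // set by getNext when the source is done

  /** Subclasses produce the next element, or set finished = true. */
  protected def getNext(): T

  /** Subclasses release their resource (e.g. a Hadoop RecordReader). */
  protected def close(): Unit

  /** Idempotent close, safe to call again from task-completion hooks. */
  def closeIfNeeded(): Unit =
    if (!closed) { closed = true; close() }

  override def hasNext: Boolean = {
    if (!finished && !gotNext) {
      nextValue = getNext()
      gotNext = true
      if (finished) closeIfNeeded()  // close eagerly, not at GC time
    }
    !finished
  }

  override def next(): T = {
    if (!hasNext) throw new NoSuchElementException("End of stream")
    gotNext = false
    nextValue
  }
}

// Companion idea behind the CompletionIterator rename: wrap any iterator
// and run a callback exactly once when it is exhausted.
class CompletionIteratorSketch[A](sub: Iterator[A], completion: () => Unit)
    extends Iterator[A] {
  private var done = false
  def hasNext: Boolean = {
    val more = sub.hasNext
    if (!more && !done) { done = true; completion() }
    more
  }
  def next(): A = sub.next()
}

object NextIteratorDemo {
  def main(args: Array[String]): Unit = {
    var closes = 0
    val it = new NextIteratorSketch[Int] {
      private val src = List(1, 2, 3).iterator
      protected def getNext(): Int =
        if (src.hasNext) src.next() else { finished = true; 0 }
      protected def close(): Unit = closes += 1
    }
    println(it.toList)  // drains the iterator; close() runs exactly once
    println(closes)
  }
}
```

A consumer that abandons iteration early (a failed task, for instance) can still call `closeIfNeeded()` from a cleanup hook without double-closing, which is the point of making it idempotent.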