path: root/core
Commit message (Author, Date, Files changed, Lines -/+)
* Update build version in master (Patrick Wendell, 2013-09-24, 1, -1/+1)
|
* Merge branch 'master' of github.com:markhamstra/incubator-spark (Reynold Xin, 2013-09-23, 1, -1/+0)
|\
| * Removed repetitive import; fixes hidden definition compiler warning. (Mark Hamstra, 2013-09-03, 1, -1/+0)
| |
* | Change Exception to NoSuchElementException and minor style fix (jerryshao, 2013-09-22, 1, -6/+7)
| |
* | Remove infix style and others (jerryshao, 2013-09-22, 1, -10/+8)
| |
* | Refactor FairSchedulableBuilder: (jerryshao, 2013-09-22, 1, -39/+53)
| |   1. Configuration can be read from the classpath if not set explicitly. 2. Add missing close handler.
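| |
| |   A minimal sketch of the two points above, with illustrative names rather than the actual FairSchedulableBuilder code: fall back to a bundled classpath resource when no allocation file is configured explicitly, and always close the stream in a finally block.
| |
```scala
import java.io.{FileInputStream, InputStream}

object SchedulerConfSketch {
  // Illustrative default resource name; the real file name may differ.
  val DEFAULT_SCHEDULER_FILE = "fairscheduler.xml"

  def buildPools(schedulerAllocFile: Option[String]) {
    // Prefer the explicitly configured file, otherwise look on the classpath.
    val conf: Option[InputStream] = schedulerAllocFile.map(new FileInputStream(_))
      .orElse(Option(getClass.getClassLoader.getResourceAsStream(DEFAULT_SCHEDULER_FILE)))
    conf.foreach { stream =>
      try {
        // parse the XML pool definitions from `stream` here
      } finally {
        stream.close() // the previously missing close handler
      }
    }
  }
}
```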
* | Merge pull request #937 from jerryshao/localProperties-fix (Reynold Xin, 2013-09-21, 2, -2/+50)
|\ \   Fix PR926 local properties issues in Spark Streaming-like scenarios
| * | Add barrier for local properties unit test and fix some styles (jerryshao, 2013-09-22, 2, -3/+11)
| | |
| * | Fix issue when local properties pass from parent to child thread (jerryshao, 2013-09-18, 2, -2/+42)
| | |
* | | After unit tests, clear port properties unconditionally (Ankur Dave, 2013-09-19, 2, -9/+7)
|/ /   In MapOutputTrackerSuite, the "remote fetch" test sets spark.driver.port and spark.hostPort, assuming that they will be cleared by LocalSparkContext. However, the test never sets sc, so it remains null, causing LocalSparkContext to skip clearing these properties. Subsequent tests therefore fail with java.net.BindException: "Address already in use". This commit makes LocalSparkContext clear the properties even if sc is null.
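| |
| |   A minimal sketch of the idea in ScalaTest terms (illustrative, not the actual LocalSparkContext code): clear the two port properties in the teardown hook whether or not sc was ever assigned.
| |
```scala
import org.scalatest.{BeforeAndAfterEach, Suite}

trait ClearsPortPropertiesSketch extends BeforeAndAfterEach { self: Suite =>
  override def afterEach() {
    // Clear unconditionally, even if the test never created a SparkContext,
    // so a leaked spark.driver.port cannot cause "Address already in use" later.
    System.clearProperty("spark.driver.port")
    System.clearProperty("spark.hostPort")
    super.afterEach()
  }
}
```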
* | Changed localProperties to use ThreadLocal (not DynamicVariable). (Kay Ousterhout, 2013-09-11, 1, -9/+9)
| |   The fact that DynamicVariable uses an InheritableThreadLocal can cause problems where the properties end up being shared across threads in certain circumstances.
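| |
| |   A minimal, self-contained sketch of the difference being avoided here: DynamicVariable is backed by an InheritableThreadLocal, so a child thread starts with the parent's value, while a plain ThreadLocal keeps each thread isolated.
| |
```scala
object ThreadLocalSketch extends App {
  val inherited = new InheritableThreadLocal[String]()
  val plain = new ThreadLocal[String]() {
    override def initialValue(): String = "unset"
  }

  inherited.set("parent-properties")
  plain.set("parent-properties")

  val child = new Thread(new Runnable {
    def run() {
      // Only the inheritable variant leaks the parent's value into the child thread.
      println("inherited: " + inherited.get()) // prints "parent-properties"
      println("plain:     " + plain.get())     // prints "unset"
    }
  })
  child.start()
  child.join()
}
```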
* | Merge pull request #919 from mateiz/jets3t (Patrick Wendell, 2013-09-11, 1, -0/+5)
|\ \   Add explicit jets3t dependency, which is excluded in hadoop-client
| * | Add explicit jets3t dependency, which is excluded in hadoop-client (Matei Zaharia, 2013-09-10, 1, -0/+5)
| | |
* | | Merge pull request #922 from pwendell/port-change (Patrick Wendell, 2013-09-11, 2, -2/+2)
|\ \ \   Change default port number from 3030 to 4030.
| * | | Change port from 3030 to 4040 (Patrick Wendell, 2013-09-11, 2, -2/+2)
| |/ /
* / / SPARK-894 - Not all WebUI fields delivered VIA JSON (David McCauley, 2013-09-11, 1, -1/+3)
|/ /
* | Merge pull request #915 from ooyala/master (Matei Zaharia, 2013-09-09, 1, -1/+9)
|\ \   Get rid of / improve ugly NPE when Utils.deleteRecursively() fails
| * | Style fix: put body of if within curly braces (Evan Chan, 2013-09-09, 1, -1/+3)
| | |
| * | Print out more friendly error if listFiles() fails (Evan Chan, 2013-09-09, 1, -1/+7)
| | |   listFiles() could return null if the I/O fails, and this currently results in an ugly NPE which is hard to diagnose.
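| | |
| | |   A minimal sketch of the guard (illustrative names, not necessarily the exact Utils code): File.listFiles() returns null when the path is not a listable directory or the I/O fails, so check it before recursing instead of letting an opaque NullPointerException surface.
| | |
```scala
import java.io.{File, IOException}

object DeleteSketch {
  def deleteRecursively(file: File) {
    if (file.isDirectory) {
      val children = file.listFiles()
      if (children == null) {
        // listFiles() signals failure with null rather than an exception.
        throw new IOException("Failed to list files for dir: " + file)
      }
      children.foreach(deleteRecursively)
    }
    if (!file.delete()) {
      throw new IOException("Failed to delete: " + file)
    }
  }
}
```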
* | | Merge pull request #907 from stephenh/document_coalesce_shuffle (Matei Zaharia, 2013-09-09, 2, -4/+27)
|\ \ \   Add better docs for coalesce.
| * | | Use a set since shuffle could change order. (Stephen Haberman, 2013-09-09, 1, -1/+1)
| | | |
| * | | Reword 'evenly distributed' to 'distributed with a hash partitioner'. (Stephen Haberman, 2013-09-09, 1, -2/+2)
| | | |
| * | | Add better docs for coalesce. (Stephen Haberman, 2013-09-08, 2, -4/+27)
| | | |   Include the useful tip that if shuffle=true, coalesce can actually increase the number of partitions. This makes coalesce more like a generic `RDD.repartition` operation. (Ideally this `RDD.repartition` could automatically choose either a coalesce or a shuffle if numPartitions was either less than or greater than, respectively, the current number of partitions.)
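| | | |
| | | |   A minimal runnable sketch of the behaviour the new docs describe, assuming a Spark build of roughly this vintage where RDD.coalesce takes an optional shuffle flag: without a shuffle, coalesce can only keep or reduce the partition count; with shuffle = true it redistributes with a hash partitioner and can also increase it.
| | | |
```scala
import org.apache.spark.SparkContext

object CoalesceSketch extends App {
  val sc = new SparkContext("local[4]", "coalesce-sketch")
  val rdd = sc.parallelize(1 to 1000, 100)

  println(rdd.coalesce(10).partitions.length)                  // 10: partitions merged, no shuffle
  println(rdd.coalesce(400).partitions.length)                 // 100: cannot grow without a shuffle
  println(rdd.coalesce(400, shuffle = true).partitions.length) // 400: behaves like a repartition

  sc.stop()
}
```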
* | | | Add metrics-ganglia to core pom file (Y.CORP.YAHOO.COM\tgraves, 2013-09-09, 1, -0/+4)
| | | |
* | | | Merge pull request #890 from mridulm/master (Matei Zaharia, 2013-09-08, 3, -2/+17)
|\ \ \ \   Fix hash bug
| * | | | Address review comments - rename toHash to nonNegativeHash (Mridul Muralidharan, 2013-09-04, 3, -3/+3)
| | | | |
| * | | | Fix hash bug - caused failure after 35k stages, sigh (Mridul Muralidharan, 2013-09-04, 3, -2/+17)
| | | | |
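| | | | |   A minimal sketch of what a non-negative hash helper has to guard against (illustrative; the exact fix lives in the two commits above): hashCode() can be negative, and math.abs(Int.MinValue) is still negative, so both cases need handling before the value is used as an index.
| | | | |
```scala
object HashSketch {
  def nonNegativeHash(obj: AnyRef): Int = {
    if (obj == null) return 0
    val h = obj.hashCode
    // math.abs(Int.MinValue) overflows back to Int.MinValue, so map it to 0 explicitly.
    if (h == Int.MinValue) 0 else math.abs(h)
  }
}
```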
* | | | | Merge pull request #909 from mateiz/exec-id-fix (Reynold Xin, 2013-09-08, 2, -7/+7)
|\ \ \ \ \   Fix an instance where full standalone mode executor IDs were passed to StandaloneSchedulerBackend
| * | | | | Fix an instance where full standalone mode executor IDs were passed to (Matei Zaharia, 2013-09-08, 2, -7/+7)
| | |/ / /   StandaloneSchedulerBackend instead of the smaller IDs used within Spark (that lack the application name). This was reported by ClearStory in https://github.com/clearstorydata/spark/pull/9. Also fixed some messages that said slave instead of executor.
* | | | | Merge pull request #905 from mateiz/docs2 (Matei Zaharia, 2013-09-08, 8, -18/+19)
|\ \ \ \ \   Job scheduling and cluster mode docs
| * | | | | Fix unit test failure due to changed default (Matei Zaharia, 2013-09-08, 1, -1/+1)
| | | | | |
| * | | | | More fair scheduler docs and property names. (Matei Zaharia, 2013-09-08, 6, -12/+13)
| | | | | |   Also changed uses of "job" terminology to "application" when they referred to an entire Spark program, to avoid confusion.
| * | | | | Work in progress: (Matei Zaharia, 2013-09-08, 4, -5/+5)
| | | | | |   - Add job scheduling docs - Rename some fair scheduler properties - Organize intro page better - Link to Apache wiki for "contributing to Spark"
* | | | | | Merge pull request #906 from pwendell/ganglia-sink (Patrick Wendell, 2013-09-08, 9, -28/+114)
|\ \ \ \ \ \   Clean-up of Metrics Code/Docs and Add Ganglia Sink
| * | | | | Adding sc name in metrics source (Patrick Wendell, 2013-09-08, 5, -9/+14)
| | | | | |
| * | | | | Adding more docs and some code cleanup (Patrick Wendell, 2013-09-08, 3, -19/+18)
| | | | | |
| * | | | | Ganglia sink (Patrick Wendell, 2013-09-08, 1, -0/+82)
| | |/ / / | |/| | |
* | | | | Merge pull request #898 from ilikerps/660 (Matei Zaharia, 2013-09-08, 1, -3/+3)
|\ \ \ \ \   SPARK-660: Add StorageLevel support in Python
| * | | | Export StorageLevel and refactor (Aaron Davidson, 2013-09-07, 1, -3/+3)
| | | | |
| * | | | Remove reflection, hard-code StorageLevels (Aaron Davidson, 2013-09-07, 1, -11/+0)
| | | | |   The sc.StorageLevel -> StorageLevel pathway is a bit janky, but otherwise the shell would have to call a private method of SparkContext. Having StorageLevel available in sc also doesn't seem like the end of the world. There may be a better solution, though. As for creating the StorageLevel object itself, this seems to be the best way in Python 2 for creating singleton, enum-like objects: http://stackoverflow.com/questions/36932/how-can-i-represent-an-enum-in-python
| * | | | Memoize StorageLevels read from JVM (Aaron Davidson, 2013-09-06, 1, -1/+1)
| | | | |
| * | | | SPARK-660: Add StorageLevel support in Python (Aaron Davidson, 2013-09-05, 1, -0/+11)
| | | | |   It uses reflection... I am not proud of that fact, but it at least ensures compatibility (sans refactoring of the StorageLevel stuff).
* | | | | Fixed the bug that ResultTask was not properly deserializing outputId. (Reynold Xin, 2013-09-07, 1, -2/+2)
| |_|/ / |/| | |
* | | | Hot fix to resolve the compilation error caused by SPARK-821. (Reynold Xin, 2013-09-06, 1, -1/+1)
| | | |
* | | | Merge pull request #895 from ilikerps/821 (Patrick Wendell, 2013-09-05, 7, -7/+102)
|\ \ \ \   SPARK-821: Don't cache results when action run locally on driver
| * | | | Reynold's second round of comments (Aaron Davidson, 2013-09-05, 2, -17/+19)
| | | | |
| * | | | Add unit test and address comments (Aaron Davidson, 2013-09-05, 5, -6/+98)
| | | | |
| * | | | SPARK-821: Don't cache results when action run locally on driver (Aaron Davidson, 2013-09-05, 4, -4/+5)
| |/ / /   Caching the results of local actions (e.g., rdd.first()) causes the driver to store entire partitions in its own memory, which may be highly constrained. This patch simply makes the CacheManager avoid caching the result of all locally-run computations.
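| | | |
| | | |   A minimal sketch of the policy (illustrative names, not the actual CacheManager code): when the computation runs locally on the driver, hand back the iterator directly instead of materialising and storing the partition.
| | | |
```scala
import scala.collection.mutable

object LocalCachePolicySketch {
  private val cache = mutable.HashMap[String, Array[Any]]()

  def getOrCompute(key: String, runningLocally: Boolean)(compute: => Iterator[Any]): Iterator[Any] = {
    if (runningLocally) {
      // e.g. rdd.first(): skip caching so the driver never holds whole partitions in memory.
      compute
    } else {
      cache.getOrElseUpdate(key, compute.toArray).iterator
    }
  }
}
```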
* | | | Merge pull request #891 from xiajunluan/SPARK-864 (Matei Zaharia, 2013-09-05, 1, -1/+8)
|\ \ \ \   [SPARK-864] DAGScheduler Exception if we delete Worker and StandaloneExecutorBackend then add Worker
| * | | | Fix bug SPARK-864 (Andrew xia, 2013-09-05, 1, -1/+8)
| | |_|/ | |/| |