spark - Mirror of Apache Spark

	Commit message (Collapse)	Author	Age	Files	Lines
*	Merge pull request #367 from ankurdave/graphx	Patrick Wendell	2014-01-13	1	-4/+10
\|\	GraphX: Unifying Graphs and Tables
	GraphX extends Spark's distributed fault-tolerant collections API and interactive console with a new graph API which leverages recent advances in graph systems (e.g., [GraphLab](http://graphlab.org)) to enable users to easily and interactively build, transform, and reason about graph structured data at scale. See http://amplab.github.io/graphx/.
	Thanks to @jegonzal, @rxin, @ankurdave, @dcrankshaw, @jianpingjwang, @amatsukawa, @kellrott, and @adamnovak.
	Tasks left:
	- [x] Graph-level uncache
	- [x] Uncache previous iterations in Pregel
	- [x] ~~Uncache previous iterations in GraphLab~~ (postponed to post-release)
	  - [x] Describe GC issue with GraphLab
	- [ ] Write `docs/graphx-programming-guide.md`
	  - [x] Mention future Bagel support in docs
	  - [ ] Section on caching/uncaching in docs: As with Spark, cache something that is used more than once. In an iterative algorithm, try to cache and force (i.e., materialize) something every iteration, then uncache the cached things that depended on the newly materialized RDD but that won't be referenced again.
	- [x] Undo modifications to core collections and instead copy them to org.apache.spark.graphx
	- [x] Make Graph serializable to work around capture in Spark shell
	- [x] Rename graph -> graphx in package name and subproject
	- [x] Remove standalone PageRank
	- [x] ~~Fix amplab/graphx#52 by checking `iter.hasNext`~~
\| *	Merge branch 'master' into graphx	Reynold Xin	2014-01-13	1	-1/+2
\| \|\
\| * \|	graph -> graphx	Ankur Dave	2014-01-09	1	-6/+6
\| \| \|
\| * \|	Merge remote-tracking branch 'spark-upstream/master' into HEAD	Ankur Dave	2014-01-08	2	-89/+173
\| \|\ \	Conflicts:
	    README.md
	    core/src/main/scala/org/apache/spark/util/collection/OpenHashMap.scala
	    core/src/main/scala/org/apache/spark/util/collection/OpenHashSet.scala
	    core/src/main/scala/org/apache/spark/util/collection/PrimitiveKeyOpenHashMap.scala
	    pom.xml
	    project/SparkBuild.scala
	    repl/src/main/scala/org/apache/spark/repl/SparkILoop.scala
\| * \ \	Merge branch 'master' of github.com:apache/incubator-spark	Reynold Xin	2013-11-25	2	-4/+6
\| \|\ \ \
\| * \ \ \	Merge remote-tracking branch 'spark-upstream/master'	Ankur Dave	2013-10-30	1	-7/+22
\| \|\ \ \ \	Conflicts: project/SparkBuild.scala
\| * \ \ \ \	Merge branch 'master' of https://github.com/apache/incubator-spark into indexedrdd_graphx	Joseph E. Gonzalez	2013-10-18	1	-0/+1
\| \|\ \ \ \ \
\| * \ \ \ \ \	merged with upstream changes	Joseph E. Gonzalez	2013-10-14	1	-7/+14
\| \|\ \ \ \ \ \
\| * \| \| \| \| \| \|	GraphX now builds with all merged changes.	Joseph E. Gonzalez	2013-09-17	1	-7/+9
\| \| \| \| \| \| \| \|
\| * \| \| \| \| \| \|	Merging latest changes from spark main branch	Joseph E. Gonzalez	2013-09-17	4	-74/+123
\| \|\ \ \ \ \ \ \
\| * \ \ \ \ \ \ \	Merged graphx from @rxin into master	Joseph E. Gonzalez	2013-08-06	1	-1/+5
\| \|\ \ \ \ \ \ \ \
\| \| * \ \ \ \ \ \ \	Merge branch 'master' of github.com:mesos/spark into graph	Reynold Xin	2013-06-29	2	-7/+26
\| \| \|\ \ \ \ \ \ \ \	Conflicts:
	    run
	    run2.cmd
\| \| * \ \ \ \ \ \ \ \	Merge branch 'master' of github.com:mesos/spark into graph	Reynold Xin	2013-06-01	1	-2/+3
\| \| \|\ \ \ \ \ \ \ \ \	Conflicts: run
\| \| * \ \ \ \ \ \ \ \ \	Merge branch 'master' of github.com:mesos/spark into graph	Reynold Xin	2013-05-02	3	-27/+71
\| \| \|\ \ \ \ \ \ \ \ \ \
\| \| * \| \| \| \| \| \| \| \| \| \|	Code to run bagel vs graph experiments.	Reynold Xin	2013-04-07	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \|
\| \| * \| \| \| \| \| \| \| \| \| \|	Merge branch 'master' of github.com:mesos/spark into graph	Reynold Xin	2013-04-01	1	-1/+1
\| \| \|\ \ \ \ \ \ \ \ \ \ \
\| \| * \ \ \ \ \ \ \ \ \ \ \	Merge branch 'master' into graph	Reynold Xin	2013-03-18	1	-9/+11
\| \| \|\ \ \ \ \ \ \ \ \ \ \ \	Conflicts: run2.cmd
\| \| * \ \ \ \ \ \ \ \ \ \ \ \	Merge branch 'master' into graph	Reynold Xin	2013-02-19	1	-1/+2
\| \| \|\ \ \ \ \ \ \ \ \ \ \ \ \
\| \| * \| \| \| \| \| \| \| \| \| \| \| \| \|	Maven and sbt build changes for SparkGraph.	Reynold Xin	2013-02-19	1	-1/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|
* \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Adjusted visibility of various components.	Reynold Xin	2014-01-13	1	-0/+7
\| \|_\|_\|_\|_\|_\|_\|_\|_\|_\|_\|_\|_\|_\|/ \|/\| \| \| \| \| \| \| \| \| \| \| \| \| \|
* \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Merge pull request #373 from jerryshao/kafka-upgrade	Patrick Wendell	2014-01-11	1	-9/+9
\|\ \ \ \ \ \ \ \ \ \ \ \ \ \ \	Upgrade Kafka dependecy to 0.8.0 release version
\| * \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Upgrade Kafka dependecy to 0.8.0 release version	jerryshao	2014-01-10	1	-9/+9
\| \| \|_\|_\|_\|_\|_\|_\|_\|_\|_\|_\|_\|_\|/ \| \|/\| \| \| \| \| \| \| \| \| \| \| \| \|
* \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Merge remote-tracking branch 'apache-github/master' into standalone-driver	Patrick Wendell	2014-01-08	2	-27/+67
\|\\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Conflicts:
	    core/src/test/scala/org/apache/spark/deploy/JsonProtocolSuite.scala
	    pom.xml
\| * \| \| \| \| \| \| \| \| \| \| \| \| \|	Merge pull request #313 from tdas/project-refactor	Patrick Wendell	2014-01-07	1	-26/+67
\| \|\ \ \ \ \ \ \ \ \ \ \ \ \ \	Refactored the streaming project to separate external libraries like Twitter, Kafka, Flume, etc.
	At a high level, these are the following changes.
	1. All the external code was put in `SPARK_HOME/external/` as separate SBT projects and Maven modules. Their artifact names are `spark-streaming-twitter`, `spark-streaming-kafka`, etc. Both SparkBuild.scala and pom.xml files have been updated. References to external libraries and repositories have been removed from the settings of root and streaming projects/modules.
	2. To use the external functionality (say, creating a Twitter stream), the developer has to `import org.apache.spark.streaming.twitter._`. For the Scala API, the developer has to call `TwitterUtils.createStream(streamingContext, ...)`. For the Java API, the developer has to call `TwitterUtils.createStream(javaStreamingContext, ...)`.
	3. Each external project has its own Scala and Java unit tests. Note the unit tests of each external library use classes of the streaming unit tests (`TestSuiteBase`, `LocalJavaStreamingContext`, etc.). To enable this code sharing among test classes, `dependsOn(streaming % "compile->compile,test->test")` was used in SparkBuild.scala. In streaming/pom.xml, an additional `maven-jar-plugin` was necessary to capture this dependency (see the comment inside the pom.xml for more information).
	4. Jars of the external projects have been added to the examples project but not to the assembly project.
	5. In some files, imports have been rearranged to conform to the Spark coding guidelines.
\| \| * \ \ \ \ \ \ \ \ \ \ \ \ \	Merge remote-tracking branch 'apache/master' into project-refactor	Tathagata Das	2014-01-06	1	-14/+40
\| \| \|\ \ \ \ \ \ \ \ \ \ \ \ \ \	Conflicts:
	    examples/src/main/java/org/apache/spark/streaming/examples/JavaFlumeEventCount.java
	    streaming/src/main/scala/org/apache/spark/streaming/StreamingContext.scala
	    streaming/src/main/scala/org/apache/spark/streaming/api/java/JavaStreamingContext.scala
	    streaming/src/test/java/org/apache/spark/streaming/JavaAPISuite.java
	    streaming/src/test/scala/org/apache/spark/streaming/InputStreamsSuite.scala
	    streaming/src/test/scala/org/apache/spark/streaming/TestSuiteBase.scala
\| \| * \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Added pom.xml for external projects and removed unnecessary dependencies and repositories from other poms and sbt.	Tathagata Das	2013-12-31	1	-14/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|
\| \| * \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Refactored kafka, flume, zeromq, mqtt as separate external projects, with their own self-contained scala API, java API, scala unit tests and java unit tests. Updated examples to use the external projects.	Tathagata Das	2013-12-30	1	-25/+64
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|
\| \| * \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Refactored streaming project to separate out the twitter functionality.	Tathagata Das	2013-12-26	1	-2/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|
\| * \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Merge pull request #331 from holdenk/master	Reynold Xin	2014-01-07	1	-1/+0
\| \|\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \	Add a script to download sbt if not present on the system
	As per the discussion on the dev mailing list this script will use the system sbt if present or otherwise attempt to install the sbt launcher. The fall back error message in the event it fails instructs the user to install sbt. While the URLs it fetches from aren't controlled by the spark project directly, they are stable and the current authoritative sources.
\| \| * \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Use awk to extract the version	Holden Karau	2014-01-06	1	-3/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|
\| \| * \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	CR feedback (sbt -> sbt/sbt and correct JAR path in script) :)	Holden Karau	2014-01-05	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|
\| \| * \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add a script to download sbt if not present on the system	Holden Karau	2014-01-04	1	-0/+2
\| \| \| \|/ / / / / / / / / / / / / / \| \| \|/\| \| \| \| \| \| \| \| \| \| \| \| \| \|
* \| / \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Adding unit tests and some refactoring to promote testability.	Patrick Wendell	2014-01-07	1	-0/+1
\|/ / / / / / / / / / / / / / / /
* \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Merge pull request #340 from ScrapCodes/sbt-fixes	Patrick Wendell	2014-01-06	1	-5/+3
\|\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \	Made java options to be applied during tests so that they become self explanatory.
\| * \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Made java options to be applied during tests so that they become self explanatory.	Prashant Sharma	2014-01-06	1	-5/+3
\| \|/ / / / / / / / / / / / / / /
* / / / / / / / / / / / / / / /	SPARK-1005 Ning upgrade	Prashant Sharma	2014-01-06	1	-1/+1
\|/ / / / / / / / / / / / / / /
* \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Merge remote-tracking branch 'apache-github/master' into remove-binaries	Patrick Wendell	2014-01-03	1	-7/+25
\|\ \ \ \ \ \ \ \ \ \ \ \ \ \ \	Conflicts:
	    core/src/test/scala/org/apache/spark/DriverSuite.scala
	    docs/python-programming-guide.md
\| * \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Using name yarn-alpha/yarn instead of yarn-2.0/yarn-2.2	Raymond Liu	2014-01-03	1	-8/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|
\| * \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add yarn/common/src/test dir in building script	Raymond Liu	2014-01-03	1	-0/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|
\| * \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Use unmanaged source dir to include common yarn code	Raymond Liu	2014-01-03	1	-11/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|
\| * \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Reorganize yarn related codes into sub projects to remove duplicate files.	Raymond Liu	2014-01-03	1	-8/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|
* \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Changes on top of Prashant's patch.	Patrick Wendell	2014-01-03	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Closes #316
* \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	fixed review comments	Prashant Sharma	2014-01-03	1	-5/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|
* \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Merge branch 'master' into spark-1002-remove-jars	Prashant Sharma	2014-01-03	1	-0/+1
\|\\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|
\| * \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Merge remote-tracking branch 'apache/master' into conf2	Matei Zaharia	2014-01-01	1	-1/+2
\| \|\ \ \ \ \ \ \ \ \ \ \ \ \ \ \	Conflicts: project/SparkBuild.scala
\| * \ \ \ \ \ \ \ \ \ \ \ \ \ \ \	Merge remote-tracking branch 'apache/master' into conf2	Matei Zaharia	2013-12-31	1	-1/+1
\| \|\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \	Conflicts:
	    core/src/main/scala/org/apache/spark/rdd/CheckpointRDD.scala
	    streaming/src/main/scala/org/apache/spark/streaming/Checkpoint.scala
	    streaming/src/main/scala/org/apache/spark/streaming/scheduler/JobGenerator.scala
\| * \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \	Merge remote-tracking branch 'origin/master' into conf2	Matei Zaharia	2013-12-29	1	-1/+4
\| \|\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \	Conflicts:
	    core/src/main/scala/org/apache/spark/SparkContext.scala
	    core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala
	    core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala
	    core/src/main/scala/org/apache/spark/scheduler/cluster/ClusterTaskSetManager.scala
	    core/src/main/scala/org/apache/spark/scheduler/local/LocalScheduler.scala
	    core/src/main/scala/org/apache/spark/util/MetadataCleaner.scala
	    core/src/test/scala/org/apache/spark/scheduler/TaskResultGetterSuite.scala
	    core/src/test/scala/org/apache/spark/scheduler/TaskSetManagerSuite.scala
	    new-yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala
	    streaming/src/main/scala/org/apache/spark/streaming/Checkpoint.scala
	    streaming/src/main/scala/org/apache/spark/streaming/api/java/JavaStreamingContext.scala
	    streaming/src/main/scala/org/apache/spark/streaming/scheduler/JobGenerator.scala
	    streaming/src/test/scala/org/apache/spark/streaming/BasicOperationsSuite.scala
	    streaming/src/test/scala/org/apache/spark/streaming/CheckpointSuite.scala
	    streaming/src/test/scala/org/apache/spark/streaming/InputStreamsSuite.scala
	    streaming/src/test/scala/org/apache/spark/streaming/TestSuiteBase.scala
	    streaming/src/test/scala/org/apache/spark/streaming/WindowOperationsSuite.scala
\| \| \| \|_\|/ / / / / / / / / / / / / / \| \| \|/\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|
\| * \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	spark-544, introducing SparkConf and related configuration overhaul.	Prashant Sharma	2013-12-25	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|
* \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Deleted py4j jar and added to assembly dependency	Prashant Sharma	2014-01-02	1	-0/+1
\| \|_\|_\|/ / / / / / / / / / / / / / \|/\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|
* \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Merge pull request #73 from falaki/ApproximateDistinctCount	Reynold Xin	2013-12-31	1	-1/+2
\|\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \	Approximate distinct count
	Added countApproxDistinct() to RDD and countApproxDistinctByKey() to PairRDDFunctions to approximately count distinct number of elements and distinct number of values per key, respectively. Both functions use HyperLogLog from stream-lib for counting. Both functions take a parameter that controls the trade-off between accuracy and memory consumption. Also added Scala docs and test suites for both methods.
\| \|_\|_\|/ / / / / / / / / / / / / / \|/\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|
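The accuracy/memory trade-off mentioned in the approximate distinct count description above can be illustrated with a small standalone sketch of the HyperLogLog idea. This is not the stream-lib or Spark implementation: the class name `HyperLogLogSketch`, the parameter `p`, and the use of SHA-1 as the hash function are assumptions made only for illustration.

```python
import hashlib
import math


class HyperLogLogSketch:
    """Toy HyperLogLog counter (hypothetical name, not stream-lib's API)."""

    def __init__(self, p=12):
        # p index bits give m = 2**p registers. A larger p uses more memory
        # but lowers the relative error, which is roughly 1.04 / sqrt(m).
        if not 7 <= p <= 16:
            raise ValueError("p must be between 7 and 16")
        self.p = p
        self.m = 1 << p
        self.registers = [0] * self.m
        # Bias-correction constant, valid for m >= 128.
        self.alpha = 0.7213 / (1.0 + 1.079 / self.m)

    def add(self, item):
        # Derive a 64-bit hash from the item's repr (illustrative choice).
        digest = hashlib.sha1(repr(item).encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        width = 64 - self.p
        index = h >> width                    # top p bits select a register
        rest = h & ((1 << width) - 1)         # remaining bits
        rank = width - rest.bit_length() + 1  # position of the leftmost 1-bit
        if rank > self.registers[index]:
            self.registers[index] = rank

    def count(self):
        # Harmonic-mean estimate over all registers.
        z = sum(2.0 ** -r for r in self.registers)
        estimate = self.alpha * self.m * self.m / z
        # Small-range correction: fall back to linear counting.
        if estimate <= 2.5 * self.m:
            zeros = self.registers.count(0)
            if zeros:
                estimate = self.m * math.log(self.m / zeros)
        return estimate
```

Adding the same element repeatedly never changes the registers, so duplicates do not inflate the estimate; raising `p` by one doubles the register array and shrinks the expected relative error by about a factor of sqrt(2).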