path: root/pom.xml
Commit message    Author    Age    Files    Lines
* Merge pull request #19 from aarondav/master-zk    Matei Zaharia    2013-10-10    1    -0/+11
|\
| | Standalone Scheduler fault tolerance using ZooKeeper
| |
| | This patch implements full distributed fault tolerance for standalone scheduler Masters. There is only one Master Leader at a time, which actively serves scheduling requests. If this Leader crashes, another Master will eventually be elected, reconstruct the state from the first Master, and continue serving scheduling requests.
| |
| | Leader election is performed using the ZooKeeper leader election pattern. We try to minimize the use of ZooKeeper and the assumptions about ZooKeeper's behavior, so there is a layer of retries and session monitoring on top of the ZooKeeper client.
| |
| | Master failover follows directly from the single-node Master recovery via the file system (patch d5a96fe), save that the Master state is stored in ZooKeeper instead.
| |
| | Configuration: By default, no recovery mechanism is enabled (spark.deploy.recoveryMode = NONE). Setting spark.deploy.recoveryMode to ZOOKEEPER and spark.deploy.zookeeper.url to an appropriate ZooKeeper URL enables ZooKeeper recovery mode. Setting spark.deploy.recoveryMode to FILESYSTEM and spark.deploy.recoveryDirectory to a directory accessible by the Master keeps the behavior from d5a96fe.
| |
| | Additionally, places where a Master could be specified by a spark:// URL can now take comma-delimited lists to specify backup Masters. Note that this is only used for registration of NEW Workers and application Clients. Once a Worker or Client has registered with the Master Leader, it is "in the system" and will never need to register again.
| * Standalone Scheduler fault tolerance using ZooKeeper    Aaron Davidson    2013-09-26    1    -0/+11
| |
| | This patch implements full distributed fault tolerance for standalone scheduler Masters. There is only one Master Leader at a time, which actively serves scheduling requests. If this Leader crashes, another Master will eventually be elected, reconstruct the state from the first Master, and continue serving scheduling requests.
| |
| | Leader election is performed using the ZooKeeper leader election pattern. We try to minimize the use of ZooKeeper and the assumptions about ZooKeeper's behavior, so there is a layer of retries and session monitoring on top of the ZooKeeper client.
| |
| | Master failover follows directly from the single-node Master recovery via the file system (patch 194ba4b8), save that the Master state is stored in ZooKeeper instead.
| |
| | Configuration: By default, no recovery mechanism is enabled (spark.deploy.recoveryMode = NONE). Setting spark.deploy.recoveryMode to ZOOKEEPER and spark.deploy.zookeeper.url to an appropriate ZooKeeper URL enables ZooKeeper recovery mode. Setting spark.deploy.recoveryMode to FILESYSTEM and spark.deploy.recoveryDirectory to a directory accessible by the Master keeps the behavior from 194ba4b8.
| |
| | Additionally, places where a Master could be specified by a spark:// URL can now take comma-delimited lists to specify backup Masters. Note that this is only used for registration of NEW Workers and application Clients. Once a Worker or Client has registered with the Master Leader, it is "in the system" and will never need to register again.
| |
| | Forthcoming: documentation and tests (only ad hoc testing has been performed so far). I do not intend for this commit to be merged until tests are added, but this patch should still be mostly reviewable until then.
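The configuration keys named above are plain Java system properties. As a minimal illustration of how they fit together (the property names come from the commit message; the Surefire stanza and all values are illustrative assumptions, not part of this commit), they could be fed to a test JVM from this pom.xml like so:

    <!-- Hypothetical sketch only, not part of this commit. Property names are
         taken from the commit message; the values are example placeholders. -->
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-surefire-plugin</artifactId>
      <configuration>
        <systemPropertyVariables>
          <!-- Default is NONE; ZOOKEEPER enables leader election and failover. -->
          <spark.deploy.recoveryMode>ZOOKEEPER</spark.deploy.recoveryMode>
          <spark.deploy.zookeeper.url>zk1:2181,zk2:2181,zk3:2181</spark.deploy.zookeeper.url>
        </systemPropertyVariables>
      </configuration>
    </plugin>

A client would then point at all Masters at once, e.g. spark://host1:7077,host2:7077; per the commit message, only NEW Worker and Client registrations consult the full list.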
* | Merging build changes in from 0.8    Patrick Wendell    2013-10-05    1    -3/+4
| |
* | Removed scala -optimize flag.    Reynold Xin    2013-09-26    1    -1/+0
|/
* Update build version in master    Patrick Wendell    2013-09-24    1    -1/+1
|
* Bumping Mesos version to 0.13.0    Patrick Wendell    2013-09-15    1    -1/+1
|
* Use different Hadoop version for YARN artifacts.    Patrick Wendell    2013-09-13    1    -5/+6
|
| This uses a separate Hadoop version for the YARN artifacts. This means that when people link against spark-yarn, things will resolve correctly.
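A minimal sketch of what such a split can look like in pom.xml properties; the property names mirror nearby commits ("hadoop.version", "yarn.version"), but the exact values and layout here are assumptions:

    <!-- Sketch: one property for the default Hadoop client, a second so the
         YARN artifacts can track a YARN-capable Hadoop release.
         Values are assumptions, not the exact contents of this commit. -->
    <properties>
      <hadoop.version>1.0.4</hadoop.version>
      <yarn.version>2.0.5-alpha</yarn.version>
    </properties>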
* Add git scm url for publishing    Patrick Wendell    2013-09-12    1    -0/+1
|
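Since the diff is a single added line, the change plausibly filled in the <scm> block; a hedged sketch follows (the element choice and repository URL are assumptions):

    <scm>
      <!-- Sketch: the commit adds one git URL for publishing; the exact
           element and repository address below are assumptions. -->
      <connection>scm:git:git@github.com:apache/incubator-spark.git</connection>
    </scm>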
* Add explicit jets3t dependency, which is excluded in hadoop-client    Matei Zaharia    2013-09-10    1    -0/+5
|
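A sketch of the pinned dependency (the jets3t coordinates are real Maven coordinates; the version is an assumption):

    <!-- Sketch: declare jets3t explicitly because hadoop-client excludes it,
         so S3 access keeps working. Version is an assumption. -->
    <dependency>
      <groupId>net.java.dev.jets3t</groupId>
      <artifactId>jets3t</artifactId>
      <version>0.7.1</version>
    </dependency>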
* Merge pull request #911 from pwendell/ganglia-sink    Matei Zaharia    2013-09-09    1    -0/+5
|\
| | Adding Maven dependency for Ganglia
| * Adding Maven dependency    Patrick Wendell    2013-09-09    1    -0/+5
| |
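The five added lines are consistent with a single dependency block; a sketch using the Codahale metrics Ganglia module (real coordinates, assumed version):

    <!-- Sketch: Ganglia reporter module for the metrics system.
         Version is an assumption. -->
    <dependency>
      <groupId>com.codahale.metrics</groupId>
      <artifactId>metrics-ganglia</artifactId>
      <version>3.0.0</version>
    </dependency>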
* | Fix YARN assembly generation under Maven    Jey Kottalam    2013-09-06    1    -125/+93
|/
* Add Apache parent POM    Matei Zaharia    2013-09-02    1    -0/+5
|
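Inheriting from the standard ASF parent POM takes exactly this kind of five-line block; the groupId and artifactId are the well-known ASF coordinates, while the version below is an assumption:

    <!-- Sketch: inherit ASF-wide release and distribution settings.
         Version is an assumption. -->
    <parent>
      <groupId>org.apache</groupId>
      <artifactId>apache</artifactId>
      <version>13</version>
    </parent>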
* Fix some URLs    Matei Zaharia    2013-09-01    1    -2/+2
|
* Initial work to rename package to org.apache.spark    Matei Zaharia    2013-09-01    1    -7/+7
|
* Update Maven build to create assemblies expected by new scripts    Matei Zaharia    2013-08-29    1    -13/+3
|
| This includes the following changes (see the profile sketch after this list):
| - The "assembly" package now builds in Maven by default, and creates an assembly containing both hadoop-client and Spark, unlike the old BigTop distribution assembly that skipped hadoop-client
| - There is now a bigtop-dist package to build the old BigTop assembly
| - The repl-bin package is no longer built by default since the scripts don't rely on it; instead it can be enabled with -Prepl-bin
| - Py4J is now included in the assembly/lib folder as a local Maven repo, so that the Maven package can link to it
| - run-example now adds the original Spark classpath as well, because the Maven examples assembly lists spark-core and such as provided
| - The various Maven projects add a spark-yarn dependency correctly
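As referenced in the list above, a sketch of how an opt-in module is typically wired up as a Maven profile (the profile id and module name come from the commit message; the XML itself is an assumption):

    <!-- Sketch: repl-bin is built only when -Prepl-bin is passed. -->
    <profile>
      <id>repl-bin</id>
      <modules>
        <module>repl-bin</module>
      </modules>
    </profile>

With this shape, `mvn package` skips repl-bin and `mvn -Prepl-bin package` includes it.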
* Provide more memory for tests    Matei Zaharia    2013-08-29    1    -1/+1
|
* Revert "Merge pull request #841 from rxin/json"Reynold Xin2013-08-261-0/+5
| | | | | This reverts commit 1fb1b0992838c8cdd57eec45793e67a0490f1a52, reversing changes made to c69c48947d5102c81a9425cb380d861c3903685c.
* Merge pull request #855 from jey/update-build-docs    Matei Zaharia    2013-08-22    1    -6/+6
|\
| | Update build docs
| * Use "hadoop.version" property when specifying Hadoop YARN version tooJey Kottalam2013-08-211-6/+6
| |
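A sketch of the pattern the commit describes, with YARN dependencies resolved through the shared property instead of a hard-coded version (the artifactId below is just an illustrative example):

    <!-- Sketch: reference the shared hadoop.version property.
         The artifactId is an illustrative example. -->
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-yarn-api</artifactId>
      <version>${hadoop.version}</version>
    </dependency>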
| * Downgraded default build hadoop version to 1.0.4.    Reynold Xin    2013-08-21    1    -1/+1
| |
* | Synced sbt and maven builds    Mark Hamstra    2013-08-21    1    -5/+11
|/
* Merge remote-tracking branch 'jey/hadoop-agnostic'    Matei Zaharia    2013-08-20    1    -66/+158
|\
| | Conflicts:
| |   core/src/main/scala/spark/PairRDDFunctions.scala
| * Fix Maven build with Hadoop 0.23.9    Jey Kottalam    2013-08-18    1    -11/+0
| |
| * Maven build now also works with YARN    Jey Kottalam    2013-08-16    1    -0/+128
| |
| * Maven build now works with CDH hadoop-2.0.0-mr1    Jey Kottalam    2013-08-16    1    -35/+20
| |
| * Initial changes to make Maven build agnostic of hadoop version    Jey Kottalam    2013-08-16    1    -21/+11
| |
| * Update default version of Hadoop to 1.2.1    Jey Kottalam    2013-08-15    1    -1/+1
| |
* | Use the JSON formatter from the Scala library and remove the dependency on lift-json.    Reynold Xin    2013-08-15    1    -5/+0
|/
| It makes JSON creation slightly more complicated, but removes an external dependency. The Scala library also properly escapes "/" (which lift-json doesn't).
* Update to Mesos 0.12.1    Matei Zaharia    2013-08-13    1    -1/+1
|
* Merge pull request #784 from jerryshao/dev-metrics-servlet    Patrick Wendell    2013-08-13    1    -0/+5
|\
| | Add MetricsServlet for Spark metrics system
| * Add MetricsServlet for Spark metrics system    jerryshao    2013-08-12    1    -0/+5
| |
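A servlet that renders metrics as JSON typically needs the metrics JSON module; a sketch with real Codahale coordinates and an assumed version (the actual five added lines are not shown in this log):

    <!-- Sketch: JSON serialization for metrics, as used by a metrics servlet.
         Version is an assumption. -->
    <dependency>
      <groupId>com.codahale.metrics</groupId>
      <artifactId>metrics-json</artifactId>
      <version>3.0.0</version>
    </dependency>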
* | Changed yarn.version to 2.0.5 in pom.xml    Alexander Pivovarov    2013-08-10    1    -1/+1
|/
* Update to Chill 0.3.1    Matei Zaharia    2013-08-08    1    -2/+2
|
* Merge pull request #753 from shivaram/glm-refactor    Matei Zaharia    2013-07-31    1    -0/+1
|\
| | Build changes for ML lib
| * Add bagel, mllib to SBT assembly.    Shivaram Venkataraman    2013-07-30    1    -0/+1
| |
| | Also add the jblas dependency to the mllib pom.xml.
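A sketch of the jblas dependency mentioned above (real coordinates for the jblas linear-algebra library; the version is an assumption):

    <!-- Sketch: native-backed linear algebra for mllib.
         Version is an assumption. -->
    <dependency>
      <groupId>org.jblas</groupId>
      <artifactId>jblas</artifactId>
      <version>1.2.3</version>
    </dependency>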
* | Added Snappy dependency to Maven build files.    Reynold Xin    2013-07-30    1    -0/+5
|/
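A sketch of the Snappy dependency (snappy-java is the usual JVM binding; the version is an assumption):

    <!-- Sketch: JNI bindings for Snappy block compression.
         Version is an assumption. -->
    <dependency>
      <groupId>org.xerial.snappy</groupId>
      <artifactId>snappy-java</artifactId>
      <version>1.0.5</version>
    </dependency>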
* Fix Chill version in Maven    Matei Zaharia    2013-07-25    1    -1/+1
|
* update pom.xml    ryanlecompte    2013-07-24    1    -3/+8
|
* Fix Maven build errors after previous commits    Matei Zaharia    2013-07-24    1    -8/+10
|
* Merge pull request #675 from c0s/assembly    Matei Zaharia    2013-07-24    1    -1/+20
|\
| | Building spark assembly for further consumption of the Spark project with a deployed cluster
| * Building spark assembly for further consumption of the Spark project with a deployed cluster    Konstantin Boudnik    2013-07-21    1    -1/+20
| |
* | Add Maven metrics library dependency and code changes    jerryshao    2013-07-24    1    -0/+8
| |
* | Add JavaAPICompletenessChecker.    Josh Rosen    2013-07-22    1    -0/+1
|/
| This is used to find methods in the Scala API that need to be ported to the Java API. To use it:
|
|   ./run spark.tools.JavaAPICompletenessChecker
|
| Conflicts:
|   project/SparkBuild.scala
|   run
|   run2.cmd
* Add Apache license headers and LICENSE and NOTICE files    Matei Zaharia    2013-07-16    1    -0/+17
|
* Update to latest Scala Maven plugin and allow Zinc external compiler    Matei Zaharia    2013-07-16    1    -1/+3
|
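The scala-maven-plugin exposes real options for incremental compilation through an external Zinc server; a sketch follows (the plugin version is an assumption):

    <!-- Sketch: compile incrementally and reuse a long-running Zinc server
         started separately (e.g. `zinc -start`). Plugin version is an assumption. -->
    <plugin>
      <groupId>net.alchim31.maven</groupId>
      <artifactId>scala-maven-plugin</artifactId>
      <version>3.1.5</version>
      <configuration>
        <recompileMode>incremental</recompileMode>
        <useZincServer>true</useZincServer>
      </configuration>
    </plugin>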
* pom cleanup    Mark Hamstra    2013-07-08    1    -0/+10
|
* Fix some other references to Cloudera Avro and update the Avro version    Matei Zaharia    2013-07-06    1    -4/+4
|
* Merge pull request #676 from c0s/asf-avro    Matei Zaharia    2013-07-06    1    -2/+2
|\
| | Use the standard ASF-published avro module instead of a proprietary built one
| * Use the standard ASF-published avro module instead of a proprietary built one    Konstantin Boudnik    2013-07-04    1    -2/+2
| |
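Switching to the ASF-published artifact means using the standard org.apache.avro coordinates; a sketch (the version is an assumption):

    <!-- Sketch: the standard ASF-published Avro artifact, replacing the
         vendor-built module. Version is an assumption. -->
    <dependency>
      <groupId>org.apache.avro</groupId>
      <artifactId>avro</artifactId>
      <version>1.7.4</version>
    </dependency>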