spark - Mirror of Apache Spark

	Commit message (Collapse)	Author	Age	Files	Lines
*	Merge pull request #97 from ewencp/pyspark-system-properties	Matei Zaharia	2013-10-22	1	-0/+11
\|\	Add classmethod to SparkContext to set system properties. Add a new classmethod to SparkContext to set system properties, as is possible in Scala/Java. Unlike the Java/Scala implementations, there's no access to System until the JVM bridge is created. Since SparkContext handles that, move the initialization of the JVM connection to a separate classmethod that can safely be called repeatedly as long as the same instance (or no instance) is provided.
\| *	Add notes to python documentation about using SparkContext.setSystemProperty.	Ewen Cheslack-Postava	2013-10-22	1	-0/+11
\| \|
* \|	Docs: Fix links to RDD API documentation	Aaron Davidson	2013-10-22	1	-3/+3
\|/
*	Merge pull request #76 from pwendell/master	Reynold Xin	2013-10-18	1	-1/+1
\|\	Clarify compression property. Clarifies that this governs compression of internal data, not input data or output data.
\| *	Clarify compression property.	Patrick Wendell	2013-10-18	1	-1/+1
\| \|	Clarifies that this governs compression of internal data, not input data or output data.
* \|	Code styling. Updated doc.	Mosharaf Chowdhury	2013-10-17	1	-0/+8
\|/
*	Merge remote-tracking branch 'tgravescs/sparkYarnDistCache'	Matei Zaharia	2013-10-10	1	-1/+8
\|\	Closes #11. Conflicts: docs/running-on-yarn.md, yarn/src/main/scala/org/apache/spark/deploy/yarn/ClientArguments.scala
\| *	Adding in the --addJars option to make SparkContext.addJar work on yarn and cleanup the classpaths	tgravescs	2013-10-03	1	-0/+2
\| \|
\| *	Support distributed cache files and archives on spark on yarn and attempt to cleanup the staging directory on exit	Y.CORP.YAHOO.COM\tgraves	2013-09-23	1	-1/+6
\| \|
* \|	Merge pull request #19 from aarondav/master-zk	Matei Zaharia	2013-10-10	3	-4/+78
\|\ \	Standalone Scheduler fault tolerance using ZooKeeper
\| \| \|	This patch implements full distributed fault tolerance for standalone scheduler Masters. There is only one master Leader at a time, which is actively serving scheduling requests. If this Leader crashes, another master will eventually be elected, reconstruct the state from the first Master, and continue serving scheduling requests.
\| \| \|	Leader election is performed using the ZooKeeper leader election pattern. We try to minimize the use of ZooKeeper and the assumptions about ZooKeeper's behavior, so there is a layer of retries and session monitoring on top of the ZooKeeper client.
\| \| \|	Master failover follows directly from the single-node Master recovery via the file system (patch d5a96fe), save that the Master state is stored in ZooKeeper instead.
\| \| \|	Configuration: By default, no recovery mechanism is enabled (spark.deploy.recoveryMode = NONE). By setting spark.deploy.recoveryMode to ZOOKEEPER and setting spark.deploy.zookeeper.url to an appropriate ZooKeeper URL, ZooKeeper recovery mode is enabled. By setting spark.deploy.recoveryMode to FILESYSTEM and setting spark.deploy.recoveryDirectory to an appropriate directory accessible by the Master, we will keep the behavior from d5a96fe.
\| \| \|	Additionally, places where a Master could be specified by a spark:// url can now take comma-delimited lists to specify backup masters. Note that this is only used for registration of NEW Workers and application Clients. Once a Worker or Client has registered with the Master Leader, it is "in the system" and will never need to register again.
\| * \|	Minor clarification and cleanup to spark-standalone.md	Aaron Davidson	2013-10-10	1	-10/+33
\| \| \|
\| * \|	Address Matei's comments on documentation	Aaron Davidson	2013-10-10	1	-14/+21
\| \| \|	Updates to the documentation and changing some logError()s to logWarning()s.
\| * \|	Add docs for standalone scheduler fault tolerance	Aaron Davidson	2013-10-08	3	-4/+48
\| \| \|	Also fix a couple HTML/Markdown issues in other files.
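The configuration described in the message for pull request #19 above can be sketched in Python. This is a minimal illustration, not code from the patch: the helper names `recovery_java_opts` and `split_master_url` are hypothetical, passing the properties as `-D` JVM flags is an assumption, and the `spark://host1:port1,host2:port2` shape of the comma-delimited list is also assumed. Only the property names and the NONE/ZOOKEEPER/FILESYSTEM values come from the commit message.

```python
# Hypothetical helpers for illustration; they are not part of Spark.
VALID_MODES = {"NONE", "ZOOKEEPER", "FILESYSTEM"}


def recovery_java_opts(mode="NONE", zookeeper_url=None, recovery_dir=None):
    """Build -D flags for the recovery properties named in the commit message.

    Assumption: the properties are handed to the Master as JVM system
    properties. The property names themselves come from the message.
    """
    if mode not in VALID_MODES:
        raise ValueError("unknown recovery mode: %r" % (mode,))
    opts = ["-Dspark.deploy.recoveryMode=%s" % mode]
    if mode == "ZOOKEEPER":
        if not zookeeper_url:
            raise ValueError("ZOOKEEPER mode needs spark.deploy.zookeeper.url")
        opts.append("-Dspark.deploy.zookeeper.url=%s" % zookeeper_url)
    elif mode == "FILESYSTEM":
        if not recovery_dir:
            raise ValueError("FILESYSTEM mode needs spark.deploy.recoveryDirectory")
        opts.append("-Dspark.deploy.recoveryDirectory=%s" % recovery_dir)
    return " ".join(opts)


def split_master_url(url):
    """Split an assumed 'spark://h1:p1,h2:p2' list into one URL per master."""
    prefix = "spark://"
    if not url.startswith(prefix):
        raise ValueError("expected a spark:// url, got %r" % (url,))
    hosts = [part.strip() for part in url[len(prefix):].split(",")]
    return [prefix + host for host in hosts if host]


if __name__ == "__main__":
    print(recovery_java_opts("ZOOKEEPER", zookeeper_url="zk1:2181,zk2:2181"))
    print(split_master_url("spark://host1:7077,host2:7077"))
```

The host names and ports in the usage lines are placeholders. Per the commit message, the list of masters matters only when a new Worker or application Client registers.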
* \| \|	Fix PySpark docs and an overly long line of code after fdbae41e	Matei Zaharia	2013-10-09	1	-1/+1
\| \| \|
* \| \|	Merge branch 'master' into implicit-als	Nick Pentreath	2013-10-07	1	-2/+2
\|\ \ \
\| * \| \|	Merging build changes in from 0.8	Patrick Wendell	2013-10-05	1	-2/+2
\| \| \| \|
* \| \| \|	Adding implicit feedback ALS to MLlib user guide	Nick Pentreath	2013-10-04	1	-4/+20
\|/ / /
* / /	Allow users to set the application name for Spark on Yarn	tgravescs	2013-10-02	1	-0/+1
\|/ /
* /	Update build version in master	Patrick Wendell	2013-09-24	1	-2/+2
\|/
*	Fix typo in Maven build docs	Jey Kottalam	2013-09-15	1	-2/+2
\|
*	Merge pull request #932 from pwendell/mesos-version	Patrick Wendell	2013-09-15	1	-1/+1
\|\	Bumping Mesos version to 0.13.0
\| *	Bumping Mesos version to 0.13.0	Patrick Wendell	2013-09-15	1	-1/+1
\| \|
* \|	Explain yarn.version in Maven build docs	Patrick Wendell	2013-09-15	1	-3/+3
\|/
*	More updates to Spark on Mesos documentation.	Benjamin Hindman	2013-09-11	1	-2/+2
\|
*	Updated Spark on Mesos documentation.	Benjamin Hindman	2013-09-11	1	-17/+16
\|
*	Change port from 3030 to 4040	Patrick Wendell	2013-09-11	4	-8/+8
\|
*	Update Python API features	Matei Zaharia	2013-09-10	1	-1/+1
\|
*	Document fortran dependency for MLBase	Patrick Wendell	2013-09-09	1	-0/+7
\|
*	Small tweaks to MLlib docs	Matei Zaharia	2013-09-08	1	-10/+8
\|
*	Merge pull request #905 from mateiz/docs2	Matei Zaharia	2013-09-08	17	-138/+472
\|\	Job scheduling and cluster mode docs
\| *	Fix some review comments	Matei Zaharia	2013-09-08	2	-2/+2
\| \|
\| *	Updated cluster diagram to show caches	Matei Zaharia	2013-09-08	2	-0/+0
\| \|
\| *	Review comments	Matei Zaharia	2013-09-08	2	-1/+48
\| \|
\| *	Some tweaks to CDH/HDP doc	Matei Zaharia	2013-09-08	1	-10/+52
\| \|
\| *	Added cluster overview doc, made logo higher-resolution, and added more	Matei Zaharia	2013-09-08	7	-15/+88
\| \|	details on monitoring
\| *	More fair scheduler docs and property names.	Matei Zaharia	2013-09-08	7	-76/+164
\| \|	Also changed uses of "job" terminology to "application" when they referred to an entire Spark program, to avoid confusion.
\| *	Work in progress:	Matei Zaharia	2013-09-08	6	-44/+128
\| \|	- Add job scheduling docs
\| \|	- Rename some fair scheduler properties
\| \|	- Organize intro page better
\| \|	- Link to Apache wiki for "contributing to Spark"
* \|	response to PR comments	Ameet Talwalkar	2013-09-08	1	-25/+30
\| \|
* \|	Merge remote-tracking branch 'upstream/master'	Ameet Talwalkar	2013-09-08	6	-13/+179
\|\ \
\| * \	Merge pull request #906 from pwendell/ganglia-sink	Patrick Wendell	2013-09-08	1	-0/+9
\| \|\ \ \| \| \|/ \| \|/\|	Clean-up of Metrics Code/Docs and Add Ganglia Sink
\| \| *	Adding more docs and some code cleanup	Patrick Wendell	2013-09-08	1	-0/+9
\| \| \|
\| * \|	Merge pull request #900 from pwendell/cdh-docs	Matei Zaharia	2013-09-08	2	-0/+77
\| \|\ \	Provide docs to describe running on CDH/HDP cluster.
\| \| * \|	File rename	Patrick Wendell	2013-09-07	2	-4/+2
\| \| \| \|
\| \| * \|	Changes based on feedback	Patrick Wendell	2013-09-07	1	-12/+24
\| \| \| \|
\| \| * \|	Provide docs to describe running on CDH/HDP cluster.	Patrick Wendell	2013-09-06	2	-0/+67
\| \| \| \|	This doc consolidates information relevant to CDH/HDP users in a single place.
\| * \| \|	Merge pull request #901 from ooyala/2013-09/0.8-doc-changes	Matei Zaharia	2013-09-07	2	-4/+21
\| \|\ \ \	0.8 Doc changes for make-distribution.sh
\| \| * \| \|	CR feedback from Matei	Evan Chan	2013-09-07	2	-7/+1
\| \| \| \| \|
\| \| * \| \|	Add references to make-distribution.sh	Evan Chan	2013-09-06	2	-0/+19
\| \| \| \| \|
\| \| * \| \|	"launch" scripts is more accurate terminology	Evan Chan	2013-09-06	1	-2/+2
\| \| \| \| \|
\| \| * \| \|	Easier way to start the master	Evan Chan	2013-09-06	1	-1/+1
\| \| \| \| \|