spark - Mirror of Apache Spark

	Commit message	Author	Age	Files	Lines
*	Typo: Standlone -> Standalone	Andrew Ash	2014-02-14	1	-1/+1
\|	Author: Andrew Ash <andrew@andrewash.com>
\|	Closes #601 from ash211/typo and squashes the following commits:
\|	9cd43ac [Andrew Ash] Change docs references to metrics.properties, not metrics.conf
\|	3813ff1 [Andrew Ash] Typo: mulitcast -> multicast
\|	873bd2f [Andrew Ash] Typo: Standlone -> Standalone
*	Merge pull request #497 from tdas/docs-update	Tathagata Das	2014-01-28	1	-2/+2
\|	Updated Spark Streaming Programming Guide
\|	Here is the updated version of the Spark Streaming Programming Guide. This is still a work in progress, but the major changes are in place. So feedback is most welcome. In general, I have tried to make the guide easier to understand even if the reader does not know much about Spark. The updated website is hosted here - http://www.eecs.berkeley.edu/~tdas/spark_docs/streaming-programming-guide.html
\|	The major changes are:
\|	- Overview illustrates the use cases of Spark Streaming - various input sources and various output sources
\|	- An example right after overview to quickly give an idea of what a Spark Streaming program looks like
\|	- Made Java API and examples a first class citizen like Scala by using tabs to show both Scala and Java examples (similar to AMPCamp tutorial's code tabs)
\|	- Highlighted the DStream operations updateStateByKey and transform because of their powerful nature
\|	- Updated driver node failure recovery text to highlight automatic recovery in Spark standalone mode
\|	- Added information about linking and using the external input sources like Kafka and Flume
\|	- In general, reorganized the sections to better show the Basic section and the more advanced sections like Tuning and Recovery.
\|	Todos:
\|	- Links to the docs of external Kafka, Flume, etc
\|	- Illustrate window operation with figure as well as example.
\|	Author: Tathagata Das <tathagata.das1565@gmail.com>
\|	== Merge branch commits ==
\|	commit 18ff10556570b39d672beeb0a32075215cfcc944 Author: Tathagata Das <tathagata.das1565@gmail.com> Date: Tue Jan 28 21:49:30 2014 -0800 Fixed a lot of broken links.
\|	commit 34a5a6008dac2e107624c7ff0db0824ee5bae45f Author: Tathagata Das <tathagata.das1565@gmail.com> Date: Tue Jan 28 18:02:28 2014 -0800 Updated github url to use SPARK_GITHUB_URL variable.
\|	commit f338a60ae8069e0a382d2cb170227e5757cc0b7a Author: Tathagata Das <tathagata.das1565@gmail.com> Date: Mon Jan 27 22:42:42 2014 -0800 More updates based on Patrick and Harvey's comments.
\|	commit 89a81ff25726bf6d26163e0dd938290a79582c0f Author: Tathagata Das <tathagata.das1565@gmail.com> Date: Mon Jan 27 13:08:34 2014 -0800 Updated docs based on Patrick's PR comments.
\|	commit d5b6196b532b5746e019b959a79ea0cc013a8fc3 Author: Tathagata Das <tathagata.das1565@gmail.com> Date: Sun Jan 26 20:15:58 2014 -0800 Added spark.streaming.unpersist config and info on StreamingListener interface.
\|	commit e3dcb46ab83d7071f611d9b5008ba6bc16c9f951 Author: Tathagata Das <tathagata.das1565@gmail.com> Date: Sun Jan 26 18:41:12 2014 -0800 Fixed docs on StreamingContext.getOrCreate.
\|	commit 6c29524639463f11eec721e4d17a9d7159f2944b Author: Tathagata Das <tathagata.das1565@gmail.com> Date: Thu Jan 23 18:49:39 2014 -0800 Added example and figure for window operations, and links to Kafka and Flume API docs.
\|	commit f06b964a51bb3b21cde2ff8bdea7d9785f6ce3a9 Author: Tathagata Das <tathagata.das1565@gmail.com> Date: Wed Jan 22 22:49:12 2014 -0800 Fixed missing endhighlight tag in the MLlib guide.
\|	commit 036a7d46187ea3f2a0fb8349ef78f10d6c0b43a9 Merge: eab351d a1cd185 Author: Tathagata Das <tathagata.das1565@gmail.com> Date: Wed Jan 22 22:17:42 2014 -0800 Merge remote-tracking branch 'apache/master' into docs-update
\|	commit eab351d05c0baef1d4b549e1581310087158d78d Author: Tathagata Das <tathagata.das1565@gmail.com> Date: Wed Jan 22 22:17:15 2014 -0800 Update Spark Streaming Programming Guide.
*	Merge remote-tracking branch 'apache-github/master' into standalone-driver	Patrick Wendell	2014-01-08	1	-0/+10
\|\	Conflicts:
\| \|	core/src/test/scala/org/apache/spark/deploy/JsonProtocolSuite.scala
\| \|	pom.xml
\| *	Add way to limit default # of cores used by applications on standalone mode	Matei Zaharia	2014-01-07	1	-0/+10
\| \| \| \| \| \| \| \|	Also documents the spark.deploy.spreadOut option.
* \|	Fixes	Patrick Wendell	2014-01-08	1	-2/+3
\| \|
* \|	Some doc fixes	Patrick Wendell	2014-01-06	1	-3/+2
\| \|
* \|	Merge remote-tracking branch 'apache-github/master' into standalone-driver	Patrick Wendell	2014-01-06	1	-14/+21
\|\	Conflicts:
\| \|	core/src/main/scala/org/apache/spark/deploy/client/AppClient.scala
\| \|	core/src/main/scala/org/apache/spark/deploy/client/TestClient.scala
\| \|	core/src/main/scala/org/apache/spark/deploy/master/Master.scala
\| \|	core/src/main/scala/org/apache/spark/deploy/worker/Worker.scala
\| \|	core/src/main/scala/org/apache/spark/scheduler/cluster/SparkDeploySchedulerBackend.scala
\| *	sbin/spark-class* -> bin/spark-class*	Prashant Sharma	2014-01-03	1	-1/+1
\| \|
\| *	a few left over document change	Prashant Sharma	2014-01-02	1	-1/+1
\| \|
\| *	spark-shell -> bin/spark-shell	Prashant Sharma	2014-01-02	1	-2/+2
\| \|
\| *	Merge branch 'scripts-reorg' of github.com:shane-huang/incubator-spark into spark-915-segregate-scripts	Prashant Sharma	2014-01-02	1	-7/+7
\| \|\	Conflicts:
\| \| \|	bin/spark-shell
\| \| \|	core/pom.xml
\| \| \|	core/src/main/scala/org/apache/spark/SparkContext.scala
\| \| \|	core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/CoarseMesosSchedulerBackend.scala
\| \| \|	core/src/main/scala/org/apache/spark/ui/UIWorkloadGenerator.scala
\| \| \|	core/src/test/scala/org/apache/spark/DriverSuite.scala
\| \| \|	python/run-tests
\| \| \|	sbin/compute-classpath.sh
\| \| \|	sbin/spark-class
\| \| \|	sbin/stop-slaves.sh
\| \| *	add admin scripts to sbin	shane-huang	2013-09-23	1	-6/+6
\| \| \| \| \| \| \| \| \| \| \| \|	Signed-off-by: shane-huang <shengsheng.huang@intel.com>
\| \| *	added spark-class and spark-executor to sbin	shane-huang	2013-09-23	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \|	Signed-off-by: shane-huang <shengsheng.huang@intel.com>
\| * \|	Updated docs for SparkConf and handled review comments	Matei Zaharia	2013-12-30	1	-4/+11
\| \| \|
* \| \|	Documentation and adding supervise option	Patrick Wendell	2013-12-29	1	-5/+33
\|/ /
* \|	Typo: applicaton	Andrew Ash	2013-12-04	1	-2/+2
\| \|
* \|	Minor clarification and cleanup to spark-standalone.md	Aaron Davidson	2013-10-10	1	-10/+33
\| \|
* \|	Address Matei's comments on documentation	Aaron Davidson	2013-10-10	1	-14/+21
\| \| \| \| \| \| \| \|	Updates to the documentation and changing some logError()s to logWarning()s.
* \|	Add docs for standalone scheduler fault tolerance	Aaron Davidson	2013-10-08	1	-0/+45
\|/ \| \| \|	Also fix a couple HTML/Markdown issues in other files.
*	More fair scheduler docs and property names.	Matei Zaharia	2013-09-08	1	-12/+14
\| \| \| \| \|	Also changed uses of "job" terminology to "application" when they referred to an entire Spark program, to avoid confusion.
*	CR feedback from Matei	Evan Chan	2013-09-07	1	-4/+1
\|
*	Add references to make-distribution.sh	Evan Chan	2013-09-06	1	-0/+11
\|
*	"launch" scripts is more accurate terminology	Evan Chan	2013-09-06	1	-2/+2
\|
*	Easier way to start the master	Evan Chan	2013-09-06	1	-1/+1
\|
*	Add notes about starting spark-shell	Evan Chan	2013-09-06	1	-1/+5
\|
*	Run script fixes for Windows after package & assembly change	Matei Zaharia	2013-09-01	1	-2/+2
\|
*	More updates, describing changes to recommended use of environment vars	Matei Zaharia	2013-08-31	1	-27/+25
\| \| \| \|	and new Python stuff
*	Change build and run instructions to use assemblies	Matei Zaharia	2013-08-29	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This commit makes Spark invocation saner by using an assembly JAR to find all of Spark's dependencies instead of adding all the JARs in lib_managed. It also packages the examples into an assembly and uses that as SPARK_EXAMPLES_JAR. Finally, it replaces the old "run" script with two better-named scripts: "run-examples" for examples, and "spark-class" for Spark internal classes (e.g. REPL, master, etc). This is also designed to minimize the confusion people have in trying to use "run" to run their own classes; it's not meant to do that, but now at least if they look at it, they can modify run-examples to do a decent job for them. As part of this, Bagel's examples are also now properly moved to the examples package instead of bagel.
*	Update spark-standalone.md	Andrew Ash	2013-08-07	1	-1/+1
\|
*	Use a separate memory setting for standalone cluster daemons	Matei Zaharia	2013-02-10	1	-0/+8
\| \| \| \| \|	Conflicts: docs/_config.yml
*	Clarify the documentation on env variables for standalone mode	Matei Zaharia	2013-01-21	1	-22/+21
\|
*	Use spark-env.sh to configure standalone master. See SPARK-638.	Josh Rosen	2012-12-14	1	-1/+1
\| \| \| \|	Also fixed a typo in the standalone mode documentation.
*	Adds liquid variables to docs templating system so that they can be used	Andy Konwinski	2012-10-08	1	-2/+2
\| \| \| \| \| \| \| \| \|	throughout the docs: SPARK_VERSION, SCALA_VERSION, and MESOS_VERSION. To use them, e.g. use {{site.SPARK_VERSION}}. Also removes uses of {{HOME_PATH}} which were being resolved to "" by the templating system anyway.
*	Updates to standalone cluster, web UI and deploy docs.	Matei Zaharia	2012-09-26	1	-64/+130
\|
*	More updates to documentation	Matei Zaharia	2012-09-25	1	-2/+2
\|
*	- Add docs/api to .gitignore	Andy Konwinski	2012-09-16	1	-7/+13
\|	- Rework/expand the nav bar with more of the docs site
\|	- Removing parts of docs about EC2 and Mesos that differentiate between running 0.5 and before
\|	- Merged subheadings from running-on-amazon-ec2.html that are still relevant (i.e., "Using a newer version of Spark" and "Accessing Data in S3") into ec2-scripts.html and deleted running-on-amazon-ec2.html
\|	- Added some TODO comments to a few docs
\|	- Updated the blurb about AMP Camp
\|	- Renamed programming-guide to spark-programming-guide
\|	- Fixing typos/etc. in Standalone Spark doc
*	Added standalone and YARN docs. Merged standalone cluster into standalone doc	Denny	2012-09-13	1	-0/+80