| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
| |
Author: Andrew Ash <andrew@andrewash.com>
Closes #601 from ash211/typo and squashes the following commits:
9cd43ac [Andrew Ash] Change docs references to metrics.properties, not metrics.conf
3813ff1 [Andrew Ash] Typo: mulitcast -> multicast
873bd2f [Andrew Ash] Typo: Standlone -> Standalone
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Updated Spark Streaming Programming Guide
Here is the updated version of the Spark Streaming Programming Guide. This is still a work in progress, but the major changes are in place. So feedback is most welcome.
In general, I have tried to make the guide to easier to understand even if the reader does not know much about Spark. The updated website is hosted here -
http://www.eecs.berkeley.edu/~tdas/spark_docs/streaming-programming-guide.html
The major changes are:
- Overview illustrates the usecases of Spark Streaming - various input sources and various output sources
- An example right after overview to quickly give an idea of what Spark Streaming program looks like
- Made Java API and examples a first class citizen like Scala by using tabs to show both Scala and Java examples (similar to AMPCamp tutorial's code tabs)
- Highlighted the DStream operations updateStateByKey and transform because of their powerful nature
- Updated driver node failure recovery text to highlight automatic recovery in Spark standalone mode
- Added information about linking and using the external input sources like Kafka and Flume
- In general, reorganized the sections to better show the Basic section and the more advanced sections like Tuning and Recovery.
Todos:
- Links to the docs of external Kafka, Flume, etc
- Illustrate window operation with figure as well as example.
Author: Tathagata Das <tathagata.das1565@gmail.com>
== Merge branch commits ==
commit 18ff10556570b39d672beeb0a32075215cfcc944
Author: Tathagata Das <tathagata.das1565@gmail.com>
Date: Tue Jan 28 21:49:30 2014 -0800
Fixed a lot of broken links.
commit 34a5a6008dac2e107624c7ff0db0824ee5bae45f
Author: Tathagata Das <tathagata.das1565@gmail.com>
Date: Tue Jan 28 18:02:28 2014 -0800
Updated github url to use SPARK_GITHUB_URL variable.
commit f338a60ae8069e0a382d2cb170227e5757cc0b7a
Author: Tathagata Das <tathagata.das1565@gmail.com>
Date: Mon Jan 27 22:42:42 2014 -0800
More updates based on Patrick and Harvey's comments.
commit 89a81ff25726bf6d26163e0dd938290a79582c0f
Author: Tathagata Das <tathagata.das1565@gmail.com>
Date: Mon Jan 27 13:08:34 2014 -0800
Updated docs based on Patricks PR comments.
commit d5b6196b532b5746e019b959a79ea0cc013a8fc3
Author: Tathagata Das <tathagata.das1565@gmail.com>
Date: Sun Jan 26 20:15:58 2014 -0800
Added spark.streaming.unpersist config and info on StreamingListener interface.
commit e3dcb46ab83d7071f611d9b5008ba6bc16c9f951
Author: Tathagata Das <tathagata.das1565@gmail.com>
Date: Sun Jan 26 18:41:12 2014 -0800
Fixed docs on StreamingContext.getOrCreate.
commit 6c29524639463f11eec721e4d17a9d7159f2944b
Author: Tathagata Das <tathagata.das1565@gmail.com>
Date: Thu Jan 23 18:49:39 2014 -0800
Added example and figure for window operations, and links to Kafka and Flume API docs.
commit f06b964a51bb3b21cde2ff8bdea7d9785f6ce3a9
Author: Tathagata Das <tathagata.das1565@gmail.com>
Date: Wed Jan 22 22:49:12 2014 -0800
Fixed missing endhighlight tag in the MLlib guide.
commit 036a7d46187ea3f2a0fb8349ef78f10d6c0b43a9
Merge: eab351d a1cd185
Author: Tathagata Das <tathagata.das1565@gmail.com>
Date: Wed Jan 22 22:17:42 2014 -0800
Merge remote-tracking branch 'apache/master' into docs-update
commit eab351d05c0baef1d4b549e1581310087158d78d
Author: Tathagata Das <tathagata.das1565@gmail.com>
Date: Wed Jan 22 22:17:15 2014 -0800
Update Spark Streaming Programming Guide.
|
|\
| |
| |
| |
| |
| | |
Conflicts:
core/src/test/scala/org/apache/spark/deploy/JsonProtocolSuite.scala
pom.xml
|
| |
| |
| |
| | |
Also documents the spark.deploy.spreadOut option.
|
| | |
|
| | |
|
|\|
| |
| |
| |
| |
| |
| |
| |
| | |
Conflicts:
core/src/main/scala/org/apache/spark/deploy/client/AppClient.scala
core/src/main/scala/org/apache/spark/deploy/client/TestClient.scala
core/src/main/scala/org/apache/spark/deploy/master/Master.scala
core/src/main/scala/org/apache/spark/deploy/worker/Worker.scala
core/src/main/scala/org/apache/spark/scheduler/cluster/SparkDeploySchedulerBackend.scala
|
| | |
|
| | |
|
| | |
|
| |\
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
spark-915-segregate-scripts
Conflicts:
bin/spark-shell
core/pom.xml
core/src/main/scala/org/apache/spark/SparkContext.scala
core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/CoarseMesosSchedulerBackend.scala
core/src/main/scala/org/apache/spark/ui/UIWorkloadGenerator.scala
core/src/test/scala/org/apache/spark/DriverSuite.scala
python/run-tests
sbin/compute-classpath.sh
sbin/spark-class
sbin/stop-slaves.sh
|
| | |
| | |
| | |
| | | |
Signed-off-by: shane-huang <shengsheng.huang@intel.com>
|
| | |
| | |
| | |
| | | |
Signed-off-by: shane-huang <shengsheng.huang@intel.com>
|
| | | |
|
|/ / |
|
| | |
|
| | |
|
| |
| |
| |
| | |
Updates to the documentation and changing some logError()s to logWarning()s.
|
|/
|
|
| |
Also fix a couple HTML/Markdown issues in other files.
|
|
|
|
|
| |
Also changed uses of "job" terminology to "application" when they
referred to an entire Spark program, to avoid confusion.
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
|
|
|
| |
and new Python stuff
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commit makes Spark invocation saner by using an assembly JAR to
find all of Spark's dependencies instead of adding all the JARs in
lib_managed. It also packages the examples into an assembly and uses
that as SPARK_EXAMPLES_JAR. Finally, it replaces the old "run" script
with two better-named scripts: "run-examples" for examples, and
"spark-class" for Spark internal classes (e.g. REPL, master, etc). This
is also designed to minimize the confusion people have in trying to use
"run" to run their own classes; it's not meant to do that, but now at
least if they look at it, they can modify run-examples to do a decent
job for them.
As part of this, Bagel's examples are also now properly moved to the
examples package instead of bagel.
|
| |
|
|
|
|
|
| |
Conflicts:
docs/_config.yml
|
| |
|
|
|
|
| |
Also fixed a typo in the standalone mode documentation.
|
|
|
|
|
|
|
|
|
| |
throughout the docs: SPARK_VERSION, SCALA_VERSION, and MESOS_VERSION.
To use them, e.g. use {{site.SPARK_VERSION}}.
Also removes uses of {{HOME_PATH}} which were being resolved to ""
by the templating system anyway.
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- Rework/expand the nav bar with more of the docs site
- Removing parts of docs about EC2 and Mesos that differentiate between
running 0.5 and before
- Merged subheadings from running-on-amazon-ec2.html that are still relevant
(i.e., "Using a newer version of Spark" and "Accessing Data in S3") into
ec2-scripts.html and deleted running-on-amazon-ec2.html
- Added some TODO comments to a few docs
- Updated the blurb about AMP Camp
- Renamed programming-guide to spark-programming-guide
- Fixing typos/etc. in Standalone Spark doc
|
|
|