diff options
author | Tathagata Das <tathagata.das1565@gmail.com> | 2014-01-28 21:51:05 -0800 |
---|---|---|
committer | Patrick Wendell <pwendell@gmail.com> | 2014-01-28 21:51:05 -0800 |
commit | 793020961489e16e924c4531da3a13884d2b9175 (patch) | |
tree | 54da8edfdcf82072fbd9fb238e653da5b7849448 /docs/img/streaming-dstream-ops.png | |
parent | f8c742ce274fbae2a9e616d4c97469b6a22069bb (diff) | |
download | spark-793020961489e16e924c4531da3a13884d2b9175.tar.gz spark-793020961489e16e924c4531da3a13884d2b9175.tar.bz2 spark-793020961489e16e924c4531da3a13884d2b9175.zip |
Merge pull request #497 from tdas/docs-update
Updated Spark Streaming Programming Guide
Here is the updated version of the Spark Streaming Programming Guide. This is still a work in progress, but the major changes are in place. So feedback is most welcome.
In general, I have tried to make the guide to easier to understand even if the reader does not know much about Spark. The updated website is hosted here -
http://www.eecs.berkeley.edu/~tdas/spark_docs/streaming-programming-guide.html
The major changes are:
- Overview illustrates the usecases of Spark Streaming - various input sources and various output sources
- An example right after overview to quickly give an idea of what Spark Streaming program looks like
- Made Java API and examples a first class citizen like Scala by using tabs to show both Scala and Java examples (similar to AMPCamp tutorial's code tabs)
- Highlighted the DStream operations updateStateByKey and transform because of their powerful nature
- Updated driver node failure recovery text to highlight automatic recovery in Spark standalone mode
- Added information about linking and using the external input sources like Kafka and Flume
- In general, reorganized the sections to better show the Basic section and the more advanced sections like Tuning and Recovery.
Todos:
- Links to the docs of external Kafka, Flume, etc
- Illustrate window operation with figure as well as example.
Author: Tathagata Das <tathagata.das1565@gmail.com>
== Merge branch commits ==
commit 18ff10556570b39d672beeb0a32075215cfcc944
Author: Tathagata Das <tathagata.das1565@gmail.com>
Date: Tue Jan 28 21:49:30 2014 -0800
Fixed a lot of broken links.
commit 34a5a6008dac2e107624c7ff0db0824ee5bae45f
Author: Tathagata Das <tathagata.das1565@gmail.com>
Date: Tue Jan 28 18:02:28 2014 -0800
Updated github url to use SPARK_GITHUB_URL variable.
commit f338a60ae8069e0a382d2cb170227e5757cc0b7a
Author: Tathagata Das <tathagata.das1565@gmail.com>
Date: Mon Jan 27 22:42:42 2014 -0800
More updates based on Patrick and Harvey's comments.
commit 89a81ff25726bf6d26163e0dd938290a79582c0f
Author: Tathagata Das <tathagata.das1565@gmail.com>
Date: Mon Jan 27 13:08:34 2014 -0800
Updated docs based on Patricks PR comments.
commit d5b6196b532b5746e019b959a79ea0cc013a8fc3
Author: Tathagata Das <tathagata.das1565@gmail.com>
Date: Sun Jan 26 20:15:58 2014 -0800
Added spark.streaming.unpersist config and info on StreamingListener interface.
commit e3dcb46ab83d7071f611d9b5008ba6bc16c9f951
Author: Tathagata Das <tathagata.das1565@gmail.com>
Date: Sun Jan 26 18:41:12 2014 -0800
Fixed docs on StreamingContext.getOrCreate.
commit 6c29524639463f11eec721e4d17a9d7159f2944b
Author: Tathagata Das <tathagata.das1565@gmail.com>
Date: Thu Jan 23 18:49:39 2014 -0800
Added example and figure for window operations, and links to Kafka and Flume API docs.
commit f06b964a51bb3b21cde2ff8bdea7d9785f6ce3a9
Author: Tathagata Das <tathagata.das1565@gmail.com>
Date: Wed Jan 22 22:49:12 2014 -0800
Fixed missing endhighlight tag in the MLlib guide.
commit 036a7d46187ea3f2a0fb8349ef78f10d6c0b43a9
Merge: eab351d a1cd185
Author: Tathagata Das <tathagata.das1565@gmail.com>
Date: Wed Jan 22 22:17:42 2014 -0800
Merge remote-tracking branch 'apache/master' into docs-update
commit eab351d05c0baef1d4b549e1581310087158d78d
Author: Tathagata Das <tathagata.das1565@gmail.com>
Date: Wed Jan 22 22:17:15 2014 -0800
Update Spark Streaming Programming Guide.
Diffstat (limited to 'docs/img/streaming-dstream-ops.png')
-rw-r--r-- | docs/img/streaming-dstream-ops.png | bin | 0 -> 48429 bytes |
1 files changed, 0 insertions, 0 deletions
diff --git a/docs/img/streaming-dstream-ops.png b/docs/img/streaming-dstream-ops.png Binary files differnew file mode 100644 index 0000000000..a1c5634aa3 --- /dev/null +++ b/docs/img/streaming-dstream-ops.png |