author:    Kousuke Saruta <sarutak@oss.nttdata.co.jp>  2015-06-09 12:19:01 +0100
committer: Sean Owen <sowen@cloudera.com>  2015-06-09 12:19:01 +0100
commit:    e6fb6cedf3ecbde6f01d4753d7d05d0c52827fce (patch)
tree:      277c001d4ae664d335532abfbdb8c50b30fe0965 /docs
parent:    1b499993ad185b04dd5065facb565cbe7e249521 (diff)
[STREAMING] [DOC] Remove duplicated description about WAL
I noticed there is a duplicated description about WAL:

```
To ensure zero-data loss, you have to additionally enable Write Ahead Logs in Spark Streaming. To ensure zero data loss, enable the Write Ahead Logs (introduced in Spark 1.2).
```

Let's remove the duplication. I don't file this issue in JIRA because it's minor.

Author: Kousuke Saruta <sarutak@oss.nttdata.co.jp>

Closes #6719 from sarutak/remove-multiple-description and squashes the following commits:

cc9bb21 [Kousuke Saruta] Removed duplicated description about WAL
Diffstat (limited to 'docs')
-rw-r--r--  docs/streaming-kafka-integration.md  2
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/docs/streaming-kafka-integration.md b/docs/streaming-kafka-integration.md
index d6d5605948..998c8c994e 100644
--- a/docs/streaming-kafka-integration.md
+++ b/docs/streaming-kafka-integration.md
@@ -7,7 +7,7 @@ title: Spark Streaming + Kafka Integration Guide
## Approach 1: Receiver-based Approach
This approach uses a Receiver to receive the data. The Receiver is implemented using the Kafka high-level consumer API. As with all receivers, the data received from Kafka through a Receiver is stored in Spark executors, and then jobs launched by Spark Streaming process the data.
-However, under default configuration, this approach can lose data under failures (see [receiver reliability](streaming-programming-guide.html#receiver-reliability). To ensure zero-data loss, you have to additionally enable Write Ahead Logs in Spark Streaming. To ensure zero data loss, enable the Write Ahead Logs (introduced in Spark 1.2). This synchronously saves all the received Kafka data into write ahead logs on a distributed file system (e.g HDFS), so that all the data can be recovered on failure. See [Deploying section](streaming-programming-guide.html#deploying-applications) in the streaming programming guide for more details on Write Ahead Logs.
+However, under default configuration, this approach can lose data under failures (see [receiver reliability](streaming-programming-guide.html#receiver-reliability). To ensure zero-data loss, you have to additionally enable Write Ahead Logs in Spark Streaming (introduced in Spark 1.2). This synchronously saves all the received Kafka data into write ahead logs on a distributed file system (e.g HDFS), so that all the data can be recovered on failure. See [Deploying section](streaming-programming-guide.html#deploying-applications) in the streaming programming guide for more details on Write Ahead Logs.
Next, we discuss how to use this approach in your streaming application.
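For context, here is a minimal sketch (not part of this commit) of how the Write Ahead Log mentioned in the changed paragraph is typically enabled alongside the receiver-based Kafka stream. The ZooKeeper address, consumer group, topic name, and checkpoint path are illustrative placeholders, not values from the Spark docs.

```scala
// Minimal sketch: receiver-based Kafka stream with the Write Ahead Log enabled.
// All connection strings, names, and paths below are hypothetical.
import org.apache.spark.SparkConf
import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils

object KafkaReceiverWithWAL {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("KafkaReceiverWithWAL")
      // Turn on the Write Ahead Log for all receivers (available since Spark 1.2).
      .set("spark.streaming.receiver.writeAheadLog.enable", "true")

    val ssc = new StreamingContext(conf, Seconds(10))
    // The WAL is written under the checkpoint directory, so checkpointing to a
    // fault-tolerant file system such as HDFS is required.
    ssc.checkpoint("hdfs:///tmp/streaming-checkpoint") // hypothetical path

    val stream = KafkaUtils.createStream(
      ssc,
      "zk-host:2181",            // ZooKeeper quorum (placeholder)
      "example-consumer-group",  // Kafka consumer group id (placeholder)
      Map("example-topic" -> 1), // topic -> number of receiver threads
      // With the WAL on, received data is already persisted to the file
      // system, so a serialized, unreplicated storage level suffices.
      StorageLevel.MEMORY_AND_DISK_SER)

    stream.map(_._2).count().print() // e.g. count messages per batch

    ssc.start()
    ssc.awaitTermination()
  }
}
```

This mirrors the guidance in the Deploying section linked above: once the log makes received data recoverable from the file system, the in-memory replication of the default storage level is no longer needed.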