author    uncleGen <hustyugm@gmail.com>  2017-01-24 11:32:11 +0000
committer Sean Owen <sowen@cloudera.com>  2017-01-24 11:32:11 +0000
commit    7c61c2a1c40629311b84dff8d91b257efb345d07 (patch)
tree      01c01629df495d870228e79496b74b55ede520b7 /docs/streaming-kafka-0-10-integration.md
parent    f27e024768e328b96704a9ef35b77381da480328 (diff)
[DOCS] Fix typo in docs
## What changes were proposed in this pull request?

Fix typo in docs

## How was this patch tested?

Author: uncleGen <hustyugm@gmail.com>

Closes #16658 from uncleGen/typo-issue.
Diffstat (limited to 'docs/streaming-kafka-0-10-integration.md')
-rw-r--r--  docs/streaming-kafka-0-10-integration.md | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/docs/streaming-kafka-0-10-integration.md b/docs/streaming-kafka-0-10-integration.md
index b645d3c3a4..6ef54ac210 100644
--- a/docs/streaming-kafka-0-10-integration.md
+++ b/docs/streaming-kafka-0-10-integration.md
@@ -183,7 +183,7 @@ stream.foreachRDD(new VoidFunction<JavaRDD<ConsumerRecord<String, String>>>() {
Note that the typecast to `HasOffsetRanges` will only succeed if it is done in the first method called on the result of `createDirectStream`, not later down a chain of methods. Be aware that the one-to-one mapping between RDD partition and Kafka partition does not remain after any methods that shuffle or repartition, e.g. reduceByKey() or window().
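For concreteness, here is a minimal Scala sketch of that constraint, assuming `stream` is the DStream returned by `KafkaUtils.createDirectStream`: the cast to `HasOffsetRanges` is done on the RDD passed directly to `foreachRDD`, before any transformation that could shuffle it.

```scala
import org.apache.spark.TaskContext
import org.apache.spark.streaming.kafka010.{HasOffsetRanges, OffsetRange}

stream.foreachRDD { rdd =>
  // The cast succeeds here because rdd is still the original Kafka RDD.
  val offsetRanges: Array[OffsetRange] = rdd.asInstanceOf[HasOffsetRanges].offsetRanges
  rdd.foreachPartition { _ =>
    // One-to-one mapping: the partition id indexes into the offset ranges.
    val o = offsetRanges(TaskContext.get.partitionId)
    println(s"${o.topic} ${o.partition} offsets ${o.fromOffset} to ${o.untilOffset}")
  }
}
```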
### Storing Offsets
-Kafka delivery semantics in the case of failure depend on how and when offsets are stored. Spark output operations are [at-least-once](streaming-programming-guide.html#semantics-of-output-operations). So if you want the equivalent of exactly-once semantics, you must either store offsets after an idempotent output, or store offsets in an atomic transaction alongside output. With this integration, you have 3 options, in order of increasing reliablity (and code complexity), for how to store offsets.
+Kafka delivery semantics in the case of failure depend on how and when offsets are stored. Spark output operations are [at-least-once](streaming-programming-guide.html#semantics-of-output-operations). So if you want the equivalent of exactly-once semantics, you must either store offsets after an idempotent output, or store offsets in an atomic transaction alongside output. With this integration, you have 3 options, in order of increasing reliability (and code complexity), for how to store offsets.
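As one illustration of "store offsets after an idempotent output", the following sketch commits the processed ranges back to Kafka with `commitAsync` once the output has completed; `stream` is again assumed to come from `createDirectStream`, and because the commit is not atomic with the output, the output itself must remain idempotent.

```scala
import org.apache.spark.streaming.kafka010.{CanCommitOffsets, HasOffsetRanges}

stream.foreachRDD { rdd =>
  val offsetRanges = rdd.asInstanceOf[HasOffsetRanges].offsetRanges
  // ... write the results to an idempotent sink here ...
  // Commit only after the output succeeds, so a failure replays the batch.
  stream.asInstanceOf[CanCommitOffsets].commitAsync(offsetRanges)
}
```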
#### Checkpoints
If you enable Spark [checkpointing](streaming-programming-guide.html#checkpointing), offsets will be stored in the checkpoint. This is easy to enable, but there are drawbacks. Your output operation must be idempotent, since you will get repeated outputs; transactions are not an option. Furthermore, you cannot recover from a checkpoint if your application code has changed. For planned upgrades, you can mitigate this by running the new code at the same time as the old code (since outputs need to be idempotent anyway, they should not clash). But for unplanned failures that require code changes, you will lose data unless you have another way to identify known good starting offsets.
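A minimal sketch of the checkpoint option, assuming a hypothetical checkpoint directory and app name; `StreamingContext.getOrCreate` recovers the stored offsets on restart, subject to the code-change caveat above.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

val checkpointDir = "hdfs:///tmp/spark-checkpoint" // hypothetical path

def createContext(): StreamingContext = {
  val conf = new SparkConf().setAppName("KafkaOffsetsViaCheckpoint")
  val ssc = new StreamingContext(conf, Seconds(5))
  ssc.checkpoint(checkpointDir) // offsets are written here each batch
  // ... build the Kafka direct stream and an idempotent output here ...
  ssc
}

// Reuse the checkpointed context if one exists, otherwise build it fresh.
// Recovery fails if the application code has changed since the checkpoint.
val ssc = StreamingContext.getOrCreate(checkpointDir, () => createContext())
ssc.start()
ssc.awaitTermination()
```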