author    Seigneurin, Alexis (CONT) <Alexis.Seigneurin@capitalone.com>  2016-08-29 13:12:10 +0100
committer Sean Owen <sowen@cloudera.com>  2016-08-29 13:12:10 +0100
commit    08913ce0002a80a989489a31b7353f5ec4a5849f (patch)
tree      dbd08672353c21a731b18f592603b77322b2da3f /docs
parent    1a48c0047bbdb6328c3ac5ec617a5e35e244d66d (diff)
fixed a typo
idempotant -> idempotent

Author: Seigneurin, Alexis (CONT) <Alexis.Seigneurin@capitalone.com>

Closes #14833 from aseigneurin/fix-typo.
Diffstat (limited to 'docs')
 docs/structured-streaming-programming-guide.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/docs/structured-streaming-programming-guide.md b/docs/structured-streaming-programming-guide.md
index 090b14f4ce..8a88e06ebd 100644
--- a/docs/structured-streaming-programming-guide.md
+++ b/docs/structured-streaming-programming-guide.md
@@ -406,7 +406,7 @@ Furthermore, this model naturally handles data that has arrived later than expec
 ## Fault Tolerance Semantics
 Delivering end-to-end exactly-once semantics was one of key goals behind the design of Structured Streaming. To achieve that, we have designed the Structured Streaming sources, the sinks and the execution engine to reliably track the exact progress of the processing so that it can handle any kind of failure by restarting and/or reprocessing. Every streaming source is assumed to have offsets (similar to Kafka offsets, or Kinesis sequence numbers)
-to track the read position in the stream. The engine uses checkpointing and write ahead logs to record the offset range of the data being processed in each trigger. The streaming sinks are designed to be idempotent for handling reprocessing. Together, using replayable sources and idempotant sinks, Structured Streaming can ensure **end-to-end exactly-once semantics** under any failure.
+to track the read position in the stream. The engine uses checkpointing and write ahead logs to record the offset range of the data being processed in each trigger. The streaming sinks are designed to be idempotent for handling reprocessing. Together, using replayable sources and idempotent sinks, Structured Streaming can ensure **end-to-end exactly-once semantics** under any failure.
 # API using Datasets and DataFrames
 Since Spark 2.0, DataFrames and Datasets can represent static, bounded data, as well as streaming, unbounded data. Similar to static Datasets/DataFrames, you can use the common entry point `SparkSession` ([Scala](api/scala/index.html#org.apache.spark.sql.SparkSession)/
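The paragraph patched above argues that a replayable source plus an idempotent sink yields exactly-once output even when a batch is reprocessed after a failure. A minimal toy sketch of that idea, in plain Python rather than Spark code (the `IdempotentSink` class and its method names are hypothetical, invented here for illustration):

```python
class IdempotentSink:
    """Hypothetical sink that tracks which offset ranges it has already
    committed, so replaying a batch does not duplicate output."""

    def __init__(self):
        self.committed = set()  # offset ranges already written
        self.output = []        # the sink's externally visible output

    def write(self, offset_range, rows):
        # Idempotence: a batch identified by the same offset range is
        # written at most once, no matter how many times it is replayed.
        if offset_range in self.committed:
            return
        self.output.extend(rows)
        self.committed.add(offset_range)


sink = IdempotentSink()
sink.write((0, 2), ["a", "b"])   # first attempt at batch [0, 2)
sink.write((0, 2), ["a", "b"])   # replay of the same batch after a failure
sink.write((2, 3), ["c"])        # next batch
print(sink.output)               # -> ['a', 'b', 'c']: each record appears once
```

In Structured Streaming itself, the offset ranges come from the checkpoint and write-ahead log mentioned in the guide; the sketch only shows why deduplicating by offset range makes reprocessing safe.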