author: Nick Evans <me@nicolasevans.org> 2015-11-11 13:29:30 -0800
committer: Tathagata Das <tathagata.das1565@gmail.com> 2015-11-11 13:29:30 -0800
commit: dd77e278b99e45c20fdefb1c795f3c5148d577db
tree: 186cef9414da85d54d2f1621c5173fcba2ba0201 /docs/streaming-kafka-integration.md
parent: a9a6b80c718008aac7c411dfe46355efe58dee2e
[SPARK-11335][STREAMING] update kafka direct python docs on how to get the offset ranges for a KafkaRDD

tdas koeninger

This updates the Spark Streaming + Kafka Integration Guide doc with a working method to access the offsets of a `KafkaRDD` through Python.

Author: Nick Evans <me@nicolasevans.org>

Closes #9289 from manygrams/update_kafka_direct_python_docs.

Diffstat (limited to 'docs/streaming-kafka-integration.md')
-rw-r--r-- docs/streaming-kafka-integration.md | 15
1 files changed, 14 insertions, 1 deletions
diff --git a/docs/streaming-kafka-integration.md b/docs/streaming-kafka-integration.md
index ab7f0117c0..b00351b2fb 100644
--- a/docs/streaming-kafka-integration.md
+++ b/docs/streaming-kafka-integration.md
@@ -181,7 +181,20 @@ Next, we discuss how to use this approach in your streaming application.
);
</div>
<div data-lang="python" markdown="1">
-    Not supported yet
+    offsetRanges = []
+
+    def storeOffsetRanges(rdd):
+        global offsetRanges
+        offsetRanges = rdd.offsetRanges()
+        return rdd
+
+    def printOffsetRanges(rdd):
+        for o in offsetRanges:
+            print "%s %s %s %s" % (o.topic, o.partition, o.fromOffset, o.untilOffset)
+
+    directKafkaStream\
+        .transform(storeOffsetRanges)\
+        .foreachRDD(printOffsetRanges)
</div>
</div>
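
The pattern added in this hunk (capture the offsets inside `transform`, then read them inside `foreachRDD`) can be tried without a Kafka cluster. What follows is only a minimal Python 3 sketch under stated assumptions: `FakeOffsetRange`, `FakeKafkaRDD`, and `FakeDirectStream` are hypothetical stand-ins invented for illustration, not part of PySpark, and they mimic only the method and attribute names that appear in the patch. Unlike a real stream, the stand-in runs each batch eagerly.

```python
from collections import namedtuple

# Hypothetical stand-in: mirrors only the four attribute names used in the patch.
FakeOffsetRange = namedtuple("FakeOffsetRange", "topic partition fromOffset untilOffset")


class FakeKafkaRDD:
    """Hypothetical stand-in for the RDD of one batch of a direct Kafka stream."""

    def __init__(self, ranges):
        self._ranges = list(ranges)

    def offsetRanges(self):
        return list(self._ranges)


class FakeDirectStream:
    """Hypothetical stand-in; applies the chained functions to each batch eagerly."""

    def __init__(self, batches, funcs=()):
        self._batches = list(batches)
        self._funcs = list(funcs)

    def transform(self, func):
        # Return a new stream with func appended to the per-batch pipeline.
        return FakeDirectStream(self._batches, self._funcs + [func])

    def foreachRDD(self, func):
        for rdd in self._batches:
            for f in self._funcs:
                rdd = f(rdd)
            func(rdd)


offsetRanges = []
seen = []  # collected output lines, kept so the result can be inspected


def storeOffsetRanges(rdd):
    # Capture the offsets while the object still exposes offsetRanges().
    global offsetRanges
    offsetRanges = rdd.offsetRanges()
    return rdd


def printOffsetRanges(rdd):
    # Reads the offsets stored by storeOffsetRanges for the same batch.
    for o in offsetRanges:
        line = "%s %s %s %s" % (o.topic, o.partition, o.fromOffset, o.untilOffset)
        seen.append(line)
        print(line)


batches = [
    FakeKafkaRDD([FakeOffsetRange("events", 0, 0, 10)]),
    FakeKafkaRDD([FakeOffsetRange("events", 0, 10, 25)]),
]

FakeDirectStream(batches) \
    .transform(storeOffsetRanges) \
    .foreachRDD(printOffsetRanges)
```

The global is overwritten once per batch, so `printOffsetRanges` always sees the ranges of the batch it is currently handling; the sketch keeps that ordering by running the stored `transform` function immediately before the `foreachRDD` function.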