[SPARK-17834][SQL] Fetch the earliest offsets manually in KafkaSource instead of counting on KafkaConsumer - spark

diff options

author	Shixiong Zhu <shixiong@databricks.com>	2016-10-13 13:31:50 -0700
committer	Shixiong Zhu <shixiong@databricks.com>	2016-10-13 13:31:50 -0700
commit	08eac356095c7faa2b19d52f2fb0cbc47eb7d1d1 (patch)
tree	866b9bfaedaa30b1be16ca69f67ca17db80c28ec /python/pyspark/sql/streaming.py
parent	84f149e414475c2e60863898992001c21cfc13b2 (diff)
download	spark-08eac356095c7faa2b19d52f2fb0cbc47eb7d1d1.tar.gz spark-08eac356095c7faa2b19d52f2fb0cbc47eb7d1d1.tar.bz2 spark-08eac356095c7faa2b19d52f2fb0cbc47eb7d1d1.zip

[SPARK-17834][SQL] Fetch the earliest offsets manually in KafkaSource instead of counting on KafkaConsumer

## What changes were proposed in this pull request? Because `KafkaConsumer.poll(0)` may update the partition offsets, this PR just calls `seekToBeginning` to manually set the earliest offsets for the KafkaSource initial offsets. ## How was this patch tested? Existing tests. Author: Shixiong Zhu <shixiong@databricks.com> Closes #15397 from zsxwing/SPARK-17834.

Diffstat (limited to 'python/pyspark/sql/streaming.py')

0 files changed, 0 insertions, 0 deletions


context:
space:
mode: