aboutsummaryrefslogtreecommitdiff
path: root/python/pyspark/streaming
diff options
context:
space:
mode:
authorTakeshi YAMAMURO <linguin.m.s@gmail.com>2017-01-25 17:38:48 -0800
committerBurak Yavuz <brkyvz@gmail.com>2017-01-25 17:38:48 -0800
commit256a3a801366ab9f705e50690114e49fdb49b38e (patch)
treed1d6eacbc69e23c8cfe27957172e38dba983939f /python/pyspark/streaming
parent2338451266d37b4c952827325cdee53b3e8fbc78 (diff)
downloadspark-256a3a801366ab9f705e50690114e49fdb49b38e.tar.gz
spark-256a3a801366ab9f705e50690114e49fdb49b38e.tar.bz2
spark-256a3a801366ab9f705e50690114e49fdb49b38e.zip
[SPARK-18020][STREAMING][KINESIS] Checkpoint SHARD_END to finish reading closed shards
## What changes were proposed in this pull request? This pr is to fix an issue occurred when resharding Kinesis streams; the resharding makes the KCL throw an exception because Spark does not checkpoint `SHARD_END` when finishing reading closed shards in `KinesisRecordProcessor#shutdown`. This bug finally leads to stopping subscribing new split (or merged) shards. ## How was this patch tested? Added a test in `KinesisStreamSuite` to check if it works well when splitting/merging shards. Author: Takeshi YAMAMURO <linguin.m.s@gmail.com> Closes #16213 from maropu/SPARK-18020.
Diffstat (limited to 'python/pyspark/streaming')
-rw-r--r--python/pyspark/streaming/tests.py2
1 files changed, 1 insertions, 1 deletions
diff --git a/python/pyspark/streaming/tests.py b/python/pyspark/streaming/tests.py
index 5ac007cd59..2e8ed69827 100644
--- a/python/pyspark/streaming/tests.py
+++ b/python/pyspark/streaming/tests.py
@@ -1420,7 +1420,7 @@ class KinesisStreamTests(PySparkStreamingTestCase):
import random
kinesisAppName = ("KinesisStreamTests-%d" % abs(random.randint(0, 10000000)))
- kinesisTestUtils = self.ssc._jvm.org.apache.spark.streaming.kinesis.KinesisTestUtils()
+ kinesisTestUtils = self.ssc._jvm.org.apache.spark.streaming.kinesis.KinesisTestUtils(2)
try:
kinesisTestUtils.createStream()
aWSCredentials = kinesisTestUtils.getAWSCredentials()