author | Gaurav <gaurav@techtinium.com> | 2017-03-06 10:41:49 -0800 |
---|---|---|
committer | Burak Yavuz <brkyvz@gmail.com> | 2017-03-06 10:41:49 -0800 |
commit | 46a64d1e0ae12c31e848f377a84fb28e3efb3699 (patch) | |
tree | f070e6a3646450030a33e66282e85aa1efce6bdb /sbin/start-history-server.sh | |
parent | 339b53a1311e08521d84a83c94201fcf3c766fb2 (diff) | |
[SPARK-19304][STREAMING][KINESIS] fix kinesis slow checkpoint recovery
## What changes were proposed in this pull request?
Added a limit to the `getRecords` API call in `KinesisBackedBlockRDD`. This reduces the amount of data returned by the Kinesis API call, making recovery considerably faster.
Since we already store `fromSeqNum` and `toSeqNum` in the checkpoint metadata, we can also store the number of records, which can later be used as the limit for the API call.
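The effect of the limit can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the patch itself: `FakeShard`, `SeqRange`, `record_count`, and `recover` are hypothetical names, and the in-memory shard merely stands in for the Kinesis `getRecords` call, which accepts a limit on the number of records returned.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class SeqRange:
    """Checkpointed range. record_count is the extra field described above (hypothetical name)."""
    from_seq: int
    to_seq: int
    record_count: int


class FakeShard:
    """In-memory stand-in for a Kinesis shard (hypothetical, not the AWS SDK)."""

    def __init__(self, seq_numbers: List[int]):
        self.seq_numbers = sorted(seq_numbers)
        self.records_returned = 0  # total records sent back across all calls

    def get_records(self, start: int, limit: Optional[int] = None) -> Tuple[List[int], int]:
        """Return up to `limit` records starting at index `start`, plus the next index."""
        total = len(self.seq_numbers)
        end = total if limit is None else min(total, start + limit)
        batch = self.seq_numbers[start:end]
        self.records_returned += len(batch)
        return batch, end


def recover(shard: FakeShard, rng: SeqRange, use_limit: bool) -> List[int]:
    """Re-read the checkpointed range, optionally capping each fetch by the stored count."""
    index = shard.seq_numbers.index(rng.from_seq)
    recovered: List[int] = []
    while True:
        # With the limit, never ask for more than the records still missing.
        limit = rng.record_count - len(recovered) if use_limit else None
        batch, index = shard.get_records(index, limit)
        if not batch:
            return recovered
        for seq in batch:
            if seq > rng.to_seq:
                return recovered  # everything past to_seq was fetched for nothing
            recovered.append(seq)
        if recovered[-1] >= rng.to_seq:
            return recovered
```

In this sketch both modes recover the same records, but the unlimited fetch pulls everything after `from_seq` and discards most of it, while the limited fetch returns only the checkpointed records.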
## How was this patch tested?
The patch was manually tested.
Apologies for any silly mistakes; this is my first pull request.
Author: Gaurav <gaurav@techtinium.com>
Closes #16842 from Gauravshah/kinesis_checkpoint_recovery_fix_2_1_0.