author | Tathagata Das <tathagata.das1565@gmail.com> | 2015-08-23 19:24:32 -0700 |
---|---|---|
committer | Tathagata Das <tathagata.das1565@gmail.com> | 2015-08-23 19:24:42 -0700 |
commit | b40059dbda4dafbb883a53fbd5c5f69bc01a3e19 (patch) | |
tree | 0d17b29cf9937e799891f5ab15daf8183f19470a /external/flume | |
parent | 00f812d38aeb179290a710e3af1e0c11cc16da71 (diff) | |
download | spark-b40059dbda4dafbb883a53fbd5c5f69bc01a3e19.tar.gz spark-b40059dbda4dafbb883a53fbd5c5f69bc01a3e19.tar.bz2 spark-b40059dbda4dafbb883a53fbd5c5f69bc01a3e19.zip |
[SPARK-10142] [STREAMING] Made python checkpoint recovery handle non-local checkpoint paths and existing SparkContexts
The current code only checks for checkpoint files in the local filesystem, and always tries to create a new Python SparkContext (even if one already exists). The solution is to do the following:
1. Use the same code path as Java to check whether a valid checkpoint exists
2. Create a new Python SparkContext only if there is no active one.
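The two steps above can be sketched roughly as follows. This is a simplified model under stated assumptions, not the code in this patch: `FakeSparkContext`, `recover_or_create`, the `has_valid_checkpoint` callable (standing in for the Java-side check) and the `hdfs://namenode/checkpoints` path are all hypothetical names made up for illustration.

```python
# Illustrative model only: these are stand-ins, not PySpark's real classes.
class FakeSparkContext:
    _active = None  # at most one active context per process

    def __init__(self):
        FakeSparkContext._active = self

    @classmethod
    def get_or_create(cls):
        # Step 2: reuse the active context if there is one, else create it.
        return cls._active if cls._active is not None else cls()


def recover_or_create(checkpoint_path, has_valid_checkpoint, setup_func):
    """Return (context, source), where source is 'checkpoint' or 'new'.

    has_valid_checkpoint is a callable standing in for the shared
    Java-side check, which can resolve non-local paths as well.
    """
    # Step 1: delegate the checkpoint check instead of scanning local files.
    if not has_valid_checkpoint(checkpoint_path):
        return setup_func(), "new"
    return FakeSparkContext.get_or_create(), "checkpoint"


existing = FakeSparkContext()
context, source = recover_or_create(
    "hdfs://namenode/checkpoints",  # hypothetical non-local path
    lambda path: True,              # pretend a valid checkpoint exists
    FakeSparkContext,
)
print(context is existing, source)
```

In this model, recovery with an already active context returns that same context rather than constructing a second one.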
There is no test for the path, as it's hard to test distributed filesystem paths in a local unit test. I am going to test it manually with a distributed file system to verify that this patch works.
Author: Tathagata Das <tathagata.das1565@gmail.com>
Closes #8366 from tdas/SPARK-10142 and squashes the following commits:
3afa666 [Tathagata Das] Added tests
2dd4ae5 [Tathagata Das] Added the check to not create a context if one already exists
9bf151b [Tathagata Das] Made python checkpoint recovery use java to find the checkpoint files
(cherry picked from commit 053d94fcf32268369b5a40837271f15d6af41aa4)
Signed-off-by: Tathagata Das <tathagata.das1565@gmail.com>
Diffstat (limited to 'external/flume')
0 files changed, 0 insertions, 0 deletions