[SPARK-10142] [STREAMING] Made python checkpoint recovery handle non-local checkpoint paths and existing SparkContexts - spark

author	Tathagata Das <tathagata.das1565@gmail.com>	2015-08-23 19:24:32 -0700
committer	Tathagata Das <tathagata.das1565@gmail.com>	2015-08-23 19:24:42 -0700
commit	b40059dbda4dafbb883a53fbd5c5f69bc01a3e19 (patch)
tree	0d17b29cf9937e799891f5ab15daf8183f19470a /external/flume
parent	00f812d38aeb179290a710e3af1e0c11cc16da71 (diff)
download	spark-b40059dbda4dafbb883a53fbd5c5f69bc01a3e19.tar.gz spark-b40059dbda4dafbb883a53fbd5c5f69bc01a3e19.tar.bz2 spark-b40059dbda4dafbb883a53fbd5c5f69bc01a3e19.zip

The current code only checks for checkpoint files in the local filesystem, and always tries to create a new Python SparkContext (even if one already exists). The solution is to do the following:
1. Use the same code path as Java to check whether a valid checkpoint exists.
2. Create a new Python SparkContext only if there is no active one.

There is no test for the path, as it's hard to test with distributed filesystem paths in a local unit test. I am going to test it with a distributed file system manually to verify that this patch works.

Author: Tathagata Das <tathagata.das1565@gmail.com>

Closes #8366 from tdas/SPARK-10142 and squashes the following commits:

3afa666 [Tathagata Das] Added tests
2dd4ae5 [Tathagata Das] Added the check to not create a context if one already exists
9bf151b [Tathagata Das] Made python checkpoint recovery use java to find the checkpoint files

(cherry picked from commit 053d94fcf32268369b5a40837271f15d6af41aa4)
Signed-off-by: Tathagata Das <tathagata.das1565@gmail.com>
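
The two-step fix described in the commit message can be sketched roughly as below. This is a minimal illustration only, not the code from the patch: `Context`, `recover_or_create`, and the `has_valid_checkpoint` predicate are hypothetical stand-ins (none are PySpark APIs), and the `hdfs://` path in the usage note is made up. The sketch assumes the checkpoint check is delegated to a single filesystem-aware predicate (in the commit, the Java code path) and that the active context is tracked in one place.

```python
from typing import Callable, Optional, Tuple


class Context:
    """Hypothetical stand-in for a SparkContext (not a PySpark class)."""

    # Tracks the currently active context, if any.
    active: Optional["Context"] = None

    def __init__(self) -> None:
        Context.active = self


def recover_or_create(
    checkpoint_path: str,
    has_valid_checkpoint: Callable[[str], bool],
    setup: Callable[[], Context],
) -> Tuple[Context, str]:
    """Return (context, origin), where origin is 'recovered' or 'fresh'."""
    # Step 1: ask one filesystem-aware predicate whether a valid checkpoint
    # exists, instead of probing only the local filesystem.
    if not has_valid_checkpoint(checkpoint_path):
        return setup(), "fresh"

    # Step 2: create a new context only if there is no active one.
    ctx = Context.active if Context.active is not None else Context()
    return ctx, "recovered"
```

For example, calling `recover_or_create("hdfs://namenode/ckpt", lambda p: True, Context)` twice in a row would return the same `Context` object both times, since the second call finds the active context and reuses it.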

Diffstat (limited to 'external/flume')

0 files changed, 0 insertions, 0 deletions

