author Tathagata Das <tathagata.das1565@gmail.com> 2015-12-01 14:08:36 -0800
committer Andrew Or <andrew@databricks.com> 2015-12-01 14:08:36 -0800
commit 60b541ee1b97c9e5e84aa2af2ce856f316ad22b3
tree bfa408b8238ed10fc0a6bd248efa9f81957598e9 /python/.gitignore
parent 2cef1cdfbb5393270ae83179b6a4e50c3cbf9e93
[SPARK-12004] Preserve the RDD partitioner through RDD checkpointing
The solution is to save the RDD partitioner in a separate file in the RDD checkpoint directory, namely `<checkpoint dir>/_partitioner`. In most cases, whether or not the RDD partitioner is recovered does not affect correctness; failing to recover it only reduces performance. The patch therefore makes a best-effort attempt to save and recover the partitioner. If either step fails, checkpointing itself is not affected, which makes this patch safe and backward compatible.

Author: Tathagata Das <tathagata.das1565@gmail.com>

Closes #9983 from tdas/SPARK-12004.
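
The best-effort behaviour described in the commit message can be illustrated with a small standalone sketch. This is not Spark's implementation: `HashPartitioner`, `save_partitioner`, and `load_partitioner` below are hypothetical stand-ins, and JSON on the local filesystem is an assumed storage format chosen only to keep the example self-contained. Only the `_partitioner` file name comes from the commit message.

```python
import json
import os
import tempfile
from typing import Optional

PARTITIONER_FILE = "_partitioner"  # file name taken from the commit message


class HashPartitioner:
    """Hypothetical stand-in for an RDD partitioner (not Spark's class)."""

    def __init__(self, num_partitions: int) -> None:
        self.num_partitions = num_partitions

    def __eq__(self, other: object) -> bool:
        return (isinstance(other, HashPartitioner)
                and other.num_partitions == self.num_partitions)


def save_partitioner(checkpoint_dir: str, partitioner: HashPartitioner) -> bool:
    """Try to write the partitioner next to the checkpoint data.

    Failures are swallowed and reported as False, so the caller's
    checkpointing continues regardless of the outcome.
    """
    try:
        path = os.path.join(checkpoint_dir, PARTITIONER_FILE)
        with open(path, "w") as f:
            json.dump({"num_partitions": partitioner.num_partitions}, f)
        return True
    except (OSError, TypeError, ValueError):
        return False


def load_partitioner(checkpoint_dir: str) -> Optional[HashPartitioner]:
    """Try to read the partitioner back; return None if that is not possible.

    A missing or unreadable file is treated the same as a checkpoint written
    before the file existed, which is what keeps the scheme backward compatible.
    """
    try:
        path = os.path.join(checkpoint_dir, PARTITIONER_FILE)
        with open(path) as f:
            return HashPartitioner(int(json.load(f)["num_partitions"]))
    except (OSError, KeyError, TypeError, ValueError):
        return None


if __name__ == "__main__":
    checkpoint_dir = tempfile.mkdtemp()
    save_partitioner(checkpoint_dir, HashPartitioner(4))
    recovered = load_partitioner(checkpoint_dir)
    print(recovered.num_partitions if recovered else "no partitioner recovered")
```

In this sketch a caller that gets `None` back simply proceeds without a partitioner, which mirrors the commit's point that losing the partitioner costs performance rather than correctness.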
Diffstat (limited to 'python/.gitignore')
0 files changed, 0 insertions, 0 deletions