author | Tathagata Das <tathagata.das1565@gmail.com> | 2015-12-01 14:08:36 -0800 |
---|---|---|
committer | Andrew Or <andrew@databricks.com> | 2015-12-01 14:08:36 -0800 |
commit | 60b541ee1b97c9e5e84aa2af2ce856f316ad22b3 | |
tree | bfa408b8238ed10fc0a6bd248efa9f81957598e9 /project | |
parent | 2cef1cdfbb5393270ae83179b6a4e50c3cbf9e93 | |
[SPARK-12004] Preserve the RDD partitioner through RDD checkpointing
The solution is to save the RDD partitioner in a separate file in the RDD checkpoint directory, namely `<checkpoint dir>/_partitioner`. In most cases, whether the partitioner is recovered does not affect correctness; failing to recover it only reduces performance. So this solution makes a best-effort attempt to save and recover the partitioner. If either step fails, checkpointing itself is not affected. This makes the patch safe and backward compatible.
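The best-effort behaviour described above can be illustrated with a minimal, Spark-free sketch. It is not the code from this patch: `SimpleHashPartitioner` and `PartitionerCheckpoint` are hypothetical names, plain Java serialization on the local file system stands in for whatever Spark actually uses, and only the `_partitioner` file name comes from the commit message. The point is that both save and recover swallow failures and report them as a value, so the surrounding checkpoint logic can carry on without a partitioner.

```scala
import java.io.{ObjectInputStream, ObjectOutputStream}
import java.nio.file.{Files, Path}
import scala.util.Try

// Hypothetical stand-in for a Spark partitioner; NOT the real class.
// Case classes are Serializable, so Java serialization works here.
final case class SimpleHashPartitioner(numPartitions: Int) {
  def getPartition(key: Any): Int = {
    val mod = key.hashCode % numPartitions
    if (mod < 0) mod + numPartitions else mod
  }
}

// Hypothetical helper illustrating best-effort save and recovery.
object PartitionerCheckpoint {
  // File name taken from the commit message: <checkpoint dir>/_partitioner
  private val FileName = "_partitioner"

  // Best-effort save: returns false on failure instead of throwing,
  // so a failed write never aborts the checkpoint itself.
  def save(checkpointDir: Path, partitioner: SimpleHashPartitioner): Boolean =
    Try {
      val out = new ObjectOutputStream(
        Files.newOutputStream(checkpointDir.resolve(FileName)))
      try out.writeObject(partitioner) finally out.close()
    }.isSuccess

  // Best-effort recovery: None if the file is missing or unreadable,
  // which also covers checkpoints written before the file existed.
  def recover(checkpointDir: Path): Option[SimpleHashPartitioner] =
    Try {
      val in = new ObjectInputStream(
        Files.newInputStream(checkpointDir.resolve(FileName)))
      try in.readObject().asInstanceOf[SimpleHashPartitioner] finally in.close()
    }.toOption
}
```

In this sketch a caller would treat `None` from `recover` as "partitioner unknown" and proceed with the checkpointed data, accepting a possible extra shuffle later rather than failing the job.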
Author: Tathagata Das <tathagata.das1565@gmail.com>
Closes #9983 from tdas/SPARK-12004.
Diffstat (limited to 'project')
0 files changed, 0 insertions, 0 deletions