author: Wenchen Fan <cloud0fan@outlook.com> 2015-09-14 11:51:39 -0700
committer: Yin Huai <yhuai@databricks.com> 2015-09-14 11:51:39 -0700
commit: 32407bfd2bdbf84d65cacfa7554dae6a2332bc37
tree: 2b8ecb4d07e4f5290b59039002f58e4be58f58d6 /docs
parent: d81565465cc6d4f38b4ed78036cded630c700388
[SPARK-9899] [SQL] log warning for direct output committer with speculation enabled
This is a follow-up of https://github.com/apache/spark/pull/8317.

When speculation is enabled, multiple tasks may write to the same path. Generally this is OK, because each task writes to a temporary directory first and only one task can commit its temporary directory to the target path. However, when a direct output committer is used, tasks write data directly to the target path without a temporary directory, which causes problems such as corrupted data. Please see the [PR comment](https://github.com/apache/spark/pull/8191#issuecomment-131598385) for more details.

Unfortunately, we don't have a simple flag to tell whether an output committer writes to a temporary directory or not, so for safety, we have to disable any customized output committer when `speculation` is true.

Author: Wenchen Fan <cloud0fan@outlook.com>

Closes #8687 from cloud-fan/direct-committer.
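
The rule described in the message above can be illustrated with a minimal, standalone sketch. It is not the code from this patch: it uses plain Python rather than Spark's Scala internals, and the function `choose_committer`, the key `custom.committer.class`, and the committer names are hypothetical stand-ins chosen for illustration. Only `spark.speculation` is a real Spark setting.

```python
import logging

logger = logging.getLogger("committer_sketch")

# Hypothetical name standing in for a committer that writes to a
# temporary directory first and commits by moving it into place.
DEFAULT_COMMITTER = "TempDirOutputCommitter"


def choose_committer(conf):
    """Pick a committer name for a write job (illustrative only).

    `conf` is a plain dict of string settings. The key
    "custom.committer.class" is an assumption made for this sketch.
    """
    speculation = str(conf.get("spark.speculation", "false")).lower() == "true"
    custom = conf.get("custom.committer.class")

    if custom is None:
        # Nothing customized: the default committer is always safe.
        return DEFAULT_COMMITTER

    if speculation:
        # We cannot tell whether the custom committer writes directly to
        # the target path, so fall back to the default and warn.
        logger.warning(
            "Ignoring custom committer %s because spark.speculation is true; "
            "using %s instead.",
            custom,
            DEFAULT_COMMITTER,
        )
        return DEFAULT_COMMITTER

    return custom
```

With speculation off, a configured custom committer is returned unchanged; with speculation on, the sketch always returns the default committer and emits a warning, which mirrors the conservative choice the message describes.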
Diffstat (limited to 'docs')
0 files changed, 0 insertions, 0 deletions