From 8100cbdb7546e8438019443cfc00683017c81278 Mon Sep 17 00:00:00 2001
From: CodingCat
Date: Thu, 5 Jun 2014 11:39:35 -0700
Subject: SPARK-1677: allow user to disable output dir existence checking

https://issues.apache.org/jira/browse/SPARK-1677

For compatibility with older versions of Spark it would be nice to have an option `spark.hadoop.validateOutputSpecs` (default true) for the user to disable the output directory existence checking

Author: CodingCat

Closes #947 from CodingCat/SPARK-1677 and squashes the following commits:

7930f83 [CodingCat] miao
c0c0e03 [CodingCat] bug fix and doc update
5318562 [CodingCat] bug fix
13219b5 [CodingCat] allow user to disable output dir existence checking

(cherry picked from commit 89cdbb087cb2f0d03be2dd77440300c6bd61c792)
Signed-off-by: Patrick Wendell
---
 docs/configuration.md | 8 ++++++++
 1 file changed, 8 insertions(+)

(limited to 'docs')

diff --git a/docs/configuration.md b/docs/configuration.md
index 0697f7fc2f..71fafa5734 100644
--- a/docs/configuration.md
+++ b/docs/configuration.md
@@ -487,6 +487,14 @@ Apart from these, the following properties are also available, and may be useful
     this duration will be cleared as well.
+
+  spark.hadoop.validateOutputSpecs
+  true
+  If set to true, validates the output specification (e.g. checking if the output directory already exists)
+    used in saveAsHadoopFile and other variants. This can be disabled to silence exceptions due to pre-existing
+    output directories. We recommend that users do not disable this except if trying to achieve compatibility with
+    previous versions of Spark. Simply use Hadoop's FileSystem API to delete output directories by hand.
+

 #### Networking
--
cgit v1.2.3
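
The documentation added by this patch recommends leaving `spark.hadoop.validateOutputSpecs` at its default and deleting stale output directories by hand through Hadoop's FileSystem API. The following is a minimal sketch of what that could look like, not code taken from the patch. The object name `OverwriteOutputExample`, the local master setting, and the output path are assumptions made only for illustration.

```scala
import org.apache.hadoop.fs.Path
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical driver program; the name and paths are illustrative only.
object OverwriteOutputExample {
  def main(args: Array[String]): Unit = {
    // Assumed output location for this sketch.
    val outputDir = new Path("/tmp/validate-output-specs-example")

    val conf = new SparkConf()
      .setAppName("OverwriteOutputExample")
      .setMaster("local[2]")
      // Keep validation enabled (the default). Setting this to "false"
      // would instead silence the exception for a pre-existing directory,
      // which the documentation advises against except for compatibility
      // with previous versions of Spark.
      .set("spark.hadoop.validateOutputSpecs", "true")
    val sc = new SparkContext(conf)

    try {
      // Delete any earlier output by hand via Hadoop's FileSystem API.
      val fs = outputDir.getFileSystem(sc.hadoopConfiguration)
      if (fs.exists(outputDir)) {
        fs.delete(outputDir, true) // second argument: delete recursively
      }

      // With the directory gone, the save passes output validation.
      sc.parallelize(1 to 10).saveAsTextFile(outputDir.toString)
    } finally {
      sc.stop()
    }
  }
}
```

Deleting the directory explicitly keeps the overwrite visible in the job's own code, whereas disabling validation lets a job write into an existing directory without any warning.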