author: Juliusz Sompolski <julek@databricks.com>, 2017-03-16 08:20:47 +0800
committer: Wenchen Fan <wenchen@databricks.com>, 2017-03-16 08:20:47 +0800
commit: 339b237dc18d4367b0735236b4b8be2901fcad79
tree: ff4fcc832885207a26f7170acbbf0275e8e02de4 /sql/core
parent: 7d734a658349e8691d8b4294454c9cd98d555014
[SPARK-19948] Document that saveAsTable uses catalog as source of truth for table existence.
It is quirky behaviour that saveAsTable to e.g. a JDBC source with a SaveMode other than Overwrite will nevertheless overwrite the table in the external source if that table was not a catalog table.

Author: Juliusz Sompolski <julek@databricks.com>

Closes #17289 from juliuszsompolski/saveAsTableDoc.
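
The rule described in the commit message can be sketched as a small decision function. This is a toy model for illustration only, not Spark's implementation: the names `SaveAsTableModel`, `outcome`, `Outcome`, `OverwritesExternalData`, and `FollowsSaveMode` are hypothetical, and the `Mode` objects merely mirror the names of Spark's `SaveMode` values. It encodes only what the commit states: the Spark catalog decides whether the table "exists", and when it does not, the data in the external source is overwritten whatever save mode was requested.

```scala
// Toy model of the documented rule; NOT Spark's actual code path.
// All names below are hypothetical and exist only for this sketch.
sealed trait Mode
case object Append extends Mode
case object Overwrite extends Mode
case object ErrorIfExists extends Mode
case object Ignore extends Mode

sealed trait Outcome
// The table in the external source (e.g. JDBC) is overwritten.
case object OverwritesExternalData extends Outcome
// The requested save mode determines what happens.
case object FollowsSaveMode extends Outcome

object SaveAsTableModel {
  // Existence is judged by the Spark catalog, not by the external source.
  // When the catalog has no entry, `mode` has no influence on the result.
  def outcome(existsInCatalog: Boolean, mode: Mode): Outcome =
    if (existsInCatalog) FollowsSaveMode else OverwritesExternalData

  def main(args: Array[String]): Unit = {
    val modes: Seq[Mode] = Seq(Append, Overwrite, ErrorIfExists, Ignore)
    for (m <- modes)
      println(s"$m, not in catalog -> ${outcome(existsInCatalog = false, m)}")
  }
}
```

In practice this suggests checking the catalog before calling `saveAsTable` against an external source that may already hold a table of the same name, since the save mode offers no protection in that case.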
Diffstat (limited to 'sql/core')
-rw-r--r-- sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala | 5
1 files changed, 5 insertions, 0 deletions
diff --git a/sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala b/sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala
index deaa800694..3e975ef6a3 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala
@@ -337,6 +337,11 @@ final class DataFrameWriter[T] private[sql](ds: Dataset[T]) {
* +---+---+
* }}}
*
+ * In this method, save mode is used to determine the behavior if the data source table exists in
+ * Spark catalog. We will always overwrite the underlying data of data source (e.g. a table in
+ * JDBC data source) if the table doesn't exist in Spark catalog, and will always append to the
+ * underlying data of data source if the table already exists.
+ *
* When the DataFrame is created from a non-partitioned `HadoopFsRelation` with a single input
* path, and the data source provider can be mapped to an existing Hive builtin SerDe (i.e. ORC
* and Parquet), the table is persisted in a Hive compatible format, which means other systems