[SPARK-17230] [SQL] Should not pass optimized query into QueryExecution in DataFrameWriter

## What changes were proposed in this pull request?

Some analyzer rules make assumptions about logical plans, and the optimizer may break those assumptions. We should not pass an optimized query plan into QueryExecution (where it will be analyzed again); otherwise we may hit some weird bugs.

For example, we have a rule for decimal calculation that promotes the precision before binary operations and uses PromotePrecision as a placeholder to indicate that the rule should not be applied twice. But an optimizer rule removes this placeholder, which breaks the assumption; the rule is then applied twice and produces a wrong result.

Ideally, we should make all the analyzer rules idempotent, but that may require a lot of effort to double-check them one by one (which may not be easy).

An easier approach is to never feed an optimized plan into the Analyzer. This PR fixes the case for RunnableCommand: these commands are optimized, and during execution the passed `query` is also passed into QueryExecution again. This PR makes these `query` plans not part of the children, so they will not be optimized and then analyzed again.

Right now, we do not know whether a logical plan is optimized or not. We could introduce a flag for that, and make sure an optimized logical plan will not be analyzed again.

## How was this patch tested?

Added regression tests.

Author: Davies Liu <davies@databricks.com>

Closes #14797 from davies/fix_writer.
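The failure mode described above can be illustrated with a toy model. This is only a sketch, not Spark's Catalyst code: the `Expr` tree, `Widen`, `promote`, `stripMarkers`, and `widenDepth` are hypothetical names invented for illustration, and only `PromotePrecision` borrows its name from the commit message. It shows how a rule guarded by a marker node stops being idempotent once another pass strips the marker.

```scala
// Toy expression tree; illustrative only, NOT Spark's Catalyst classes.
sealed trait Expr
case class Col(name: String) extends Expr
case class Widen(child: Expr) extends Expr            // stands in for a precision-promoting cast
case class PromotePrecision(child: Expr) extends Expr // marker: "this operand was already promoted"
case class Mul(left: Expr, right: Expr) extends Expr

object ToyRules {
  // Analyzer-style rule: widen both operands once and mark them,
  // so that a second application leaves the expression unchanged.
  def promote(e: Expr): Expr = e match {
    case Mul(PromotePrecision(_), PromotePrecision(_)) => e
    case Mul(l, r) => Mul(PromotePrecision(Widen(l)), PromotePrecision(Widen(r)))
    case other => other
  }

  // Optimizer-style rule: the marker has no runtime meaning, so remove it.
  def stripMarkers(e: Expr): Expr = e match {
    case PromotePrecision(c) => stripMarkers(c)
    case Widen(c) => Widen(stripMarkers(c))
    case Mul(l, r) => Mul(stripMarkers(l), stripMarkers(r))
    case other => other
  }

  // Counts how many Widen nodes wrap the left operand of a Mul.
  def widenDepth(e: Expr): Int = e match {
    case Widen(c) => 1 + widenDepth(c)
    case PromotePrecision(c) => widenDepth(c)
    case Mul(l, _) => widenDepth(l)
    case _ => 0
  }
}

object Demo {
  import ToyRules._

  def main(args: Array[String]): Unit = {
    val analyzed = promote(Mul(Col("id"), Col("one")))

    // With the marker in place, re-analysis is a no-op.
    assert(promote(analyzed) == analyzed)
    assert(widenDepth(analyzed) == 1)

    // After the optimizer strips the marker, re-analysis widens again.
    val reAnalyzed = promote(stripMarkers(analyzed))
    assert(widenDepth(reAnalyzed) == 2)
  }
}
```

In this toy, the fix in this PR corresponds to never calling `promote` on the output of `stripMarkers`; making `promote` itself robust to a missing marker would be the "idempotent rules" alternative mentioned above.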
author: Davies Liu <davies@databricks.com> 2016-09-02 15:10:12 -0700
committer: Davies Liu <davies.liu@gmail.com> 2016-09-02 15:10:12 -0700
commit: ed9c884dcf925500ceb388b06b33bd2c95cd2ada (patch)
tree: 7394f4a30e5994193b575817c8d768276ea33541 /sql/core/src/test/scala
parent: eac1d0e921345b5d15aa35d8c565140292ab2af3 (diff)
download: spark-ed9c884dcf925500ceb388b06b33bd2c95cd2ada.tar.gz
spark-ed9c884dcf925500ceb388b06b33bd2c95cd2ada.tar.bz2
spark-ed9c884dcf925500ceb388b06b33bd2c95cd2ada.zip
1 files changed, 8 insertions, 0 deletions
diff --git a/sql/core/src/test/scala/org/apache/spark/sql/test/DataFrameReaderWriterSuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/test/DataFrameReaderWriterSuite.scala
index 05935cec4b..63b0e4588e 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/test/DataFrameReaderWriterSuite.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/test/DataFrameReaderWriterSuite.scala
@@ -449,6 +449,14 @@ class DataFrameReaderWriterSuite extends QueryTest with SharedSQLContext with Be
     }
   }
 
+  test("SPARK-17230: write out results of decimal calculation") {
+    val df = spark.range(99, 101)
+      .selectExpr("id", "cast(id as long) * cast('1.0' as decimal(38, 18)) as num")
+    df.write.mode(SaveMode.Overwrite).parquet(dir)
+    val df2 = spark.read.parquet(dir)
+    checkAnswer(df2, df)
+  }
+
   private def testRead(
       df: => DataFrame,
       expectedResult: Seq[String],