[SPARK-16839][SQL] Simplify Struct creation code path

## What changes were proposed in this pull request?

Simplify struct creation, in particular the part of `CleanupAliases` that missed some aliases when handling trees created by `CreateStruct`.

This PR includes:

1. A failing test (create a struct with nested aliases; some of the aliases survive `CleanupAliases`).
2. A fix that transforms `CreateStruct` into a `CreateNamedStruct` constructor, effectively eliminating `CreateStruct` from all expression trees.
3. A `NamePlaceHolder` used by `CreateStruct` when column names cannot be extracted from an unresolved `NamedExpression`.
4. A new Analyzer rule that resolves `NamePlaceHolder` into a string literal once the `NamedExpression` is resolved.
5. A simplified `CleanupAliases`, which no longer has to deal with `CreateStruct`'s top-level columns.

## How was this patch tested?

Ran all test suites in package org.apache.spark.sql, including the analysis suite. Confirmed that the added test initially fails, then reran the entire analysis package successfully after applying the fix. Modified a few tests that expected `CreateStruct`, which is now transformed into `CreateNamedStruct`.

Author: eyal farago <eyal farago>
Author: Herman van Hovell <hvanhovell@databricks.com>
Author: eyal farago <eyal.farago@gmail.com>
Author: Eyal Farago <eyal.farago@actimize.com>
Author: Hyukjin Kwon <gurwls223@gmail.com>
Author: eyalfa <eyal.farago@gmail.com>

Closes #15718 from hvanhovell/SPARK-16839-2.
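
The rewrite described in the commit message can be illustrated with a small toy model. This is only a sketch in plain Python, not Spark's Scala implementation: the node classes (`Attr`, `Unresolved`, `Alias`), the helper names (`create_struct`, `resolve_placeholders`, `cleanup_aliases`), and the `colN` fallback naming are all assumptions made for illustration. It shows struct creation emitting named fields directly, a placeholder standing in for names that are not yet known, and alias cleanup reducing to a single recursive strip.

```python
from dataclasses import dataclass


# Toy expression nodes. The names echo the commit message, but every
# class and helper here is hypothetical and is not Spark's API.
@dataclass(frozen=True)
class Attr:
    name: str  # a resolved column reference


@dataclass(frozen=True)
class Unresolved:
    hint: str  # a column whose final name is not known yet


@dataclass(frozen=True)
class Alias:
    child: object
    name: str


@dataclass(frozen=True)
class NamePlaceHolder:
    pass  # stands in for a field name until the column is resolved


@dataclass(frozen=True)
class CreateNamedStruct:
    fields: tuple  # tuple of (name, value) pairs


def create_struct(children):
    """Build a named struct directly, so no 'CreateStruct' node ever exists."""
    fields = []
    for index, child in enumerate(children):
        if isinstance(child, (Attr, Alias)):
            fields.append((child.name, child))
        elif isinstance(child, Unresolved):
            fields.append((NamePlaceHolder(), child))
        else:
            # Assumed fallback naming for unnamed values.
            fields.append((f"col{index + 1}", child))
    return CreateNamedStruct(tuple(fields))


def resolve_placeholders(struct, catalog):
    """Resolve columns, then swap each placeholder for the resolved name."""
    fields = []
    for name, value in struct.fields:
        if isinstance(value, Unresolved):
            value = Attr(catalog[value.hint])
        if isinstance(name, NamePlaceHolder):
            name = value.name
        fields.append((name, value))
    return CreateNamedStruct(tuple(fields))


def cleanup_aliases(expr):
    """Strip aliases at every nesting level; field names already carry them."""
    while isinstance(expr, Alias):
        expr = expr.child
    if isinstance(expr, CreateNamedStruct):
        return CreateNamedStruct(
            tuple((name, cleanup_aliases(value)) for name, value in expr.fields)
        )
    return expr


inner = create_struct([Alias(Attr("a"), "y")])
outer = create_struct([Alias(inner, "s"), Unresolved("b")])
cleaned = cleanup_aliases(resolve_placeholders(outer, {"b": "b"}))
```

In this toy model the nested alias `y` survives only as a field name of the inner struct, which is the property the failing test in item 1 checks for in the real analyzer.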
author: eyal farago <eyal farago> 2016-11-02 11:12:20 +0100
committer: Herman van Hovell <hvanhovell@databricks.com> 2016-11-02 11:12:20 +0100
commit: f151bd1af8a05d4b6c901ebe6ac0b51a4a1a20df (patch)
tree: ca9c328efdb1cf9961223196b0396800714eb72b /R/pkg
parent: 9c8deef64efee20a0ddc9b612f90e77c80aede60 (diff)
download: spark-f151bd1af8a05d4b6c901ebe6ac0b51a4a1a20df.tar.gz
spark-f151bd1af8a05d4b6c901ebe6ac0b51a4a1a20df.tar.bz2
spark-f151bd1af8a05d4b6c901ebe6ac0b51a4a1a20df.zip
1 files changed, 6 insertions, 6 deletions
diff --git a/R/pkg/inst/tests/testthat/test_sparkSQL.R b/R/pkg/inst/tests/testthat/test_sparkSQL.R
index 806019d752..d7fe6b3282 100644
--- a/R/pkg/inst/tests/testthat/test_sparkSQL.R
+++ b/R/pkg/inst/tests/testthat/test_sparkSQL.R
@@ -1222,16 +1222,16 @@ test_that("column functions", {
   # Test struct()
   df <- createDataFrame(list(list(1L, 2L, 3L), list(4L, 5L, 6L)),
                         schema = c("a", "b", "c"))
-  result <- collect(select(df, struct("a", "c")))
+  result <- collect(select(df, alias(struct("a", "c"), "d")))
   expected <- data.frame(row.names = 1:2)
-  expected$"struct(a, c)" <- list(listToStruct(list(a = 1L, c = 3L)),
-                                 listToStruct(list(a = 4L, c = 6L)))
+  expected$"d" <- list(listToStruct(list(a = 1L, c = 3L)),
+                      listToStruct(list(a = 4L, c = 6L)))
   expect_equal(result, expected)
 
-  result <- collect(select(df, struct(df$a, df$b)))
+  result <- collect(select(df, alias(struct(df$a, df$b), "d")))
   expected <- data.frame(row.names = 1:2)
-  expected$"struct(a, b)" <- list(listToStruct(list(a = 1L, b = 2L)),
-                                 listToStruct(list(a = 4L, b = 5L)))
+  expected$"d" <- list(listToStruct(list(a = 1L, b = 2L)),
+                      listToStruct(list(a = 4L, b = 5L)))
   expect_equal(result, expected)
 
   # Test encode(), decode()