author | gatorsmile <gatorsmile@gmail.com> | 2016-07-07 12:07:19 +0800 |
---|---|---|
committer | Wenchen Fan <wenchen@databricks.com> | 2016-07-07 12:07:19 +0800 |
commit | 42279bff686f9808ec7a9e8f4da95c717edc6026 (patch) | |
tree | f80d581c70a7442163756e9e8eab56560c4c63c9 /sql/catalyst/src/main/scala | |
parent | 34283de160808324da02964cd5dc5df80e59ae71 (diff) | |
[SPARK-16374][SQL] Remove Alias from MetastoreRelation and SimpleCatalogRelation
#### What changes were proposed in this pull request?
Unlike the other leaf nodes, `MetastoreRelation` and `SimpleCatalogRelation` carry a pre-defined `alias`, which is used to change the qualifier of the node. However, under the existing alias handling, an alias belongs in `SubqueryAlias`.
This PR separates alias handling from `MetastoreRelation` and `SimpleCatalogRelation` to make them consistent with the other nodes. It also simplifies their signatures and the conversion to a `BaseRelation`.
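The separation described above can be illustrated with a toy model. This is only a sketch: `Attr`, `Plan`, `RelationWithAlias`, `Relation`, and the simplified `lookupRelation` helper are hypothetical stand-ins defined here, not Spark's actual classes or signatures. It shows a leaf node that always qualifies its output with the table name, while a separate wrapper node applies any alias.

```scala
// Toy model only: none of these types are Spark's real classes.
object QualifierSketch {
  // A column reference with an optional qualifier, e.g. `tmp.a`.
  final case class Attr(name: String, qualifier: Option[String])

  sealed trait Plan { def output: Seq[Attr] }

  // Before: the leaf node carries its own optional alias.
  final case class RelationWithAlias(
      table: String,
      columns: Seq[String],
      alias: Option[String] = None) extends Plan {
    def output: Seq[Attr] =
      columns.map(c => Attr(c, Some(alias.getOrElse(table))))
  }

  // After: the leaf node always uses the table name as qualifier.
  final case class Relation(table: String, columns: Seq[String]) extends Plan {
    def output: Seq[Attr] = columns.map(c => Attr(c, Some(table)))
  }

  // Aliasing is handled by a dedicated wrapper node.
  final case class SubqueryAlias(alias: String, child: Plan) extends Plan {
    def output: Seq[Attr] = child.output.map(_.copy(qualifier = Some(alias)))
  }

  // Hypothetical lookup: wrap the relation in its table name,
  // then in the user-supplied alias when one is given.
  def lookupRelation(table: String, columns: Seq[String], alias: Option[String]): Plan = {
    val qualified = SubqueryAlias(table, Relation(table, columns))
    alias.map(a => SubqueryAlias(a, qualified)).getOrElse(qualified)
  }

  def main(args: Array[String]): Unit = {
    val before = RelationWithAlias("test_parquet_ctas", Seq("a"), Some("tmp"))
    val after = lookupRelation("test_parquet_ctas", Seq("a"), Some("tmp"))
    // Both forms expose the same qualified attributes.
    println(before.output == after.output)
  }
}
```

In this model the outermost alias decides the qualifier, so the leaf node no longer needs an `alias` parameter at all.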
For example, the query below reads from a `MetastoreRelation`, which is converted to a `LogicalRelation`:
```SQL
SELECT tmp.a + 1 FROM test_parquet_ctas tmp WHERE tmp.a > 2
```
Before the change, the analyzed plan is:
```
== Analyzed Logical Plan ==
(a + 1): int
Project [(a#951 + 1) AS (a + 1)#952]
+- Filter (a#951 > 2)
   +- SubqueryAlias tmp
      +- Relation[a#951] parquet
```
After the change, the analyzed plan becomes:
```
== Analyzed Logical Plan ==
(a + 1): int
Project [(a#951 + 1) AS (a + 1)#952]
+- Filter (a#951 > 2)
   +- SubqueryAlias tmp
      +- SubqueryAlias test_parquet_ctas
         +- Relation[a#951] parquet
```
**Note: the optimized plans are the same.**
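One way to see why the extra alias node does not affect the optimized plan is that alias nodes only influence name resolution, and Catalyst strips them from the plan (the `EliminateSubqueryAliases` rule, as far as this sketch assumes). The following toy model uses hypothetical plan classes and a hypothetical `eliminateAliases` function, not Spark's real implementation, to show that both analyzed plans collapse to the same tree once aliases are dropped.

```scala
// Toy model only: a simplified plan tree, not Spark's LogicalPlan.
object AliasEliminationSketch {
  sealed trait Plan
  final case class Relation(columns: Seq[String], format: String) extends Plan
  final case class SubqueryAlias(alias: String, child: Plan) extends Plan
  final case class Filter(condition: String, child: Plan) extends Plan
  final case class Project(exprs: Seq[String], child: Plan) extends Plan

  // Recursively drop every SubqueryAlias node, keeping its child.
  def eliminateAliases(plan: Plan): Plan = plan match {
    case SubqueryAlias(_, child) => eliminateAliases(child)
    case Filter(cond, child)     => Filter(cond, eliminateAliases(child))
    case Project(exprs, child)   => Project(exprs, eliminateAliases(child))
    case r: Relation             => r
  }

  val relation: Plan = Relation(Seq("a"), "parquet")

  // Shape of the analyzed plan before the change: one alias node.
  val before: Plan =
    Project(Seq("a + 1"), Filter("a > 2", SubqueryAlias("tmp", relation)))

  // Shape of the analyzed plan after the change: two nested alias nodes.
  val after: Plan =
    Project(
      Seq("a + 1"),
      Filter("a > 2", SubqueryAlias("tmp", SubqueryAlias("test_parquet_ctas", relation))))

  def main(args: Array[String]): Unit = {
    // Once aliases are removed, the two trees are structurally equal.
    println(eliminateAliases(before) == eliminateAliases(after))
  }
}
```

The analyzed plans differ by one node, but that node carries no execution semantics in this model, so later phases see identical input.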
For `SimpleCatalogRelation`, the existing code already generates two `SubqueryAlias` nodes, so its analyzed plan does not change.
#### How was this patch tested?
Added test cases.
Author: gatorsmile <gatorsmile@gmail.com>
Closes #14053 from gatorsmile/removeAliasFromMetastoreRelation.
Diffstat (limited to 'sql/catalyst/src/main/scala')
-rw-r--r-- | sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala | 2 |
-rw-r--r-- | sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/interface.scala | 5 |
2 files changed, 3 insertions, 4 deletions
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala
index e1d49912c3..ffaefeb09a 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala
@@ -403,7 +403,7 @@ class SessionCatalog(
       val relation =
         if (name.database.isDefined || !tempTables.contains(table)) {
           val metadata = externalCatalog.getTable(db, table)
-          SimpleCatalogRelation(db, metadata, alias)
+          SimpleCatalogRelation(db, metadata)
         } else {
           tempTables(table)
         }
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/interface.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/interface.scala
index 6197acab33..b12606e17d 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/interface.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/interface.scala
@@ -244,8 +244,7 @@ trait CatalogRelation {
  */
 case class SimpleCatalogRelation(
     databaseName: String,
-    metadata: CatalogTable,
-    alias: Option[String] = None)
+    metadata: CatalogTable)
   extends LeafNode with CatalogRelation {
 
   override def catalogTable: CatalogTable = metadata
@@ -261,7 +260,7 @@ case class SimpleCatalogRelation(
         CatalystSqlParser.parseDataType(f.dataType),
         // Since data can be dumped in randomly with no validation, everything is nullable.
         nullable = true
-      )(qualifier = Some(alias.getOrElse(metadata.identifier.table)))
+      )(qualifier = Some(metadata.identifier.table))
     }
   }
 