author | Yin Huai <huai@cse.ohio-state.edu> | 2014-07-07 17:01:44 -0700 |
---|---|---|
committer | Michael Armbrust <michael@databricks.com> | 2014-07-07 17:01:44 -0700 |
commit | c0b4cf097de50eb2c4b0f0e67da53ee92efc1f77 (patch) | |
tree | d46a395edc9983681b83a56f4dbabcc79e9477b0 /sql/hive/src/test/resources/golden | |
parent | f7ce1b3b48f0354434456241188c6a5d954852e2 (diff) | |
[SPARK-2339][SQL] SQL parser in sql-core is case sensitive, but a table alias is converted to lower case when we create Subquery
Reported by http://apache-spark-user-list.1001560.n3.nabble.com/Spark-SQL-Join-throws-exception-td8599.html
After we get the table from the catalog, because the table has an alias, we temporarily insert a Subquery. We then convert the table alias to lower case regardless of whether the parser is case sensitive.
To see the issue ...
```
val sqlContext = new org.apache.spark.sql.SQLContext(sc)
import sqlContext.createSchemaRDD
case class Person(name: String, age: Int)
val people = sc.textFile("examples/src/main/resources/people.txt").map(_.split(",")).map(p => Person(p(0), p(1).trim.toInt))
people.registerAsTable("people")
sqlContext.sql("select PEOPLE.name from people PEOPLE")
```
The plan is ...
```
== Query Plan ==
Project ['PEOPLE.name]
ExistingRdd [name#0,age#1], MapPartitionsRDD[4] at mapPartitions at basicOperators.scala:176
```
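Note that `PEOPLE.name` is not resolved: the alias was lowercased to `people`, so the case-sensitive parser's `PEOPLE` qualifier no longer matches it.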
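The mismatch can be reproduced in isolation. The following is a minimal sketch, not Spark's actual code: `AliasBugSketch`, `subqueryAlias`, and `qualifierMatches` are hypothetical names standing in for the pre-fix alias handling and a case-sensitive qualifier check.

```scala
// Hypothetical stand-in for the pre-fix behavior; not Spark's actual code.
object AliasBugSketch {
  // Before the fix, the alias was lowercased unconditionally
  // when the Subquery was created.
  def subqueryAlias(alias: String): String = alias.toLowerCase

  // A case-sensitive analyzer compares the qualifier and the alias exactly.
  def qualifierMatches(qualifier: String, alias: String): Boolean =
    qualifier == alias

  def main(args: Array[String]): Unit = {
    val alias = subqueryAlias("PEOPLE")        // becomes "people"
    println(qualifierMatches("PEOPLE", alias)) // prints false
  }
}
```

Because the qualifier keeps its original case while the alias does not, the exact comparison fails and the attribute stays unresolved.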
This PR introduces three changes.
1. If a table has an alias, the catalog will not lowercase the alias. If a lowercase alias is needed, the analyzer will do the work.
2. A catalog has a new val caseSensitive that indicates if this catalog is case sensitive or not. For example, a SimpleCatalog is case sensitive, but
3. Corresponding unit tests.
With this PR, case sensitivity of database names and table names is handled by the catalog. Case sensitivity of other identifiers is handled by the analyzer.
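The division of responsibility described above might be sketched as follows. This is a simplified illustration under stated assumptions, not the code in this PR: `SketchCatalog`, `SketchAnalyzer`, `Relation`, and their methods are hypothetical names. Only the idea of a `caseSensitive` val on the catalog comes from the description.

```scala
import scala.collection.mutable

// Hypothetical, simplified model; not Spark's actual Catalog or Analyzer API.
final case class Relation(table: String, alias: Option[String])

class SketchCatalog(val caseSensitive: Boolean) {
  private val tables = mutable.Set.empty[String]

  // Table names are normalized by the catalog according to its own policy.
  private def normalize(name: String): String =
    if (caseSensitive) name else name.toLowerCase

  def registerTable(name: String): Unit = tables += normalize(name)

  // The alias is passed through untouched; the catalog never lowercases it.
  def lookupRelation(name: String, alias: Option[String]): Option[Relation] = {
    val key = normalize(name)
    if (tables.contains(key)) Some(Relation(key, alias)) else None
  }
}

object SketchAnalyzer {
  // Identifiers other than database and table names (here, the qualifier of
  // an attribute such as PEOPLE.name) are compared by the analyzer.
  def resolvesQualifier(qualifier: String, rel: Relation, caseSensitive: Boolean): Boolean = {
    val name = rel.alias.getOrElse(rel.table)
    if (caseSensitive) qualifier == name else qualifier.equalsIgnoreCase(name)
  }
}
```

In this sketch, looking up `people` with alias `PEOPLE` in a case-sensitive catalog yields a relation whose alias is still `PEOPLE`, so a case-sensitive analyzer can resolve `PEOPLE.name`, while a case-insensitive analyzer would also accept `people.name`.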
JIRA: https://issues.apache.org/jira/browse/SPARK-2339
Author: Yin Huai <huai@cse.ohio-state.edu>
Closes #1317 from yhuai/SPARK-2339 and squashes the following commits:
12d8006 [Yin Huai] Handling case sensitivity correctly. This patch introduces three changes. 1. If a table has an alias, the catalog will not lowercase the alias. If a lowercase alias is needed, the analyzer will do the work. 2. A catalog has a new val caseSensitive that indicates if this catalog is case sensitive or not. For example, a SimpleCatalog is case sensitive, but 3. Corresponding unit tests. With this patch, case sensitivity of database names and table names is handled by the catalog. Case sensitivity of other identifiers is handled by the analyzer.
Diffstat (limited to 'sql/hive/src/test/resources/golden')
-rw-r--r-- | sql/hive/src/test/resources/golden/case sensitivity: Hive table-0-5d14d21a239daa42b086cc895215009a | 14 |
1 files changed, 14 insertions, 0 deletions
diff --git a/sql/hive/src/test/resources/golden/case sensitivity: Hive table-0-5d14d21a239daa42b086cc895215009a b/sql/hive/src/test/resources/golden/case sensitivity: Hive table-0-5d14d21a239daa42b086cc895215009a
new file mode 100644
index 0000000000..4d7127c0fa
--- /dev/null
+++ b/sql/hive/src/test/resources/golden/case sensitivity: Hive table-0-5d14d21a239daa42b086cc895215009a
@@ -0,0 +1,14 @@
+0 val_0
+4 val_4
+12 val_12
+8 val_8
+0 val_0
+0 val_0
+10 val_10
+5 val_5
+11 val_11
+5 val_5
+2 val_2
+12 val_12
+5 val_5
+9 val_9