author | Wenchen Fan <wenchen@databricks.com> | 2016-12-01 11:53:12 -0800 |
---|---|---|
committer | Herman van Hovell <hvanhovell@databricks.com> | 2016-12-01 11:53:12 -0800 |
commit | e6534847100670a22b3b191a0f9d924fab7f3c02 (patch) | |
tree | dc554b41efc3d3b68cb109b8126bf1e023167281 /sql/core/src/main | |
parent | 2ab8551e79e1655c406c358b21c0a1e719f498be (diff) | |
[SPARK-18674][SQL] improve the error message of using join
## What changes were proposed in this pull request?
The current error message for a USING join is quite confusing. For example:
```
scala> val df1 = List(1,2,3).toDS.withColumnRenamed("value", "c1")
df1: org.apache.spark.sql.DataFrame = [c1: int]
scala> val df2 = List(1,2,3).toDS.withColumnRenamed("value", "c2")
df2: org.apache.spark.sql.DataFrame = [c2: int]
scala> df1.join(df2, usingColumn = "c1")
org.apache.spark.sql.AnalysisException: using columns ['c1] can not be resolved given input columns: [c1, c2] ;;
'Join UsingJoin(Inner,List('c1))
:- Project [value#1 AS c1#3]
: +- LocalRelation [value#1]
+- Project [value#7 AS c2#9]
+- LocalRelation [value#7]
```
After this PR, it becomes:
```
scala> val df1 = List(1,2,3).toDS.withColumnRenamed("value", "c1")
df1: org.apache.spark.sql.DataFrame = [c1: int]
scala> val df2 = List(1,2,3).toDS.withColumnRenamed("value", "c2")
df2: org.apache.spark.sql.DataFrame = [c2: int]
scala> df1.join(df2, usingColumn = "c1")
org.apache.spark.sql.AnalysisException: USING column `c1` can not be resolved with the right join side, the right output is: [c2];
```
## How was this patch tested?
Updated existing tests.
Author: Wenchen Fan <wenchen@databricks.com>
Closes #16100 from cloud-fan/natural.
Diffstat (limited to 'sql/core/src/main')
-rw-r--r-- | sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala | 2 |
1 file changed, 1 insertion, 1 deletion
```diff
diff --git a/sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala b/sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala
index fcc02e5eb3..133f633212 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala
@@ -776,7 +776,7 @@ class Dataset[T] private[sql](
       Join(
         joined.left,
         joined.right,
-        UsingJoin(JoinType(joinType), usingColumns.map(UnresolvedAttribute(_))),
+        UsingJoin(JoinType(joinType), usingColumns),
         None)
     }
   }
```
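The change passes the USING column names to `UsingJoin` as plain strings instead of pre-wrapping them in `UnresolvedAttribute`, so the analyzer can resolve each name against one join side and report which side failed. The shape of that check can be sketched in standalone Scala (no Spark dependency; the object and method names here are illustrative, not Spark's actual internals):

```scala
// Hypothetical sketch of a per-side USING-column check: resolve each USING
// column against one side's output columns, and on failure name the side and
// list its columns, as in the improved error message.
object UsingJoinSketch {
  def resolveUsingColumns(usingColumns: Seq[String],
                          side: String,
                          sideOutput: Seq[String]): Unit = {
    usingColumns.foreach { col =>
      if (!sideOutput.contains(col)) {
        // Mirrors the style of the new message: names the offending column,
        // the join side, and that side's available columns.
        throw new IllegalArgumentException(
          s"USING column `$col` can not be resolved with the $side join side, " +
          s"the $side output is: [${sideOutput.mkString(", ")}]")
      }
    }
  }
}
```

With `df1` having `[c1]` and `df2` having `[c2]` as above, checking `c1` against the right side's output `[c2]` fails with a message naming the right side and its single column, which is far easier to act on than the old message listing the combined input columns.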