path: root/python/pyspark/ml/wrapper.py
author    Wenchen Fan <wenchen@databricks.com>    2016-12-01 11:53:12 -0800
committer Herman van Hovell <hvanhovell@databricks.com>    2016-12-01 11:53:12 -0800
commit    e6534847100670a22b3b191a0f9d924fab7f3c02
tree      dc554b41efc3d3b68cb109b8126bf1e023167281 /python/pyspark/ml/wrapper.py
parent    2ab8551e79e1655c406c358b21c0a1e719f498be
[SPARK-18674][SQL] Improve the error message of USING join
## What changes were proposed in this pull request?

The current error message of a USING join is quite confusing. For example:

```
scala> val df1 = List(1,2,3).toDS.withColumnRenamed("value", "c1")
df1: org.apache.spark.sql.DataFrame = [c1: int]

scala> val df2 = List(1,2,3).toDS.withColumnRenamed("value", "c2")
df2: org.apache.spark.sql.DataFrame = [c2: int]

scala> df1.join(df2, usingColumn = "c1")
org.apache.spark.sql.AnalysisException: using columns ['c1] can not be resolved given input columns: [c1, c2] ;;
'Join UsingJoin(Inner,List('c1))
:- Project [value#1 AS c1#3]
:  +- LocalRelation [value#1]
+- Project [value#7 AS c2#9]
   +- LocalRelation [value#7]
```

After this PR, it becomes:

```
scala> val df1 = List(1,2,3).toDS.withColumnRenamed("value", "c1")
df1: org.apache.spark.sql.DataFrame = [c1: int]

scala> val df2 = List(1,2,3).toDS.withColumnRenamed("value", "c2")
df2: org.apache.spark.sql.DataFrame = [c2: int]

scala> df1.join(df2, usingColumn = "c1")
org.apache.spark.sql.AnalysisException: USING column `c1` can not be resolved with the right join side, the right output is: [c2];
```

## How was this patch tested?

Updated tests.

Author: Wenchen Fan <wenchen@databricks.com>

Closes #16100 from cloud-fan/natural.
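The key idea behind the new message is to resolve each USING column against one join side at a time, so a failure can name the offending side and list only that side's output columns instead of the combined input. Below is a minimal, self-contained sketch of that idea; it is not the actual Spark analyzer code, and the names `Attribute` and `resolveUsingColumn` are illustrative stand-ins:

```scala
// Hypothetical sketch of the resolution strategy behind the improved message:
// check a USING column against a single join side, so the error can say which
// side failed and print only that side's output columns.
case class Attribute(name: String) // stand-in for Catalyst's attribute class

def resolveUsingColumn(col: String, side: String, output: Seq[Attribute]): Attribute =
  output.find(_.name.equalsIgnoreCase(col)).getOrElse {
    sys.error(
      s"USING column `$col` can not be resolved with the $side join side, " +
        s"the $side output is: [${output.map(_.name).mkString(", ")}]")
  }

// Mirrors the session above: `c1` exists on the left side but not the right.
val left  = Seq(Attribute("c1"))
val right = Seq(Attribute("c2"))
resolveUsingColumn("c1", "left", left)   // resolves to Attribute(c1)
resolveUsingColumn("c1", "right", right) // fails with the side-specific message
```

Checking per side rather than against the concatenated input is what makes the message actionable: the old error listed `[c1, c2]` even though `c1` was present, because it pooled both sides' columns before reporting.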
Diffstat (limited to 'python/pyspark/ml/wrapper.py')
0 files changed, 0 insertions, 0 deletions