diff options
author | hyukjinkwon <gurwls223@gmail.com> | 2017-02-13 13:10:57 -0800 |
---|---|---|
committer | Xiao Li <gatorsmile@gmail.com> | 2017-02-13 13:10:57 -0800 |
commit | 9af8f743b00001f9fdf8813481464c3837331ad9 (patch) | |
tree | c7957cdc679e748eeed6c2e1e9052675a1c7ffc7 /core/src/main | |
parent | 905fdf0c243e1776c54c01a25b17878361400225 (diff) | |
download | spark-9af8f743b00001f9fdf8813481464c3837331ad9.tar.gz spark-9af8f743b00001f9fdf8813481464c3837331ad9.tar.bz2 spark-9af8f743b00001f9fdf8813481464c3837331ad9.zip |
[SPARK-19435][SQL] Type coercion between ArrayTypes
## What changes were proposed in this pull request?
This PR proposes to support type coercion between `ArrayType`s where the element types are compatible.
**Before**
```
Seq(Array(1)).toDF("a").selectExpr("greatest(a, array(1D))")
org.apache.spark.sql.AnalysisException: cannot resolve 'greatest(`a`, array(1.0D))' due to data type mismatch: The expressions should all have the same type, got GREATEST(array<int>, array<double>).; line 1 pos 0;
Seq(Array(1)).toDF("a").selectExpr("least(a, array(1D))")
org.apache.spark.sql.AnalysisException: cannot resolve 'least(`a`, array(1.0D))' due to data type mismatch: The expressions should all have the same type, got LEAST(array<int>, array<double>).; line 1 pos 0;
sql("SELECT * FROM values (array(0)), (array(1D)) as data(a)")
org.apache.spark.sql.AnalysisException: incompatible types found in column a for inline table; line 1 pos 14
Seq(Array(1)).toDF("a").union(Seq(Array(1D)).toDF("b"))
org.apache.spark.sql.AnalysisException: Union can only be performed on tables with the compatible column types. ArrayType(DoubleType,false) <> ArrayType(IntegerType,false) at the first column of the second table;;
sql("SELECT IF(1=1, array(1), array(1D))")
org.apache.spark.sql.AnalysisException: cannot resolve '(IF((1 = 1), array(1), array(1.0D)))' due to data type mismatch: differing types in '(IF((1 = 1), array(1), array(1.0D)))' (array<int> and array<double>).; line 1 pos 7;
```
**After**
```scala
Seq(Array(1)).toDF("a").selectExpr("greatest(a, array(1D))")
res5: org.apache.spark.sql.DataFrame = [greatest(a, array(1.0)): array<double>]
Seq(Array(1)).toDF("a").selectExpr("least(a, array(1D))")
res6: org.apache.spark.sql.DataFrame = [least(a, array(1.0)): array<double>]
sql("SELECT * FROM values (array(0)), (array(1D)) as data(a)")
res8: org.apache.spark.sql.DataFrame = [a: array<double>]
Seq(Array(1)).toDF("a").union(Seq(Array(1D)).toDF("b"))
res10: org.apache.spark.sql.Dataset[org.apache.spark.sql.Row] = [a: array<double>]
sql("SELECT IF(1=1, array(1), array(1D))")
res15: org.apache.spark.sql.DataFrame = [(IF((1 = 1), array(1), array(1.0))): array<double>]
```
## How was this patch tested?
Unit tests in `TypeCoercion` and Jenkins tests and
building with scala 2.10
```scala
./dev/change-scala-version.sh 2.10
./build/mvn -Pyarn -Phadoop-2.4 -Dscala-2.10 -DskipTests clean package
```
Author: hyukjinkwon <gurwls223@gmail.com>
Closes #16777 from HyukjinKwon/SPARK-19435.
Diffstat (limited to 'core/src/main')
0 files changed, 0 insertions, 0 deletions