author    | Takeshi Yamamuro <yamamuro@apache.org> | 2017-02-23 16:28:36 +0100
committer | Herman van Hovell <hvanhovell@databricks.com> | 2017-02-23 16:28:36 +0100
commit    | 93aa4271596a30752dc5234d869c3ae2f6e8e723
tree      | 4d827436fac4170945dc64a43985055757d1cdfc /docs/ml-classification-regression.md
parent    | 769aa0f1d22d3c6d4c7871468344d82c8dc36260
[SPARK-19691][SQL] Fix ClassCastException when calculating percentile of decimal column
## What changes were proposed in this pull request?
This PR fixes the `ClassCastException` shown below:
```
scala> spark.range(10).selectExpr("cast (id as decimal) as x").selectExpr("percentile(x, 0.5)").collect()
java.lang.ClassCastException: org.apache.spark.sql.types.Decimal cannot be cast to java.lang.Number
at org.apache.spark.sql.catalyst.expressions.aggregate.Percentile.update(Percentile.scala:141)
at org.apache.spark.sql.catalyst.expressions.aggregate.Percentile.update(Percentile.scala:58)
at org.apache.spark.sql.catalyst.expressions.aggregate.TypedImperativeAggregate.update(interfaces.scala:514)
at org.apache.spark.sql.execution.aggregate.AggregationIterator$$anonfun$1$$anonfun$applyOrElse$1.apply(AggregationIterator.scala:171)
at org.apache.spark.sql.execution.aggregate.AggregationIterator$$anonfun$1$$anonfun$applyOrElse$1.apply(AggregationIterator.scala:171)
at org.apache.spark.sql.execution.aggregate.AggregationIterator$$anonfun$generateProcessRow$1.apply(AggregationIterator.scala:187)
at org.apache.spark.sql.execution.aggregate.AggregationIterator$$anonfun$generateProcessRow$1.apply(AggregationIterator.scala:181)
at org.apache.spark.sql.execution.aggregate.ObjectAggregationIterator.processInputs(ObjectAggregationIterator.scala:151)
at org.apache.spark.sql.execution.aggregate.ObjectAggregationIterator.<init>(ObjectAggregationIterator.scala:78)
at org.apache.spark.sql.execution.aggregate.ObjectHashAggregateExec$$anonfun$doExecute$1$$anonfun$2.apply(ObjectHashAggregateExec.scala:109)
at
```
The fix converts Catalyst-internal values (i.e., `Decimal`) into external Scala values using `CatalystTypeConverters` before they are treated as `java.lang.Number`.
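Conceptually, the bug is that Spark's internal `Decimal` does not extend `java.lang.Number`, so casting it directly fails; converting to the external representation (a `java.math.BigDecimal`, which is a `Number`) first makes the cast safe. A minimal self-contained sketch of that idea, where `CatalystDecimal` and `toScalaValue` are hypothetical stand-ins for Spark's `Decimal` and `CatalystTypeConverters.convertToScala`, not the actual Spark API:

```scala
// Hypothetical stand-in for Spark's internal Decimal, which does NOT extend java.lang.Number.
final case class CatalystDecimal(underlying: java.math.BigDecimal)

object PercentileUpdateSketch {
  // Before the fix: the internal value was cast directly, e.g.
  //   value.asInstanceOf[Number]   // throws ClassCastException for CatalystDecimal

  // After the fix: convert the Catalyst value to its external Scala/Java form first,
  // in the spirit of CatalystTypeConverters.
  def toScalaValue(v: Any): Any = v match {
    case d: CatalystDecimal => d.underlying // java.math.BigDecimal IS a java.lang.Number
    case other              => other
  }

  // The aggregate's update path can now safely treat the value as a Number.
  def updateCount(value: Any): Double =
    toScalaValue(value).asInstanceOf[Number].doubleValue()
}
```

With this conversion in place, `updateCount(CatalystDecimal(new java.math.BigDecimal("4")))` yields `4.0` instead of throwing, while plain numeric inputs pass through unchanged.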
## How was this patch tested?
Added a test in `DataFrameSuite`.
Author: Takeshi Yamamuro <yamamuro@apache.org>
Closes #17028 from maropu/SPARK-19691.