path: root/python/pyspark
author:    Liang-Chi Hsieh <viirya@gmail.com>  2016-01-05 12:33:21 -0800
committer: Michael Armbrust <michael@databricks.com>  2016-01-05 12:33:21 -0800
commit:    d202ad2fc24b54de38ad7bfb646bf7703069e9f7 (patch)
tree:      dd8768195bd3d5a699c597b6ab0d29c0c41dea66 /python/pyspark
parent:    8ce645d4eeda203cf5e100c4bdba2d71edd44e6a (diff)
[SPARK-12439][SQL] Fix toCatalystArray and MapObjects
JIRA: https://issues.apache.org/jira/browse/SPARK-12439

In toCatalystArray, we should look at the data type returned by dataTypeFor instead of silentSchemaFor to determine whether the element is a native type. An obvious problem arises when the element is of class Option[Int]: silentSchemaFor returns Int, so we wrongly recognize the element as a native type.

There is another problem when using Option as an array element. When we encode data such as Seq(Some(1), Some(2), None) with an encoder, we later use MapObjects to construct an array for it. But in MapObjects, we don't check whether the return value of lambdaFunction is null. This causes a bug: the decoded data for Seq(Some(1), Some(2), None) becomes Seq(1, 2, -1) instead of Seq(1, 2, null).

Author: Liang-Chi Hsieh <viirya@gmail.com>

Closes #10391 from viirya/fix-catalystarray.
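A minimal sketch of the symptom described above, in plain Scala (this is not Spark's actual MapObjects codegen; the function names here are illustrative). When the mapping lambda treats the element as a primitive Int and never checks for a null/None result, a missing value collapses to a sentinel default instead of staying null:

```scala
// Hypothetical stand-in for the MapObjects element loop.
object MapObjectsSketch {
  // Buggy shape: assumes the lambda always yields a primitive Int,
  // so None falls back to a -1 sentinel instead of null.
  def decodeWithoutNullCheck(xs: Seq[Option[Int]]): Seq[Any] =
    xs.map(x => if (x.isDefined) x.get else -1)

  // Fixed shape: check the lambda's result and propagate null for None.
  def decodeWithNullCheck(xs: Seq[Option[Int]]): Seq[Any] =
    xs.map(x => x.map(Int.box).orNull)

  def main(args: Array[String]): Unit = {
    val data = Seq(Some(1), Some(2), None)
    println(decodeWithoutNullCheck(data)) // List(1, 2, -1)
    println(decodeWithNullCheck(data))    // List(1, 2, null)
  }
}
```

The fix in this commit adds the equivalent of the second shape: MapObjects checks whether lambdaFunction returned null before writing the element into the array.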
Diffstat (limited to 'python/pyspark')
0 files changed, 0 insertions, 0 deletions