author: pierre-borckmans <pierre.borckmans@realimpactanalytics.com> 2015-12-22 23:00:42 -0800
committer: Reynold Xin <rxin@databricks.com> 2015-12-22 23:00:42 -0800
commit: 43b2a6390087b7ce262a54dc8ab8dd825db62e21
tree: 958bb0b86a5d040d4064d53786824274193cebd6 /data/mllib
parent: 50301c0a28b64c5348b0f2c2d828589c0833c70c
[SPARK-12477][SQL] - Tungsten projection fails for null values in array fields
Accessing null elements in an array field fails when tungsten is enabled.
It works in Spark 1.3.1, and in Spark > 1.5 with Tungsten disabled.

This PR solves this by checking if the accessed element in the array field is null, in the generated code.

Example:
```
// Array of String
case class AS( as: Seq[String] )
val dfAS = sc.parallelize( Seq( AS ( Seq("a",null,"b") ) ) ).toDF
dfAS.registerTempTable("T_AS")
for (i <- 0 to 2) { println(i + " = " + sqlContext.sql(s"select as[$i] from T_AS").collect.mkString(","))}
```

With Tungsten disabled:
```
0 = [a]
1 = [null]
2 = [b]
```

With Tungsten enabled:
```
0 = [a]
15/12/22 09:32:50 ERROR Executor: Exception in task 7.0 in stage 1.0 (TID 15)
java.lang.NullPointerException
	at org.apache.spark.sql.catalyst.expressions.UnsafeRowWriters$UTF8StringWriter.getSize(UnsafeRowWriters.java:90)
	at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.apply(Unknown Source)
	at org.apache.spark.sql.execution.TungstenProject$$anonfun$3$$anonfun$apply$3.apply(basicOperators.scala:90)
	at org.apache.spark.sql.execution.TungstenProject$$anonfun$3$$anonfun$apply$3.apply(basicOperators.scala:88)
	at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
	at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
	at scala.collection.Iterator$class.foreach(Iterator.scala:727)
	at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
```

Author: pierre-borckmans <pierre.borckmans@realimpactanalytics.com>

Closes #10429 from pierre-borckmans/SPARK-12477_Tungsten-Projection-Null-Element-In-Array.
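
The null check described above can be illustrated with a small, Spark-free sketch. This is not the code touched by the patch: `NullSafeArrayAccess`, `SimpleArray`, `Evaluated`, `getArrayItem` and `utf8Size` are hypothetical names made up for illustration, and treating an out-of-range index as null is an assumption of the sketch. It only shows the idea of setting a null flag at element access, so that a later step (here, measuring the UTF-8 size of the string) never dereferences a null value.

```scala
import java.nio.charset.StandardCharsets

object NullSafeArrayAccess {
  // Hypothetical stand-in for an array column value (not a Spark class).
  final case class SimpleArray(values: Vector[String]) {
    def numElements: Int = values.length
    def isNullAt(i: Int): Boolean = values(i) == null
    def get(i: Int): String = values(i)
  }

  // An evaluated expression: a null flag that travels with the value.
  final case class Evaluated(isNull: Boolean, value: String)

  // Element access that marks the result as null when the element is null.
  // Assumption of this sketch: an out-of-range index also yields null.
  def getArrayItem(arr: SimpleArray, ordinal: Int): Evaluated =
    if (ordinal < 0 || ordinal >= arr.numElements || arr.isNullAt(ordinal))
      Evaluated(isNull = true, value = null)
    else
      Evaluated(isNull = false, value = arr.get(ordinal))

  // Downstream step that would throw a NullPointerException if it were
  // handed a null value without consulting the null flag first.
  def utf8Size(e: Evaluated): Int =
    if (e.isNull) 0 else e.value.getBytes(StandardCharsets.UTF_8).length

  def main(args: Array[String]): Unit = {
    // Same data as the example in the commit message: Seq("a", null, "b").
    val arr = SimpleArray(Vector("a", null, "b"))
    for (i <- 0 to 2) {
      val e = getArrayItem(arr, i)
      val shown = if (e.isNull) "null" else e.value
      println(i + " = [" + shown + "], size " + utf8Size(e))
    }
  }
}
```

Without the `isNullAt` test in `getArrayItem`, index 1 would produce a non-null result wrapping a null string, and `utf8Size` would fail in the same way as `UTF8StringWriter.getSize` in the stack trace above.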
Diffstat (limited to 'data/mllib')
0 files changed, 0 insertions, 0 deletions