author: Michal Senkyr <mike.senkyr@gmail.com> 2017-01-06 15:05:20 +0800
committer: Wenchen Fan <wenchen@databricks.com> 2017-01-06 15:05:20 +0800
commit: 903bb8e8a2b84b9ea82acbb8ae9d58754862be3a
tree: 1df577fa49e4fd3400920234cc79865f40fbebdc /docs
parent: bcc510b021391035abe6d07c5b82bb0f0be31167
[SPARK-16792][SQL] Dataset containing a Case Class with a List type causes a CompileException (converting sequence to list)
## What changes were proposed in this pull request?

Added a `to` call at the end of the code generated by `ScalaReflection.deserializerFor` if the requested type is not a supertype of `WrappedArray[_]`. It uses `CanBuildFrom[_, _, _]` to convert the result into an arbitrary subtype of `Seq[_]`.

Care was taken to preserve the original deserialization where possible, to avoid the overhead of conversion in cases where it is not needed.

`ScalaReflection.serializerFor` could already be used to serialize any `Seq[_]`, so it was not altered.

`SQLImplicits` had to be altered and new implicit encoders added to permit serialization of other sequence types.

Also fixes [SPARK-16815] Dataset[List[T]] leads to ArrayStoreException

## How was this patch tested?

```bash
./build/mvn -DskipTests clean package && ./dev/run-tests
```

Also manual execution of the following sets of commands in the Spark shell:

```scala
case class TestCC(key: Int, letters: List[String])

val ds1 = sc.makeRDD(Seq(
  (List("D")),
  (List("S","H")),
  (List("F","H")),
  (List("D","L","L"))
)).map(x => (x.length, x)).toDF("key", "letters").as[TestCC]

val test1 = ds1.map(_.key)
test1.show
```

```scala
case class X(l: List[String])
spark.createDataset(Seq(List("A"))).map(X).show
```

```scala
spark.sqlContext.createDataset(sc.parallelize(List(1) :: Nil)).collect
```

After adding arbitrary sequence support, also tested with the following commands:

```scala
case class QueueClass(q: scala.collection.immutable.Queue[Int])

spark.createDataset(Seq(List(1,2,3))).map(x => QueueClass(scala.collection.immutable.Queue(x: _*))).map(_.q.dequeue).collect
```

Author: Michal Senkyr <mike.senkyr@gmail.com>

Closes #16240 from michalsenkyr/sql-caseclass-list-fix.
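The core of the fix is the builder-based conversion itself: an array-backed sequence produced by deserialization is rebuilt as the `Seq` subtype the case class actually declares. A minimal standalone sketch of that conversion, outside of Spark, is below. Note that the PR targets the Scala 2.11/2.12 collections library, where the conversion is driven by an implicit `CanBuildFrom` via `to[List]`; the sketch uses the equivalent `to(Factory)` form from Scala 2.13+, and `SeqConversionSketch` is a hypothetical name for illustration only.

```scala
import scala.collection.immutable.Queue

object SeqConversionSketch {
  // Deserialization initially yields a generic, array-backed Seq.
  val raw: Seq[Int] = Vector(1, 2, 3)

  // The generated code appends a `to` call so the requested Seq subtype is
  // built via the target collection's builder (CanBuildFrom in the Scala
  // 2.11/2.12 collections Spark used at the time; Factory in 2.13+).
  val asList: List[Int] = raw.to(List)
  val asQueue: Queue[Int] = raw.to(Queue)

  def main(args: Array[String]): Unit = {
    println(asList)
    println(asQueue)
  }
}
```

Because the conversion goes through the target collection's builder, it works for any `Seq[_]` subtype with a companion builder (e.g. `Queue`), which is what enables the arbitrary-sequence support tested above.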
Diffstat (limited to 'docs')
0 files changed, 0 insertions, 0 deletions