author: Michal Senkyr <mike.senkyr@gmail.com> 2017-01-06 15:05:20 +0800
committer: Wenchen Fan <wenchen@databricks.com> 2017-01-06 15:05:20 +0800
commit: 903bb8e8a2b84b9ea82acbb8ae9d58754862be3a
tree: 1df577fa49e4fd3400920234cc79865f40fbebdc /docs
parent: bcc510b021391035abe6d07c5b82bb0f0be31167
[SPARK-16792][SQL] Dataset containing a Case Class with a List type causes a CompileException (converting sequence to list)
## What changes were proposed in this pull request?

Added a `to` call at the end of the code generated by `ScalaReflection.deserializerFor` if the requested type is not a supertype of `WrappedArray[_]`. It uses `CanBuildFrom[_, _, _]` to convert the result into an arbitrary subtype of `Seq[_]`.

Care was taken to preserve the original deserialization where possible, to avoid the overhead of conversion in cases where it is not needed.

`ScalaReflection.serializerFor` could already be used to serialize any `Seq[_]`, so it was not altered.

`SQLImplicits` had to be altered and new implicit encoders added to permit serialization of other sequence types.

Also fixes [SPARK-16815] Dataset[List[T]] leads to ArrayStoreException

## How was this patch tested?

```bash
./build/mvn -DskipTests clean package && ./dev/run-tests
```

Also manual execution of the following sets of commands in the Spark shell:

```scala
case class TestCC(key: Int, letters: List[String])

val ds1 = sc.makeRDD(Seq(
  (List("D")),
  (List("S","H")),
  (List("F","H")),
  (List("D","L","L"))
)).map(x => (x.length, x)).toDF("key", "letters").as[TestCC]

val test1 = ds1.map(_.key)
test1.show
```

```scala
case class X(l: List[String])
spark.createDataset(Seq(List("A"))).map(X).show
```

```scala
spark.sqlContext.createDataset(sc.parallelize(List(1) :: Nil)).collect
```

After adding arbitrary sequence support, also tested with the following commands:

```scala
case class QueueClass(q: scala.collection.immutable.Queue[Int])

spark.createDataset(Seq(List(1,2,3))).map(x => QueueClass(scala.collection.immutable.Queue(x: _*))).map(_.q.dequeue).collect
```

Author: Michal Senkyr <mike.senkyr@gmail.com>

Closes #16240 from michalsenkyr/sql-caseclass-list-fix.
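The core of the fix is the builder-based conversion itself: an array-backed sequence produced by deserialization is rebuilt as the `Seq` subtype the case class actually declares. A minimal standalone sketch of that conversion, outside of Spark, is below. Note that the PR targets the Scala 2.11/2.12 collections library, where the conversion is driven by an implicit `CanBuildFrom` via `to[List]`; the sketch uses the equivalent `to(Factory)` form from Scala 2.13+, and `SeqConversionSketch` is a hypothetical name for illustration only.

```scala
import scala.collection.immutable.Queue

object SeqConversionSketch {
  // Deserialization initially yields a generic, array-backed Seq.
  val raw: Seq[Int] = Vector(1, 2, 3)

  // The generated code appends a `to` call so the requested Seq subtype is
  // built via the target collection's builder (CanBuildFrom in the Scala
  // 2.11/2.12 collections Spark used at the time; Factory in 2.13+).
  val asList: List[Int] = raw.to(List)
  val asQueue: Queue[Int] = raw.to(Queue)

  def main(args: Array[String]): Unit = {
    println(asList)
    println(asQueue)
  }
}
```

Because the conversion goes through the target collection's builder, it works for any `Seq[_]` subtype with a companion builder (e.g. `Queue`), which is what enables the arbitrary-sequence support tested above.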
Diffstat (limited to 'docs')
0 files changed, 0 insertions, 0 deletions