author | Dongjoon Hyun <dongjoon@apache.org> | 2016-06-24 17:26:39 -0700 |
---|---|---|
committer | Davies Liu <davies.liu@gmail.com> | 2016-06-24 17:26:39 -0700 |
commit | e5d0928e2473d1838ff5420c6a8964557c33135e (patch) | |
tree | e66e4037e86d996b8c32de43234fc310264380d7 | |
parent | 20768dade2fee5dfe967a4629cad477e3d3bce6e (diff) | |
[SPARK-16173] [SQL] Can't join describe() of DataFrame in Scala 2.10
## What changes were proposed in this pull request?
This PR fixes `DataFrame.describe()` by forcing materialization of the result rows so that the `Seq` is serializable. Currently, joining the result of `describe()` throws a `Task not serializable` Spark exception in Scala 2.10.
## How was this patch tested?
Manual. (After building with Scala 2.10, test on `bin/spark-shell` and `bin/pyspark`.)
Author: Dongjoon Hyun <dongjoon@apache.org>
Closes #13900 from dongjoon-hyun/SPARK-16173.
-rw-r--r-- | sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala | 3 |
1 files changed, 2 insertions, 1 deletions
diff --git a/sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala b/sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala
index f1d33c3e5c..85d060639c 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala
@@ -1908,7 +1908,8 @@ class Dataset[T] private[sql](
     // All columns are string type
     val schema = StructType(
       StructField("summary", StringType) :: outputCols.map(StructField(_, StringType))).toAttributes
-    LocalRelation.fromExternalRows(schema, ret)
+    // `toArray` forces materialization to make the seq serializable
+    LocalRelation.fromExternalRows(schema, ret.toArray.toSeq)
   }
 
   /**
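
The `ret.toArray.toSeq` idiom from the patch can be tried outside Spark. The sketch below is not Spark code: `MaterializeDemo` and `isJavaSerializable` are hypothetical names, the `Iterator` is only a stand-in for `ret`, and the assumption that `ret` was a lazily evaluated `Seq` is inferred from the commit comment rather than stated in it. It shows that copying a `Seq` into an array and wrapping it again yields a strict sequence with the same elements that survives Java serialization.

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

object MaterializeDemo {
  // Hypothetical helper: true if plain Java serialization of `value` succeeds.
  def isJavaSerializable(value: AnyRef): Boolean = {
    val out = new ObjectOutputStream(new ByteArrayOutputStream())
    try { out.writeObject(value); true }
    catch { case _: NotSerializableException => false }
    finally out.close()
  }

  def main(args: Array[String]): Unit = {
    // Stand-in for `ret`. On older Scala versions Iterator.toSeq is expected
    // to return a lazy Stream; the concrete class depends on the Scala version.
    val rows: Seq[String] = Iterator("count", "mean", "stddev", "min", "max").toSeq

    // Same idiom as the patch: copy every element into an array (which forces
    // evaluation), then wrap the array as a strict Seq.
    val materialized: Seq[String] = rows.toArray.toSeq

    println(s"before: ${rows.getClass.getName}")
    println(s"after:  ${materialized.getClass.getName}")

    assert(materialized == rows)              // same elements, same order
    assert(isJavaSerializable(materialized))  // array-backed Seq of Strings
  }
}
```

The class names printed for `before` and `after` vary by Scala version, so the sketch checks only element equality and serializability of the materialized copy.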