author Dongjoon Hyun <dongjoon@apache.org> 2016-06-24 17:26:39 -0700
committer Davies Liu <davies.liu@gmail.com> 2016-06-24 17:26:39 -0700
commit e5d0928e2473d1838ff5420c6a8964557c33135e
tree e66e4037e86d996b8c32de43234fc310264380d7
parent 20768dade2fee5dfe967a4629cad477e3d3bce6e
[SPARK-16173] [SQL] Can't join describe() of DataFrame in Scala 2.10
## What changes were proposed in this pull request?

This PR fixes `DataFrame.describe()` by forcing materialization to make the `Seq` serializable. Currently, `describe()` of DataFrame throws `Task not serializable` Spark exceptions when joining in Scala 2.10.

## How was this patch tested?

Manual. (After building with Scala 2.10, test on `bin/spark-shell` and `bin/pyspark`.)

Author: Dongjoon Hyun <dongjoon@apache.org>

Closes #13900 from dongjoon-hyun/SPARK-16173.
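
The idea of forcing materialization before serialization can be illustrated outside Spark. The following is a minimal sketch, not the code from this patch: `MaterializeDemo`, `isJavaSerializable`, and the sample values are hypothetical names chosen for illustration. It assumes only that, depending on the Scala version, `toSeq` on an iterator may return a lazily evaluated sequence, and that copying through `toArray` always yields a strict, array-backed one.

```scala
import java.io.{ByteArrayOutputStream, ObjectOutputStream}
import scala.util.Try

object MaterializeDemo {
  // Hypothetical helper: returns true if plain Java serialization of `obj` succeeds.
  def isJavaSerializable(obj: AnyRef): Boolean =
    Try {
      val out = new ObjectOutputStream(new ByteArrayOutputStream())
      try out.writeObject(obj) finally out.close()
    }.isSuccess

  // Stand-in for rows produced by grouping a flat result, as an illustration only.
  // Whether `toSeq` on an iterator is lazy depends on the Scala version.
  val maybeLazy: Seq[Seq[String]] =
    Iterator("count", "3", "mean", "2.0").grouped(2).toSeq

  // Same pattern as the patch: `toArray` evaluates every element eagerly,
  // and `toSeq` wraps the resulting array in a strict sequence.
  val materialized: Seq[Seq[String]] = maybeLazy.toArray.toSeq

  def main(args: Array[String]): Unit =
    println(s"materialized is serializable: ${isJavaSerializable(materialized)}")
}
```

The round trip through an array costs one extra copy, which is presumably negligible here because `describe()` yields only a handful of summary rows.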
-rw-r--r-- sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala 3
1 files changed, 2 insertions, 1 deletions
diff --git a/sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala b/sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala
index f1d33c3e5c..85d060639c 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala
@@ -1908,7 +1908,8 @@ class Dataset[T] private[sql](
     // All columns are string type
     val schema = StructType(
       StructField("summary", StringType) :: outputCols.map(StructField(_, StringType))).toAttributes
-    LocalRelation.fromExternalRows(schema, ret)
+    // `toArray` forces materialization to make the seq serializable
+    LocalRelation.fromExternalRows(schema, ret.toArray.toSeq)
   }

   /**