author Xusen Yin <yinxusen@gmail.com> 2016-01-25 22:41:52 -0800
committer Joseph K. Bradley <joseph@databricks.com> 2016-01-25 22:41:52 -0800
commit ae47ba718a280fc12720a71b981c38dbe647f35b
tree dce29f474ab43e90cb7a46e509bab4c77958fee7 /mllib
parent b66afdeb5253913d916dcf159aaed4ffdc15fd4b
[SPARK-12834] Change ser/de of JavaArray and JavaList
https://issues.apache.org/jira/browse/SPARK-12834

We use `SerDe.dumps()` to serialize `JavaArray` and `JavaList` in `PythonMLLibAPI`, then deserialize them with `PickleSerializer` on the Python side. However, there is no need for this inefficient round trip. Instead, we can convert them directly with type conversion, e.g. `list(JavaArray)` or `list(JavaList)`.

In addition, there is an issue with the ser/de of Scala Array, as described in https://issues.apache.org/jira/browse/SPARK-12780

Author: Xusen Yin <yinxusen@gmail.com>

Closes #10772 from yinxusen/SPARK-12834.
Diffstat (limited to 'mllib')
-rw-r--r-- mllib/src/main/scala/org/apache/spark/mllib/api/python/PythonMLLibAPI.scala | 6
1 files changed, 5 insertions, 1 deletions
diff --git a/mllib/src/main/scala/org/apache/spark/mllib/api/python/PythonMLLibAPI.scala b/mllib/src/main/scala/org/apache/spark/mllib/api/python/PythonMLLibAPI.scala
index 05f9a76d32..088ec6a0c0 100644
--- a/mllib/src/main/scala/org/apache/spark/mllib/api/python/PythonMLLibAPI.scala
+++ b/mllib/src/main/scala/org/apache/spark/mllib/api/python/PythonMLLibAPI.scala
@@ -1490,7 +1490,11 @@ private[spark] object SerDe extends Serializable {
   initialize()
 
   def dumps(obj: AnyRef): Array[Byte] = {
-    new Pickler().dumps(obj)
+    obj match {
+      // Pickler in Python side cannot deserialize Scala Array normally. See SPARK-12834.
+      case array: Array[_] => new Pickler().dumps(array.toSeq.asJava)
+      case _ => new Pickler().dumps(obj)
+    }
   }
 
   def loads(bytes: Array[Byte]): AnyRef = {
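
The array branch added to `dumps` can be tried in isolation. The following is a minimal sketch, not the code from the patch: `ArrayNormalization` and `normalize` are hypothetical names introduced here, and the `Pickler` call is left out so the example runs without the pickling library on the classpath. It only shows that a Scala `Array` matched with `Array[_]` is wrapped as a `java.util.List` before it would be handed to the pickler, while every other object passes through unchanged.

```scala
// JavaConverters matches the era of the patch; Scala 2.13+ deprecates it
// in favor of scala.jdk.CollectionConverters.
import scala.collection.JavaConverters._

// Hypothetical helper object, not part of Spark.
object ArrayNormalization {
  // Mirrors the pattern match in the patched dumps(), minus the pickling step:
  // arrays are exposed as java.util.List, anything else is returned as-is.
  def normalize(obj: AnyRef): AnyRef = obj match {
    case array: Array[_] => array.toSeq.asJava
    case other => other
  }

  def main(args: Array[String]): Unit = {
    val fromArray = normalize(Array(1.0, 2.0, 3.0))
    // The array is now viewed as a java.util.List.
    println(fromArray.isInstanceOf[java.util.List[_]])

    val fromString = normalize("abc")
    // Non-array inputs are untouched.
    println(fromString == "abc")
  }
}
```

In the patch itself, the result of each branch goes straight to `new Pickler().dumps(...)`; keeping the conversion inside `dumps` means callers on the Scala side do not need to know about the array special case.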