[SPARK-2951] [PySpark] support unpickle array.array for Python 2.6

Pyrolite cannot unpickle an array.array that was pickled by Python 2.6; this patch fixes that by extending Pyrolite.

Pyrolite also has a bug when unpickling arrays of float/double; this patch works around it by reversing the endianness for float/double. The workaround should be removed once a new Pyrolite release fixes the issue. I have sent a PR to Pyrolite to fix it: https://github.com/irmen/Pyrolite/pull/11

Author: Davies Liu <davies.liu@gmail.com>

Closes #2365 from davies/pickle and squashes the following commits:

f44f771 [Davies Liu] enable tests about array
3908f5c [Davies Liu] Merge branch 'master' into pickle
c77c87b [Davies Liu] cleanup debugging code
60e4e2f [Davies Liu] support unpickle array.array for Python 2.6
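
For illustration only: the endianness workaround mentioned in the commit message lives on the JVM side and is not part of the Python diff on this page. The sketch below is not the patch's code. It is a standalone Python example, with a hypothetical helper name (`reverse_endianness`), of what reversing the byte order of a float/double array means and why applying the reversal a second time restores the original values.

```python
import array
import struct
import sys


def reverse_endianness(values, typecode="d"):
    """Return a copy of `values` with the bytes of every item reversed.

    Hypothetical helper for illustration; it is not part of PySpark or Pyrolite.
    """
    swapped = array.array(typecode, values)
    swapped.byteswap()  # in-place byte swap of each item
    return swapped


original = array.array("d", [1.0, 2.5, -3.75])

# Reading the doubles with the wrong byte order yields different values ...
swapped = reverse_endianness(original)

# ... and reversing the byte order once more recovers the originals.
restored = reverse_endianness(swapped)

# The swapped bytes match an explicit pack in the non-native byte order.
other_order = ">" if sys.byteorder == "little" else "<"
expected_bytes = struct.pack(other_order + "3d", *original)
```

Because the reversal is its own inverse, a consumer that receives items in the wrong byte order can correct them with a single swap, which is why such a workaround is easy to remove once the underlying bug is fixed.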
author: Davies Liu <davies.liu@gmail.com> 2014-09-15 18:57:25 -0700
committer: Josh Rosen <joshrosen@apache.org> 2014-09-15 18:57:25 -0700
commit: da33acb8b681eca5e787d546fe922af76a151398 (patch)
tree: 1160991b887d207efc3cf3fcad786de8811570e2 /python
parent: fdb302f49c021227026909bdcdade7496059013f (diff)
download: spark-da33acb8b681eca5e787d546fe922af76a151398.tar.gz
spark-da33acb8b681eca5e787d546fe922af76a151398.tar.bz2
spark-da33acb8b681eca5e787d546fe922af76a151398.zip
2 files changed, 1 insertions, 2 deletions
diff --git a/python/pyspark/context.py b/python/pyspark/context.py
index 3ab98e262d..ea28e8cd8c 100644
--- a/python/pyspark/context.py
+++ b/python/pyspark/context.py
@@ -214,6 +214,7 @@ class SparkContext(object):
                 SparkContext._gateway = gateway or launch_gateway()
                 SparkContext._jvm = SparkContext._gateway.jvm
                 SparkContext._writeToFile = SparkContext._jvm.PythonRDD.writeToFile
+                SparkContext._jvm.SerDeUtil.initialize()
 
             if instance:
                 if (SparkContext._active_spark_context and
diff --git a/python/pyspark/tests.py b/python/pyspark/tests.py
index f3309a20fc..f255b44359 100644
--- a/python/pyspark/tests.py
+++ b/python/pyspark/tests.py
@@ -956,8 +956,6 @@ class TestOutputFormat(PySparkTestCase):
             conf=input_conf).collect())
         self.assertEqual(new_dataset, data)
 
-    @unittest.skipIf(sys.version_info[:2] <= (2, 6) or python_implementation() == "PyPy",
-                     "Skipped on 2.6 and PyPy until SPARK-2951 is fixed")
     def test_newhadoop_with_array(self):
         basepath = self.tempdir.name
         # use custom ArrayWritable types and converters to handle arrays