author: Xiangrui Meng <meng@databricks.com> 2015-03-02 17:14:34 -0800
committer: Xiangrui Meng <meng@databricks.com> 2015-03-02 17:14:43 -0800
commit: 1b8ab5752fccbc08c3f76c50bc384b89231d0a78
tree: eed9b1b3ffc2e229250f6ce717cd7fc1094e09c3
parent: ea69cf28e6874d205fca70872a637547407bc08b
[SPARK-6121][SQL][MLLIB] simpleString for UDT
`df.dtypes` shows `null` for UDTs. This PR uses `udt` by default and `VectorUDT` overwrites it with `vector`.

jkbradley davies

Author: Xiangrui Meng <meng@databricks.com>

Closes #4858 from mengxr/SPARK-6121 and squashes the following commits:

34f0a77 [Xiangrui Meng] simpleString for UDT

(cherry picked from commit 2db6a853a53b4c25e35983bc489510abb8a73e1d)
Signed-off-by: Xiangrui Meng <meng@databricks.com>
-rw-r--r--  python/pyspark/mllib/linalg.py  3
-rw-r--r--  python/pyspark/sql/types.py     2
2 files changed, 4 insertions, 1 deletions
diff --git a/python/pyspark/mllib/linalg.py b/python/pyspark/mllib/linalg.py
index 597012b1c9..f5aad28afd 100644
--- a/python/pyspark/mllib/linalg.py
+++ b/python/pyspark/mllib/linalg.py
@@ -152,6 +152,9 @@ class VectorUDT(UserDefinedType):
         else:
             raise ValueError("do not recognize type %r" % tpe)
 
+    def simpleString(self):
+        return "vector"
+
 
 class Vector(object):
diff --git a/python/pyspark/sql/types.py b/python/pyspark/sql/types.py
index 31a861e1fe..0169028ccc 100644
--- a/python/pyspark/sql/types.py
+++ b/python/pyspark/sql/types.py
@@ -468,7 +468,7 @@ class UserDefinedType(DataType):
         raise NotImplementedError("UDT must implement deserialize().")
 
     def simpleString(self):
-        return 'null'
+        return 'udt'
 
     def json(self):
         return json.dumps(self.jsonValue(), separators=(',', ':'), sort_keys=True)
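
The effect of the two hunks can be sketched without a Spark installation. The classes below are simplified stand-ins for `UserDefinedType` and `VectorUDT`, not the real PySpark classes; `PointUDT` and the `dtypes` helper are hypothetical names introduced only for illustration, on the assumption that `df.dtypes` pairs each column name with the result of `simpleString()`.

```python
class UserDefinedType(object):
    """Simplified stand-in for pyspark.sql.types.UserDefinedType."""

    def simpleString(self):
        # Generic short name used for any UDT after this patch
        # (previously 'null').
        return "udt"


class VectorUDT(UserDefinedType):
    """Simplified stand-in for pyspark.mllib.linalg.VectorUDT."""

    def simpleString(self):
        # More specific name that replaces the generic default.
        return "vector"


class PointUDT(UserDefinedType):
    """Hypothetical UDT that does not define its own simpleString()."""


def dtypes(fields):
    """Hypothetical helper: pair each column name with its type's short name."""
    return [(name, data_type.simpleString()) for name, data_type in fields]


print(dtypes([("features", VectorUDT()), ("location", PointUDT())]))
# [('features', 'vector'), ('location', 'udt')]
```

A UDT that does not define `simpleString()` falls back to `udt`, so only types that want a more descriptive name need to override the method.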