author | Xiangrui Meng <meng@databricks.com> | 2015-03-02 17:14:34 -0800 |
---|---|---|
committer | Xiangrui Meng <meng@databricks.com> | 2015-03-02 17:14:43 -0800 |
commit | 1b8ab5752fccbc08c3f76c50bc384b89231d0a78 (patch) | |
tree | eed9b1b3ffc2e229250f6ce717cd7fc1094e09c3 /python | |
parent | ea69cf28e6874d205fca70872a637547407bc08b (diff) | |
[SPARK-6121][SQL][MLLIB] simpleString for UDT
`df.dtypes` shows `null` for UDT columns. This PR makes `simpleString` return `udt` by default, and `VectorUDT` overrides it to return `vector`.
jkbradley davies
Author: Xiangrui Meng <meng@databricks.com>
Closes #4858 from mengxr/SPARK-6121 and squashes the following commits:
34f0a77 [Xiangrui Meng] simpleString for UDT
(cherry picked from commit 2db6a853a53b4c25e35983bc489510abb8a73e1d)
Signed-off-by: Xiangrui Meng <meng@databricks.com>
Diffstat (limited to 'python')
-rw-r--r-- | python/pyspark/mllib/linalg.py | 3 |
-rw-r--r-- | python/pyspark/sql/types.py | 2 |
2 files changed, 4 insertions, 1 deletion
diff --git a/python/pyspark/mllib/linalg.py b/python/pyspark/mllib/linalg.py
index 597012b1c9..f5aad28afd 100644
--- a/python/pyspark/mllib/linalg.py
+++ b/python/pyspark/mllib/linalg.py
@@ -152,6 +152,9 @@ class VectorUDT(UserDefinedType):
         else:
             raise ValueError("do not recognize type %r" % tpe)
 
+    def simpleString(self):
+        return "vector"
+
 
 class Vector(object):
 
diff --git a/python/pyspark/sql/types.py b/python/pyspark/sql/types.py
index 31a861e1fe..0169028ccc 100644
--- a/python/pyspark/sql/types.py
+++ b/python/pyspark/sql/types.py
@@ -468,7 +468,7 @@ class UserDefinedType(DataType):
         raise NotImplementedError("UDT must implement deserialize().")
 
     def simpleString(self):
-        return 'null'
+        return 'udt'
 
     def json(self):
         return json.dumps(self.jsonValue(), separators=(',', ':'), sort_keys=True)
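The effect of the two hunks can be sketched without a Spark installation. The classes below are minimal stand-ins that copy only the `simpleString` methods from the diff; they are not the real `pyspark` classes. `PointUDT` and the `dtypes` helper are hypothetical names introduced for illustration, on the assumption that `df.dtypes` reports each column's `simpleString()`.

```python
class UserDefinedType(object):
    """Stand-in for pyspark.sql.types.UserDefinedType (sketch only)."""

    def simpleString(self):
        # Default after the patch: a generic label instead of 'null'.
        return 'udt'


class VectorUDT(UserDefinedType):
    """Stand-in for pyspark.mllib.linalg.VectorUDT (sketch only)."""

    def simpleString(self):
        # Overrides the generic default with a more specific name.
        return "vector"


class PointUDT(UserDefinedType):
    """Hypothetical UDT that keeps the inherited default."""


def dtypes(fields):
    """Hypothetical helper: map (name, data_type) pairs to (name, simple string)."""
    return [(name, data_type.simpleString()) for name, data_type in fields]


schema = [("features", VectorUDT()), ("location", PointUDT())]
result = dtypes(schema)
print(result)  # [('features', 'vector'), ('location', 'udt')]
```

A UDT that does not override `simpleString` falls back to `udt`, so only types that want a friendlier name need to implement the method.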