author | Xiangrui Meng <meng@databricks.com> | 2015-03-02 17:14:34 -0800 |
---|---|---|
committer | Xiangrui Meng <meng@databricks.com> | 2015-03-02 17:14:34 -0800 |
commit | 2db6a853a53b4c25e35983bc489510abb8a73e1d (patch) | |
tree | 06baffc2fe5b2dbd77bc05f4972ce525a853479f /python | |
parent | e3a88d1104ebdb858f0509f56d7bb536037e5f63 (diff) | |
download | spark-2db6a853a53b4c25e35983bc489510abb8a73e1d.tar.gz spark-2db6a853a53b4c25e35983bc489510abb8a73e1d.tar.bz2 spark-2db6a853a53b4c25e35983bc489510abb8a73e1d.zip |
[SPARK-6121][SQL][MLLIB] simpleString for UDT
`df.dtypes` shows `null` for UDTs. This PR uses `udt` by default, and `VectorUDT` overrides it with `vector`.
jkbradley davies
Author: Xiangrui Meng <meng@databricks.com>
Closes #4858 from mengxr/SPARK-6121 and squashes the following commits:
34f0a77 [Xiangrui Meng] simpleString for UDT
Diffstat (limited to 'python')
-rw-r--r-- | python/pyspark/mllib/linalg.py | 3 | ||||
-rw-r--r-- | python/pyspark/sql/types.py | 2 |
2 files changed, 4 insertions, 1 deletions
diff --git a/python/pyspark/mllib/linalg.py b/python/pyspark/mllib/linalg.py
index 597012b1c9..f5aad28afd 100644
--- a/python/pyspark/mllib/linalg.py
+++ b/python/pyspark/mllib/linalg.py
@@ -152,6 +152,9 @@ class VectorUDT(UserDefinedType):
         else:
             raise ValueError("do not recognize type %r" % tpe)
 
+    def simpleString(self):
+        return "vector"
+
 
 class Vector(object):
 
diff --git a/python/pyspark/sql/types.py b/python/pyspark/sql/types.py
index 31a861e1fe..0169028ccc 100644
--- a/python/pyspark/sql/types.py
+++ b/python/pyspark/sql/types.py
@@ -468,7 +468,7 @@ class UserDefinedType(DataType):
         raise NotImplementedError("UDT must implement deserialize().")
 
     def simpleString(self):
-        return 'null'
+        return 'udt'
 
     def json(self):
         return json.dumps(self.jsonValue(), separators=(',', ':'), sort_keys=True)
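
The effect of the two hunks can be sketched without a Spark installation. The classes below are simplified stand-ins, not the real `pyspark.sql.types` or `pyspark.mllib.linalg` classes; `PointUDT` and the `dtypes` helper are hypothetical names introduced only for illustration. The sketch assumes that `df.dtypes` reports each column's `simpleString()`, which is what the commit message implies.

```python
# Simplified stand-ins for the classes touched by this commit.
# They only model simpleString(); the real pyspark classes do much more.


class UserDefinedType(object):
    def simpleString(self):
        # Default after this commit ('null' before it).
        return 'udt'


class VectorUDT(UserDefinedType):
    def simpleString(self):
        # Override added by this commit.
        return "vector"


class PointUDT(UserDefinedType):
    # Hypothetical UDT that does not override simpleString().
    pass


def dtypes(schema):
    """Hypothetical helper mimicking df.dtypes: (column name, simple type string) pairs."""
    return [(name, data_type.simpleString()) for name, data_type in schema]


schema = [("features", VectorUDT()), ("location", PointUDT())]
result = dtypes(schema)
print(result)  # [('features', 'vector'), ('location', 'udt')]
```

A UDT that defines its own `simpleString()` reports that name, while any other UDT falls back to `udt` rather than the misleading `null`.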