author | Xiangrui Meng <meng@databricks.com> | 2014-11-03 19:29:11 -0800 |
---|---|---|
committer | Xiangrui Meng <meng@databricks.com> | 2014-11-03 19:29:11 -0800 |
commit | 04450d11548cfb25d4fb77d4a33e3a7cd4254183 (patch) | |
tree | 13b5c6fd1ac8c400cf59a51fe0b84b60c64d400f /python/pyspark/mllib | |
parent | c5912ecc7b392a13089ae735c07c2d7256de36c6 (diff) | |
[SPARK-4192][SQL] Internal API for Python UDT
Following #2919, this PR adds a Python UDT (for internal use only) with tests under "pyspark.tests". Before `SQLContext.applySchema`, we check whether we need to convert user-type instances into SQL-recognizable data. In the current implementation, a Python UDT must be paired with a Scala UDT for serialization on the JVM side. A follow-up PR will add VectorUDT in MLlib for both Scala and Python.
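
The conversion step described above can be sketched in plain Python. This is only an illustration of the idea, not the code in this PR: `Point`, `PointUDT`, `needs_conversion`, `to_sql_row`, the `scala_udt` string, and the list-based schema are all hypothetical names chosen for the example, and nothing here calls into Spark.

```python
# Illustrative sketch only: plain Python, no Spark required.
# Every name below is hypothetical and is not the API added by this PR.


class Point:
    """A user-defined Python type that SQL does not understand natively."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        return isinstance(other, Point) and (self.x, self.y) == (other.x, other.y)


class PointUDT:
    """Describes how to map a Point to and from a SQL-recognizable value."""

    # Assumed name of the paired JVM-side UDT (hypothetical placeholder).
    scala_udt = "org.example.PointUDT"

    def sql_type(self):
        # Underlying SQL representation: an array of two doubles.
        return "array<double>"

    def serialize(self, obj):
        # User-type instance -> SQL-recognizable data.
        return [float(obj.x), float(obj.y)]

    def deserialize(self, datum):
        # SQL-recognizable data -> user-type instance.
        return Point(datum[0], datum[1])


def needs_conversion(schema):
    """Return True if any field type in the schema is a UDT."""
    return any(isinstance(field_type, PointUDT) for field_type in schema)


def to_sql_row(row, schema):
    """Serialize user-type values; pass all other values through unchanged."""
    return tuple(
        field_type.serialize(value) if isinstance(field_type, PointUDT) else value
        for value, field_type in zip(row, schema)
    )


schema = ["string", PointUDT()]
row = ("a", Point(1, 2))

# Only pay the conversion cost when the schema actually contains a UDT.
if needs_conversion(schema):
    row = to_sql_row(row, schema)

print(row)
```

In this sketch the check runs once per schema rather than once per value, so rows without user types are left untouched.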
marmbrus jkbradley davies
Author: Xiangrui Meng <meng@databricks.com>
Closes #3068 from mengxr/SPARK-4192-sql and squashes the following commits:
acff637 [Xiangrui Meng] merge master
dba5ea7 [Xiangrui Meng] only use pyClass for Python UDT output sqlType as well
2c9d7e4 [Xiangrui Meng] move import to global setup; update needsConversion
7c4a6a9 [Xiangrui Meng] address comments
75223db [Xiangrui Meng] minor update
f740379 [Xiangrui Meng] remove UDT from default imports
e98d9d0 [Xiangrui Meng] fix py style
4e84fce [Xiangrui Meng] remove local hive tests and add more tests
39f19e0 [Xiangrui Meng] add tests
b7f666d [Xiangrui Meng] add Python UDT
Diffstat (limited to 'python/pyspark/mllib')
0 files changed, 0 insertions, 0 deletions