[SPARK-7902] [SPARK-6289] [SPARK-8685] [SQL] [PYSPARK] Refactor of serialization for Python DataFrame - spark

author	Davies Liu <davies@databricks.com>	2015-07-09 14:43:38 -0700
committer	Davies Liu <davies.liu@gmail.com>	2015-07-09 14:43:38 -0700
commit	c9e2ef52bb54f35a904427389dc492d61f29b018 (patch)
tree	90887ae7055aa78751561119083bd09ac099e0f4 /external/kafka
parent	3ccebf36c5abe04702d4cf223552a94034d980fb (diff)
download	spark-c9e2ef52bb54f35a904427389dc492d61f29b018.tar.gz spark-c9e2ef52bb54f35a904427389dc492d61f29b018.tar.bz2 spark-c9e2ef52bb54f35a904427389dc492d61f29b018.zip

[SPARK-7902] [SPARK-6289] [SPARK-8685] [SQL] [PYSPARK] Refactor of serialization for Python DataFrame

This PR fixes the long-standing issue of serialization between Python RDD and DataFrame. It switches to a customized Pickler for InternalRow to enable customized unpickling (type conversion, especially for UDT), so UDTs are now supported in UDFs. cc mengxr. There is no generated `Row` anymore.

Author: Davies Liu <davies@databricks.com>

Closes #7301 from davies/sql_ser and squashes the following commits:

81bef71 [Davies Liu] address comments
e9217bd [Davies Liu] add regression tests
db34167 [Davies Liu] Refactor of serialization for Python DataFrame
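To illustrate the general idea of a customized pickler paired with customized unpickling, here is a minimal sketch using only Python's standard `pickle` module. It is not the code from this commit: `Point`, `RowPickler`, `RowUnpickler`, `dumps`, and `loads` are hypothetical names, and `Point` merely stands in for a UDT. The pickler writes the user-defined value in a plain tuple form, and the unpickler converts it back to the rich type while loading.

```python
import io
import pickle


class Point:
    """Hypothetical user-defined type, standing in for a UDT."""

    def __init__(self, x, y):
        self.x = float(x)
        self.y = float(y)

    def __eq__(self, other):
        return isinstance(other, Point) and (self.x, self.y) == (other.x, other.y)


class RowPickler(pickle.Pickler):
    """Hypothetical pickler that stores Point values in a plain tuple form."""

    def persistent_id(self, obj):
        if isinstance(obj, Point):
            # Emit a tagged, plain representation instead of the object itself.
            return ("Point", (obj.x, obj.y))
        return None  # everything else is pickled the default way


class RowUnpickler(pickle.Unpickler):
    """Hypothetical unpickler that converts the plain form back to Point."""

    def persistent_load(self, pid):
        tag, raw = pid
        if tag == "Point":
            return Point(*raw)
        raise pickle.UnpicklingError("unknown persistent id: %r" % (pid,))


def dumps(obj):
    buf = io.BytesIO()
    RowPickler(buf).dump(obj)
    return buf.getvalue()


def loads(data):
    return RowUnpickler(io.BytesIO(data)).load()


# A tuple standing in for a row that contains a user-defined value.
row = ("a", Point(1, 2))
restored = loads(dumps(row))
```

The type conversion lives entirely in the pickler/unpickler pair, so the row container itself needs no generated class to carry it.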

Diffstat (limited to 'external/kafka')

0 files changed, 0 insertions, 0 deletions

