SPARK-1255: Allow user to pass Serializer object instead of class name for shuffle. - spark


author	Reynold Xin <rxin@apache.org>	2014-03-16 09:57:21 -0700
committer	Patrick Wendell <pwendell@gmail.com>	2014-03-16 09:57:21 -0700
commit	f5486e9f75d62919583da5ecf9a9ad00222b2227 (patch)
tree	42bde2b308647eeaef2c7a92aad176916d884310 /python/pyspark/rdd.py
parent	97e4459e1e4cca8696535e10a91733c15f960107 (diff)

SPARK-1255: Allow user to pass Serializer object instead of class name for shuffle.

This is more general than passing a class name as a string and leaves more room for performance optimizations.

Note that this is technically an API-breaking change in the following two ways:

1. The shuffle serializer specification in ShuffleDependency now requires an object instead of a String (of the class name), but I suspect nobody else in this world has used this API other than me in GraphX and Shark.
2. Serializers in Spark from now on are required to be serializable.

Author: Reynold Xin <rxin@apache.org>

Closes #149 from rxin/serializer and squashes the following commits:

5acaccd [Reynold Xin] Properly call serializer's constructors.
2a8d75a [Reynold Xin] Added more documentation for the serializer option in ShuffleDependency.
7420185 [Reynold Xin] Allow user to pass Serializer object instead of class name for shuffle.
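To illustrate why passing a serializer object is more general than passing a class name, here is a minimal Python sketch. Everything in it is a hypothetical stand-in and not Spark's actual API: `PickleSerializer`, the `SERIALIZERS` registry, `serializer_from_name`, and this simplified `ShuffleDependency` are names invented for the example. The assumption is that a class name can only ever yield a default-constructed instance, while an object can carry its own configuration.

```python
import pickle


class PickleSerializer:
    """Hypothetical serializer; a stand-in, not Spark's Serializer class."""

    def __init__(self, protocol=2):
        # Configuration that a bare class-name string cannot express.
        self.protocol = protocol

    def dumps(self, obj):
        return pickle.dumps(obj, protocol=self.protocol)

    def loads(self, data):
        return pickle.loads(data)


# Hypothetical registry used to resolve a class-name string to a class.
SERIALIZERS = {"PickleSerializer": PickleSerializer}


def serializer_from_name(class_name):
    """Old style: only the name is known, so only the no-argument constructor can be called."""
    return SERIALIZERS[class_name]()


class ShuffleDependency:
    """Hypothetical dependency that accepts a serializer object; None selects a default."""

    def __init__(self, serializer=None):
        self.serializer = serializer if serializer is not None else PickleSerializer()


# Old style: resolved from a string, always default-configured.
by_name = ShuffleDependency(serializer_from_name("PickleSerializer"))

# New style: the caller hands over a fully configured object.
by_object = ShuffleDependency(PickleSerializer(protocol=4))

record = ("key", [1, 2, 3])
round_trip = by_object.serializer.loads(by_object.serializer.dumps(record))
```

In this sketch the object-based form keeps whatever settings the caller chose. The cost is the second point in the commit message: because the object itself, and not just its name, travels with the dependency, it has to be serializable.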

Diffstat (limited to 'python/pyspark/rdd.py')

0 files changed, 0 insertions, 0 deletions

