diff options
author | Andre Schumacher <schumach@icsi.berkeley.edu> | 2013-10-04 11:56:47 -0700 |
---|---|---|
committer | Andre Schumacher <schumach@icsi.berkeley.edu> | 2013-10-04 11:56:47 -0700 |
commit | c84946fe210069259f5d42ab8fd22a5ddae91d12 (patch) | |
tree | c57a9c9887f3a26334faa538b830880b8a035885 /python/pyspark/serializers.py | |
parent | 232765f7b26d933caa14e0e1bc0e4937dae90523 (diff) | |
download | spark-c84946fe210069259f5d42ab8fd22a5ddae91d12.tar.gz spark-c84946fe210069259f5d42ab8fd22a5ddae91d12.tar.bz2 spark-c84946fe210069259f5d42ab8fd22a5ddae91d12.zip |
Fixing SPARK-602: PythonPartitioner
Currently PythonPartitioner determines partition ID by hashing a
byte-array representation of PySpark's key. This PR lets
PythonPartitioner use the actual partition ID, which is required e.g.
for sorting via PySpark.
Diffstat (limited to 'python/pyspark/serializers.py')
-rw-r--r-- | python/pyspark/serializers.py | 4 |
1 files changed, 4 insertions, 0 deletions
diff --git a/python/pyspark/serializers.py b/python/pyspark/serializers.py index fecacd1241..54fed1c9c7 100644 --- a/python/pyspark/serializers.py +++ b/python/pyspark/serializers.py @@ -67,6 +67,10 @@ def write_long(value, stream): stream.write(struct.pack("!q", value)) +def pack_long(value): + return struct.pack("!q", value) + + def read_int(stream): length = stream.read(4) if length == "": |