path: root/docs/python-programming-guide.md
author: Matei Zaharia <matei@eecs.berkeley.edu>, 2013-07-29 00:09:11 -0400
committer: Matei Zaharia <matei@eecs.berkeley.edu>, 2013-07-29 02:51:43 -0400
commit: feba7ee540fca28872957120e5e39b9e36466953
tree: c4349aa082e6727f638bc360ba6d9352a88959bc /docs/python-programming-guide.md
parent: d75c3086951f603ec30b2527c24559e053ed7f25
SPARK-815. Python parallelize() should split lists before batching
One unfortunate consequence of this fix is that we materialize any collections that are given to us as generators, but this seems necessary to get reasonable behavior on small collections. We could add a batchSize parameter later to bypass auto-computation of the batch size if this becomes a problem (e.g. if users really want to parallelize big generators nicely).
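The idea of splitting before batching can be illustrated with a minimal sketch. This is not the code from this commit: `split_then_batch`, `max_batch_size`, and the batch-size heuristic are hypothetical names and assumptions chosen for illustration only. It shows why the input has to be materialized (its length is needed to compute slice boundaries and a batch size) and why batching after splitting keeps every slice populated on small collections.

```python
def split_then_batch(data, num_slices, max_batch_size=1024):
    """Hypothetical helper: split `data` into slices, then batch each slice.

    Not PySpark's implementation; the batch-size heuristic is an assumption.
    """
    # Materialize the input so that len() works; this also consumes generators.
    items = list(data)

    # Assumed heuristic: a batch never exceeds one slice's share of the data,
    # so a small collection is still spread across all slices.
    batch_size = min(max(len(items) // num_slices, 1), max_batch_size)

    slices = []
    for i in range(num_slices):
        # Integer arithmetic gives contiguous, near-equal slice boundaries.
        start = i * len(items) // num_slices
        end = (i + 1) * len(items) // num_slices
        chunk = items[start:end]
        # Batch only within a slice, never across slice boundaries.
        batches = [chunk[j:j + batch_size]
                   for j in range(0, len(chunk), batch_size)]
        slices.append(batches)
    return slices


if __name__ == "__main__":
    print(split_then_batch(range(4), 2))
    print(split_then_batch((x for x in range(5)), 2))
```

Batching first with a large fixed batch size would place a four-element list into a single batch, leaving the other slice empty; splitting first avoids that at the cost of materializing generator input.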
Diffstat (limited to 'docs/python-programming-guide.md')
0 files changed, 0 insertions, 0 deletions