author | Andrew Ash <andrew@andrewash.com> | 2014-09-05 18:52:05 -0700 |
---|---|---|
committer | Matei Zaharia <matei@databricks.com> | 2014-09-05 18:52:05 -0700 |
commit | ba5bcaddecd54811d45c5fc79a013b3857d4c633 (patch) | |
tree | b5ada50e5a507971c37268f8bc216c588daa8c48 /examples/src/main/python/mllib | |
parent | 7ff8c45d714e0f2315910838b739c0c034672015 (diff) | |
SPARK-3211 .take() is OOM-prone with empty partitions
Instead of jumping straight from 1 partition to all partitions, grow exponentially:
multiply the number of partitions to attempt on each pass (doubling at first, quadrupling as of 8b2299a).
Fix proposed by Paul Nepywoda
Author: Andrew Ash <andrew@andrewash.com>
Closes #2117 from ash211/SPARK-3211 and squashes the following commits:
8b2299a [Andrew Ash] Quadruple instead of double for a minor speedup
e5f7e4d [Andrew Ash] Update comment to better reflect what we're doing
09a27f7 [Andrew Ash] Update PySpark to be less OOM-prone as well
3a156b8 [Andrew Ash] SPARK-3211 .take() is OOM-prone with empty partitions
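
The growth strategy described in the commit message can be illustrated with a minimal sketch. This is not Spark's implementation: plain Python lists stand in for partitions, and the function name `take_with_growth` and its `growth` parameter are hypothetical names chosen for this example. It assumes the first pass reads a single partition and that each later pass attempts `growth` times the number of partitions scanned so far, capped at the partitions that remain.

```python
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def take_with_growth(
    partitions: Sequence[Sequence[T]], num: int, growth: int = 4
) -> Tuple[List[T], List[int]]:
    """Collect up to `num` items, scanning partitions in growing batches.

    Hypothetical helper for illustration only. Returns the collected items
    and the number of partitions attempted on each pass.
    """
    items: List[T] = []
    batch_sizes: List[int] = []
    parts_scanned = 0
    total = len(partitions)

    while len(items) < num and parts_scanned < total:
        # First pass tries one partition; later passes try `growth` times
        # the number of partitions already scanned.
        parts_to_try = 1 if parts_scanned == 0 else parts_scanned * growth
        # Never attempt more partitions than are left.
        parts_to_try = min(parts_to_try, total - parts_scanned)

        for part in partitions[parts_scanned:parts_scanned + parts_to_try]:
            needed = num - len(items)
            if needed <= 0:
                break
            items.extend(part[:needed])

        batch_sizes.append(parts_to_try)
        parts_scanned += parts_to_try

    return items, batch_sizes


# 20 empty partitions followed by one partition that holds the data.
parts = [[] for _ in range(20)] + [[1, 2, 3]]
items, batch_sizes = take_with_growth(parts, num=2)
print(items, batch_sizes)
```

With many leading empty partitions, the batches in this sketch grow gradually rather than jumping from one partition to all of them in a single pass, which is the behavior the commit message identifies as OOM-prone.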
Diffstat (limited to 'examples/src/main/python/mllib')
0 files changed, 0 insertions, 0 deletions