author: Josh Rosen <joshrosen@databricks.com> 2015-05-20 16:37:11 -0700
committer: Josh Rosen <joshrosen@databricks.com> 2015-05-20 16:37:11 -0700
commit: 7956dd7ab03e1542d89dd94c043f1e5131684199
tree: a753324eb6f10972f914ad5fbab29d97b88c8e26 /python
parent: 3c434cbfd0d6821e5bcf572be792b787a514018b
[SPARK-7698] Cache and reuse buffers in ExecutorMemoryAllocator when using heap allocation
When on-heap memory allocation is used, ExecutorMemoryManager should maintain a cache / pool of buffers for re-use by tasks. This will significantly improve the performance of Tungsten's new sort-shuffle for jobs with many short-lived tasks by eliminating a major source of GC.

This pull request is a minimum viable implementation of this idea. In its current form, this patch significantly improves performance on a stress test which launches huge numbers of short-lived shuffle map tasks back-to-back in the same JVM.

Author: Josh Rosen <joshrosen@databricks.com>

Closes #6227 from JoshRosen/SPARK-7698 and squashes the following commits:

fd6cb55 [Josh Rosen] SoftReference -> WeakReference
b154e86 [Josh Rosen] WIP sketch of pooling in ExecutorMemoryManager
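
For illustration only, the pooling idea described above can be sketched roughly as follows. This is not the code from the patch: the class name PooledHeapAllocator, the 1 MB pooling threshold, and the allocate/free signatures are all assumptions made for the example. The only details taken from the commit message are that on-heap buffers are cached for re-use and that the cache holds them through WeakReference rather than SoftReference.

```java
import java.lang.ref.WeakReference;
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of an on-heap buffer pool; not Spark's ExecutorMemoryManager.
final class PooledHeapAllocator {
    // Assumption: only large buffers are worth pooling; small ones are cheap to allocate.
    private static final int POOLING_THRESHOLD_BYTES = 1024 * 1024;

    // Freed buffers, keyed by size in bytes. Holding them through WeakReference
    // means the pool never prevents the GC from reclaiming an idle buffer.
    private final Map<Integer, ArrayDeque<WeakReference<long[]>>> pool = new HashMap<>();

    // Returns a buffer of exactly sizeBytes, re-using a pooled one when possible.
    // A re-used buffer is NOT zeroed and may contain data from its previous user.
    synchronized long[] allocate(int sizeBytes) {
        if (sizeBytes < 0 || sizeBytes % 8 != 0) {
            throw new IllegalArgumentException("size must be a non-negative multiple of 8");
        }
        if (sizeBytes >= POOLING_THRESHOLD_BYTES) {
            ArrayDeque<WeakReference<long[]>> free = pool.get(sizeBytes);
            if (free != null) {
                while (!free.isEmpty()) {
                    long[] buffer = free.pop().get();
                    if (buffer != null) {
                        return buffer; // still alive, so re-use it
                    }
                    // otherwise the GC already cleared this entry; try the next one
                }
                pool.remove(sizeBytes); // nothing usable left for this size
            }
        }
        return new long[sizeBytes / 8];
    }

    // Hands a buffer back so that a later task can re-use it.
    synchronized void free(long[] buffer) {
        int sizeBytes = buffer.length * 8;
        if (sizeBytes >= POOLING_THRESHOLD_BYTES) {
            pool.computeIfAbsent(sizeBytes, k -> new ArrayDeque<>())
                .push(new WeakReference<>(buffer));
        }
    }
}
```

In this sketch a task that frees a 1 MB buffer and then requests another 1 MB buffer gets the same array back, provided the garbage collector has not cleared the weak reference in between. Buffers below the threshold always go through a fresh allocation.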
Diffstat (limited to 'python')
0 files changed, 0 insertions, 0 deletions