[SPARK-18208][SHUFFLE] Executor OOM due to a growing LongArray in BytesToBytesMap - spark


author	Jie Xiong <jiexiong@fb.com>	2016-12-07 04:33:30 -0800
committer	Herman van Hovell <hvanhovell@databricks.com>	2016-12-07 04:33:30 -0800
commit	c496d03b5289f7c604661a12af86f6accddcf125 (patch)
tree	5fed7dcfc94bdf3df5b9fb6f8b63c7dcd45cf699 /sql
parent	79f5f281bb69cb2de9f64006180abd753e8ae427 (diff)

[SPARK-18208][SHUFFLE] Executor OOM due to a growing LongArray in BytesToBytesMap

## What changes were proposed in this pull request?

BytesToBytesMap currently does not release its in-memory storage (the longArray variable) after it spills to disk. This is typically not a problem during aggregation, because the longArray should be much smaller than the pages and because we grow the longArray at a conservative rate.

However, this can lead to an OOM when an already running task is allocated more than its fair share of memory, which can happen because of a scheduling delay. In this case the longArray can grow beyond the task's fair share of memory. This becomes problematic when the task spills and the longArray is not freed: subsequent memory allocation requests are then denied by the memory manager, resulting in an OOM.

This PR fixes this issue by freeing the longArray when the BytesToBytesMap spills.

## How was this patch tested?

Existing tests, and testing on real-world workloads.

Author: Jie Xiong <jiexiong@fb.com>
Author: jiexiong <jiexiong@gmail.com>

Closes #15722 from jiexiong/jie_oom_fix.
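
The failure mode described in this commit message can be illustrated with a small toy model. This is only a sketch under stated assumptions, not Spark's code: `Budget`, `ToyMap`, and `allocationSucceedsAfterSpill` are hypothetical names, the byte counts are arbitrary, and the task's memory limit is modeled as a single fixed number rather than a fair share that changes over time. It shows a spill that releases only the pages versus a spill that also releases the pointer array.

```java
// Toy model only: none of these classes exist in Spark.
final class SpillSketch {

    /** Hypothetical stand-in for a task's memory budget. */
    static final class Budget {
        private final long limit;
        private long used;

        Budget(long limit) { this.limit = limit; }

        /** Grants the request only if it fits under the limit. */
        boolean acquire(long bytes) {
            if (used + bytes > limit) {
                return false;
            }
            used += bytes;
            return true;
        }

        void release(long bytes) { used -= bytes; }
    }

    /** Hypothetical map holding a pointer array plus data pages. */
    static final class ToyMap {
        private final Budget budget;
        private final boolean freeArrayOnSpill;
        private long arrayBytes;
        private long pageBytes;

        ToyMap(Budget budget, boolean freeArrayOnSpill) {
            this.budget = budget;
            this.freeArrayOnSpill = freeArrayOnSpill;
        }

        boolean growArray(long bytes) {
            if (!budget.acquire(bytes)) return false;
            arrayBytes += bytes;
            return true;
        }

        boolean addPage(long bytes) {
            if (!budget.acquire(bytes)) return false;
            pageBytes += bytes;
            return true;
        }

        /** Releases the pages; releases the array too only when the flag is set. */
        long spill() {
            long freed = pageBytes;
            budget.release(pageBytes);
            pageBytes = 0;
            if (freeArrayOnSpill) {
                freed += arrayBytes;
                budget.release(arrayBytes);
                arrayBytes = 0;
            }
            return freed;
        }
    }

    /** Returns whether a 50-byte request succeeds after the map spills. */
    static boolean allocationSucceedsAfterSpill(boolean freeArrayOnSpill) {
        Budget budget = new Budget(100);
        ToyMap map = new ToyMap(budget, freeArrayOnSpill);
        map.growArray(70); // the array has grown large
        map.addPage(30);   // the budget is now fully used
        map.spill();
        return budget.acquire(50);
    }

    public static void main(String[] args) {
        System.out.println(allocationSucceedsAfterSpill(false)); // false
        System.out.println(allocationSucceedsAfterSpill(true));  // true
    }
}
```

In this model, the spill that keeps the array leaves 70 of the 100 bytes in use, so the follow-up 50-byte request is denied. The spill that frees the array leaves the budget empty, so the same request succeeds.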

Diffstat (limited to 'sql')

0 files changed, 0 insertions, 0 deletions

