author: Josh Rosen <joshrosen@databricks.com> 2014-11-19 16:50:21 -0800
committer: Josh Rosen <joshrosen@databricks.com> 2014-11-19 16:50:44 -0800
commit: a7c64cc8f939b6c777e296f775d68fb7088a7530
tree: e760e66bbe5ad2fe04b1ef81b6e4629fc9be5dbd
parent: a250ca369208b23503d7fff1cf9ee52e2e1ba3e2
[SPARK-4495] Fix memory leak in JobProgressListener
This commit fixes a memory leak in JobProgressListener that I introduced in SPARK-2321 and adds a testing framework to ensure that it’s very difficult to inadvertently introduce new memory leaks.

This solution might be overkill, but the main idea is to partition JobProgressListener's state into three buckets:

- collections that should be empty once Spark is idle,
- collections that must obey some hard size limit, and
- collections that have a soft size limit (they can grow arbitrarily large when Spark is active but must shrink to fit within some bound after Spark becomes idle).

Based on this, we can write fairly generic tests that run workloads that submit more than `spark.ui.retainedStages` stages and `spark.ui.retainedJobs` jobs, then check that these various collections' sizes obey their contracts.

Author: Josh Rosen <joshrosen@databricks.com>

Closes #3372 from JoshRosen/SPARK-4495 and squashes the following commits:

c73fab5 [Josh Rosen] "data structures" -> collections
be72e81 [Josh Rosen] [SPARK-4495] Fix memory leaks in JobProgressListener

(cherry picked from commit 04d462f648aba7b18fc293b7189b86af70e421bc)
Signed-off-by: Josh Rosen <joshrosen@databricks.com>
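
To make the three-bucket idea in the message above concrete, here is a minimal, language-neutral sketch in Python. It is not the Scala code from this commit: `ToyProgressListener`, its fields, `run_workload`, and `idle_contract_violations` are all hypothetical names invented for illustration, and `retained_jobs` only stands in for a setting like `spark.ui.retainedJobs`.

```python
from collections import deque


class ToyProgressListener:
    """Hypothetical stand-in for a progress listener (not Spark's class)."""

    def __init__(self, retained_jobs):
        if retained_jobs < 1:
            raise ValueError("retained_jobs must be at least 1")
        self.retained_jobs = retained_jobs
        # Bucket 1: must be empty once the system is idle.
        self.active_jobs = {}
        # Bucket 2: hard size limit, never exceeds retained_jobs.
        self.completed_jobs = deque(maxlen=retained_jobs)
        # Bucket 3: soft size limit, may grow while jobs are running
        # but must shrink back to retained_jobs entries when idle.
        self.job_data = {}

    def on_job_start(self, job_id):
        self.active_jobs[job_id] = "running"
        self.job_data[job_id] = {"status": "running"}

    def on_job_end(self, job_id):
        del self.active_jobs[job_id]
        # Drop per-job data for the job about to be evicted; forgetting
        # this step is the kind of leak the contracts are meant to catch.
        if len(self.completed_jobs) == self.retained_jobs:
            evicted = self.completed_jobs[0]
            self.job_data.pop(evicted, None)
        self.completed_jobs.append(job_id)
        self.job_data[job_id]["status"] = "done"


def run_workload(listener, num_jobs):
    """Start every job before finishing any; return peak job_data size."""
    for job_id in range(num_jobs):
        listener.on_job_start(job_id)
    peak = len(listener.job_data)
    for job_id in range(num_jobs):
        listener.on_job_end(job_id)
    return peak


def idle_contract_violations(listener):
    """Return the names of collections that break their idle contract."""
    violations = []
    if len(listener.active_jobs) != 0:
        violations.append("active_jobs")
    if len(listener.completed_jobs) > listener.retained_jobs:
        violations.append("completed_jobs")
    if len(listener.job_data) > listener.retained_jobs:
        violations.append("job_data")
    return violations
```

A generic test in this style submits more jobs than the retention limit (for example `run_workload(ToyProgressListener(retained_jobs=3), 10)`) and then asserts that `idle_contract_violations` returns an empty list. The check does not need to know why a collection grew, only which contract it is supposed to obey.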
Diffstat (limited to 'project')
0 files changed, 0 insertions, 0 deletions