author Josh Rosen <joshrosen@databricks.com> 2014-11-19 16:50:21 -0800
committer Josh Rosen <joshrosen@databricks.com> 2014-11-19 16:50:21 -0800
commit 04d462f648aba7b18fc293b7189b86af70e421bc
tree d5816c007919740531c942b23a5f99e23cc7c3a6 /docs/graphx-programming-guide.md
parent c3002c4a61c4fc5b966aa384c41c3cba33de0aa6
[SPARK-4495] Fix memory leak in JobProgressListener

This commit fixes a memory leak in JobProgressListener that I introduced in SPARK-2321 and adds a testing framework to ensure that it’s very difficult to inadvertently introduce new memory leaks.

This solution might be overkill, but the main idea is to partition JobProgressListener's state into three buckets: collections that should be empty once Spark is idle, collections that must obey some hard size limit, and collections that have a soft size limit (they can grow arbitrarily large when Spark is active but must shrink to fit within some bound after Spark becomes idle).

Based on this, we can write fairly generic tests that run workloads that submit more than `spark.ui.retainedStages` stages and `spark.ui.retainedJobs` jobs then check that these various collections' sizes obey their contracts.

Author: Josh Rosen <joshrosen@databricks.com>

Closes #3372 from JoshRosen/SPARK-4495 and squashes the following commits:

c73fab5 [Josh Rosen] "data structures" -> collections
be72e81 [Josh Rosen] [SPARK-4495] Fix memory leaks in JobProgressListener

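
The three-bucket contract described in the commit message can be illustrated with a small, language-neutral sketch. This is not the code from this commit: `ToyProgressListener`, its fields, and `contract_violations` are hypothetical names invented for illustration, and the toy listener only mimics the general shape of a listener that tracks jobs. It assumes a single `retained_jobs` bound standing in for settings such as `spark.ui.retainedJobs`.

```python
from collections import OrderedDict


class ToyProgressListener:
    """Hypothetical stand-in for a job-tracking listener (not Spark's class)."""

    def __init__(self, retained_jobs):
        self.retained_jobs = retained_jobs
        # Bucket 1: must be empty once the system is idle.
        self.active_jobs = {}
        # Bucket 2: must never exceed the hard limit `retained_jobs`.
        self.completed_jobs = []
        # Bucket 3: soft limit. May grow while jobs are running, but must
        # shrink to at most `retained_jobs` entries once idle.
        self.job_to_stages = OrderedDict()

    def on_job_start(self, job_id, stage_ids):
        self.active_jobs[job_id] = stage_ids
        self.job_to_stages[job_id] = stage_ids

    def on_job_end(self, job_id):
        del self.active_jobs[job_id]
        self.completed_jobs.append(job_id)
        # Evict the oldest completed jobs and their per-job state together,
        # so that the soft-limited map cannot leak entries.
        while len(self.completed_jobs) > self.retained_jobs:
            evicted = self.completed_jobs.pop(0)
            self.job_to_stages.pop(evicted, None)


def contract_violations(listener, idle):
    """Return a list describing every size contract that is currently broken."""
    violations = []
    if len(listener.completed_jobs) > listener.retained_jobs:
        violations.append("completed_jobs exceeds hard limit")
    if idle:
        if listener.active_jobs:
            violations.append("active_jobs not empty while idle")
        if len(listener.job_to_stages) > listener.retained_jobs:
            violations.append("job_to_stages exceeds soft limit while idle")
    return violations
```

A generic test in this style would start and finish more jobs than `retained_jobs`, then assert that `contract_violations(listener, idle=True)` is empty. A listener that forgets to clean up one of its collections then fails the check without the test needing to know which code path leaked.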
Diffstat (limited to 'docs/graphx-programming-guide.md')
0 files changed, 0 insertions, 0 deletions