author | Rex Xiong <pengx@microsoft.com> | 2015-05-14 16:55:31 -0700 |
---|---|---|
committer | Andrew Or <andrew@databricks.com> | 2015-05-14 16:55:31 -0700 |
commit | 93dbb3ad83fd60444a38c3dc87a2053c667123af (patch) | |
tree | 0ce7e0b6498f1924bf0b30414531c6bafebe3cd1 | |
parent | 11a1a135d1fe892cd48a9116acc7554846aed84c (diff) | |
[SPARK-7598] [DEPLOY] Add aliveWorkers metrics in Master
In a Spark Standalone setup, workers that are DEAD stay in the master's worker list for a while.
The master.workers metric reports only the total number of workers, so we need a separate metric for the number of workers that are actually ALIVE in order to check that the cluster is healthy.
Author: Rex Xiong <pengx@microsoft.com>
Closes #6117 from twilightgod/add-aliveWorker-metrics and squashes the following commits:
6be69a5 [Rex Xiong] Fix comment for aliveWorkers metrics
a882f39 [Rex Xiong] Fix style for aliveWorkers metrics
38ce955 [Rex Xiong] Add aliveWorkers metrics in Master
-rw-r--r-- | core/src/main/scala/org/apache/spark/deploy/master/MasterSource.scala | 5 |
1 files changed, 5 insertions, 0 deletions
diff --git a/core/src/main/scala/org/apache/spark/deploy/master/MasterSource.scala b/core/src/main/scala/org/apache/spark/deploy/master/MasterSource.scala
index 9c3f79f124..66a9ff3867 100644
--- a/core/src/main/scala/org/apache/spark/deploy/master/MasterSource.scala
+++ b/core/src/main/scala/org/apache/spark/deploy/master/MasterSource.scala
@@ -30,6 +30,11 @@ private[spark] class MasterSource(val master: Master) extends Source {
     override def getValue: Int = master.workers.size
   })
 
+  // Gauge for alive worker numbers in cluster
+  metricRegistry.register(MetricRegistry.name("aliveWorkers"), new Gauge[Int]{
+    override def getValue: Int = master.workers.filter(_.state == WorkerState.ALIVE).size
+  })
+
   // Gauge for application numbers in cluster
   metricRegistry.register(MetricRegistry.name("apps"), new Gauge[Int] {
     override def getValue: Int = master.apps.size
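
To illustrate how the new aliveWorkers gauge differs from the existing workers gauge, here is a minimal standalone sketch. It does not use Spark or the metrics library: `WorkerState`, `WorkerInfo`, `Gauge`, and `AliveWorkersSketch` below are simplified, hypothetical stand-ins defined only for this example, and the worker list is made up. The sketch uses `count` where the patch uses `filter(...).size`; the two give the same number for this input.

```scala
// Hypothetical, simplified stand-ins for illustration only (not Spark's real classes).
object WorkerState extends Enumeration {
  val ALIVE, DEAD = Value
}

final case class WorkerInfo(id: String, state: WorkerState.Value)

// Minimal gauge shape: a value that is recomputed on every read.
trait Gauge[T] { def getValue: T }

object AliveWorkersSketch {
  // Builds the two gauges over a worker list that is re-read on each getValue call.
  def gauges(workers: () => Iterable[WorkerInfo]): Map[String, Gauge[Int]] = Map(
    // Total number of workers the master still tracks, DEAD ones included.
    "workers" -> new Gauge[Int] {
      def getValue: Int = workers().size
    },
    // Only the workers whose state is ALIVE.
    "aliveWorkers" -> new Gauge[Int] {
      def getValue: Int = workers().count(_.state == WorkerState.ALIVE)
    }
  )

  def main(args: Array[String]): Unit = {
    // Made-up worker list: one DEAD worker is still present in the list.
    val workers = Seq(
      WorkerInfo("worker-1", WorkerState.ALIVE),
      WorkerInfo("worker-2", WorkerState.DEAD),
      WorkerInfo("worker-3", WorkerState.ALIVE)
    )
    val g = gauges(() => workers)
    // prints "workers=3 aliveWorkers=2"
    println(s"workers=${g("workers").getValue} aliveWorkers=${g("aliveWorkers").getValue}")
  }
}
```

With a DEAD worker still in the list, the two gauges diverge, which is the gap the commit message describes.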