author Marcelo Vanzin <vanzin@cloudera.com> 2015-05-01 09:50:55 -0500
committer Imran Rashid <irashid@cloudera.com> 2015-05-01 09:50:55 -0500
commit 3052f4916e7f2c7fbc4837f00f4463b7d0b34718
tree 60796c615223dd96109cfcee74e6978528539425 /yarn
parent 7fe0f3f2b46c61a5cc4af9166781624409fda8a4
[SPARK-4705] Handle multiple app attempts event logs, history server.
This change modifies the event logging listener to write the logs for different application attempts to different files. The attempt ID is set by the scheduler backend, so as long as the backend returns that ID to SparkContext, things should work. Currently, the YARN backend does that.

The history server was also modified to model multiple attempts per application. Each attempt has its own UI and a separate row in the listing table, so that users can look at all the attempts separately. The UI "adapts" itself to avoid showing attempt-specific info when all the applications being shown have a single attempt.

Author: Marcelo Vanzin <vanzin@cloudera.com>
Author: twinkle sachdeva <twinkle@kite.ggn.in.guavus.com>
Author: twinkle.sachdeva <twinkle.sachdeva@guavus.com>
Author: twinkle sachdeva <twinkle.sachdeva@guavus.com>

Closes #5432 from vanzin/SPARK-4705 and squashes the following commits:

7e289fa [Marcelo Vanzin] Review feedback.
f66dcc5 [Marcelo Vanzin] Merge branch 'master' into SPARK-4705
bc885b7 [Marcelo Vanzin] Review feedback.
76a3651 [Marcelo Vanzin] Fix log cleaner, add test.
7c381ec [Marcelo Vanzin] Merge branch 'master' into SPARK-4705
1aa309d [Marcelo Vanzin] Improve sorting of app attempts.
2ad77e7 [Marcelo Vanzin] Missed a reference to the old property name.
9d59d92 [Marcelo Vanzin] Scalastyle...
d5a9c37 [Marcelo Vanzin] Update JsonProtocol test, make property name consistent.
ba34b69 [Marcelo Vanzin] Use Option[String] for attempt id.
f1cb9b3 [Marcelo Vanzin] Merge branch 'master' into SPARK-4705
c14ec19 [Marcelo Vanzin] Merge branch 'master' into SPARK-4705
9092d39 [Marcelo Vanzin] Merge branch 'master' into SPARK-4705
86de638 [Marcelo Vanzin] Merge branch 'master' into SPARK-4705
07446c6 [Marcelo Vanzin] Disable striping for app id / name when multiple attempts exist.
9092af5 [Marcelo Vanzin] Fix HistoryServer test.
3a14503 [Marcelo Vanzin] Argh scalastyle.
657ec18 [Marcelo Vanzin] Fix yarn history URL, app links.
c3e0a82 [Marcelo Vanzin] Move app name to app info, more UI fixes.
ce5ee5d [Marcelo Vanzin] Misc UI, test, style fixes.
cbe8bba [Marcelo Vanzin] Attempt ID in listener event should be an option.
88b1de8 [Marcelo Vanzin] Add a test for apps with multiple attempts.
3245aa2 [Marcelo Vanzin] Make app attempts part of the history server model.
5fd5c6f [Marcelo Vanzin] Fix my broken rebase.
318525a [twinkle.sachdeva] SPARK-4705: 1) moved from directory structure to single file, as per the master branch. 2) Added the attempt id inside the SparkListenerApplicationStart, to make the info available independent of directory structure. 3) Changes in History Server to render the UI as per the snaphot II
6b2e521 [twinkle sachdeva] SPARK-4705 Incorporating the review comments regarding formatting, will do the rest of the changes after this
4c1fc26 [twinkle sachdeva] SPARK-4705 Incorporating the review comments regarding formatting, will do the rest of the changes after this
0eb7722 [twinkle sachdeva] SPARK-4705: Doing cherry-pick of fix into master
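
As a rough illustration of the per-attempt log files described in the commit message, the sketch below derives a distinct log name from an application ID and an optional attempt ID. The object `EventLogNames`, the function `logName`, and the underscore-separated naming scheme are assumptions made for this example; they are not taken from the patch and may differ from what the event logging listener actually does.

```scala
// Hypothetical helper: names and naming scheme are illustrative only.
object EventLogNames {
  // Replace characters that are awkward in file names.
  private def sanitize(s: String): String =
    s.replaceAll("[ :/]", "-")

  // One log name per (application, attempt) pair. Applications that
  // report no attempt ID keep the plain application-based name.
  def logName(appId: String, attemptId: Option[String]): String =
    attemptId match {
      case Some(id) => s"${sanitize(appId)}_${sanitize(id)}"
      case None     => sanitize(appId)
    }
}
```

Modelling the attempt ID as `Option[String]` mirrors the "Use Option[String] for attempt id" commit above: backends that have no notion of attempts can return `None` and keep a single log per application.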
Diffstat (limited to 'yarn')
-rw-r--r-- yarn/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala | 7
-rw-r--r-- yarn/src/main/scala/org/apache/spark/scheduler/cluster/YarnClusterSchedulerBackend.scala | 12
2 files changed, 15 insertions, 4 deletions
diff --git a/yarn/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala b/yarn/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala
index 70cb57ffd8..27f804782f 100644
--- a/yarn/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala
+++ b/yarn/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala
@@ -89,6 +89,10 @@ private[spark] class ApplicationMaster(
// Propagate the application ID so that YarnClusterSchedulerBackend can pick it up.
System.setProperty("spark.yarn.app.id", appAttemptId.getApplicationId().toString())
+
+ // Propagate the attempt ID so that, when event logging is enabled,
+ // each attempt's logs are written to a separate file.
+ System.setProperty("spark.yarn.app.attemptId", appAttemptId.getAttemptId().toString())
}
logInfo("ApplicationAttemptId: " + appAttemptId)
@@ -208,10 +212,11 @@ private[spark] class ApplicationMaster(
val sc = sparkContextRef.get()
val appId = client.getAttemptId().getApplicationId().toString()
+ val attemptId = client.getAttemptId().getAttemptId().toString()
val historyAddress =
sparkConf.getOption("spark.yarn.historyServer.address")
.map { text => SparkHadoopUtil.get.substituteHadoopVariables(text, yarnConf) }
- .map { address => s"${address}${HistoryServer.UI_PATH_PREFIX}/${appId}" }
+ .map { address => s"${address}${HistoryServer.UI_PATH_PREFIX}/${appId}/${attemptId}" }
.getOrElse("")
allocator = client.register(yarnConf,
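
The hunk above appends the attempt ID to the history server URL registered with YARN. A minimal standalone sketch of that string construction follows. The object `HistoryUrls` and the function `historyAddress` are hypothetical names, `UiPathPrefix` is an assumed stand-in for `HistoryServer.UI_PATH_PREFIX`, and the Hadoop variable substitution step is omitted.

```scala
// Hypothetical sketch of the URL construction; not the patch's code.
object HistoryUrls {
  // Assumed value standing in for HistoryServer.UI_PATH_PREFIX.
  val UiPathPrefix = "/history"

  // Build <address><prefix>/<appId>/<attemptId>, or "" when no
  // history server address is configured.
  def historyAddress(
      configured: Option[String],
      appId: String,
      attemptId: String): String =
    configured
      .map { address => s"${address}${UiPathPrefix}/${appId}/${attemptId}" }
      .getOrElse("")
}
```

For example, `HistoryUrls.historyAddress(Some("http://hs:18080"), "application_1_0001", "2")` yields `http://hs:18080/history/application_1_0001/2`, so each attempt gets its own link.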
diff --git a/yarn/src/main/scala/org/apache/spark/scheduler/cluster/YarnClusterSchedulerBackend.scala b/yarn/src/main/scala/org/apache/spark/scheduler/cluster/YarnClusterSchedulerBackend.scala
index b1de81e6a8..aeb218a575 100644
--- a/yarn/src/main/scala/org/apache/spark/scheduler/cluster/YarnClusterSchedulerBackend.scala
+++ b/yarn/src/main/scala/org/apache/spark/scheduler/cluster/YarnClusterSchedulerBackend.scala
@@ -39,12 +39,18 @@ private[spark] class YarnClusterSchedulerBackend(
}
override def applicationId(): String =
- // In YARN Cluster mode, spark.yarn.app.id is expect to be set
- // before user application is launched.
- // So, if spark.yarn.app.id is not set, it is something wrong.
+ // In YARN Cluster mode, the application ID is expected to be set, so log an error if it's
+ // not found.
sc.getConf.getOption("spark.yarn.app.id").getOrElse {
logError("Application ID is not set.")
super.applicationId
}
+ override def applicationAttemptId(): Option[String] =
+ // In YARN Cluster mode, the attempt ID is expected to be set, so log an error if it's
+ // not found.
+ sc.getConf.getOption("spark.yarn.app.attemptId").orElse {
+ logError("Application attempt ID is not set.")
+ super.applicationAttemptId
+ }
}
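
The two overrides above differ in one detail: `applicationId` returns a `String` and uses `getOrElse`, while `applicationAttemptId` returns an `Option[String]` and uses `orElse`, so the fallback must itself be an `Option`. The sketch below reproduces that pattern outside Spark. `BaseBackend`, `ClusterBackend`, the `Map`-based configuration, and the `errors` list (used here in place of `logError`) are all assumptions made for illustration.

```scala
// Hypothetical stand-ins for the scheduler backend classes.
object AttemptIdLookup {
  class BaseBackend {
    // Default: the backend has no notion of attempts.
    def applicationAttemptId(): Option[String] = None
  }

  class ClusterBackend(conf: Map[String, String]) extends BaseBackend {
    // Collected instead of logged, so the fallback path is observable.
    var errors: List[String] = Nil

    override def applicationAttemptId(): Option[String] =
      // orElse evaluates its argument only when the key is missing.
      conf.get("spark.yarn.app.attemptId").orElse {
        errors = "Application attempt ID is not set." :: errors
        super.applicationAttemptId()
      }
  }
}
```

When the property is present the block passed to `orElse` never runs, so no error is recorded; when it is absent the error is recorded once and the base class default (`None` in this sketch) is returned.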