author    Matei Zaharia <matei@eecs.berkeley.edu>  2013-11-25 15:25:29 -0800
committer Matei Zaharia <matei@eecs.berkeley.edu>  2013-11-25 15:25:29 -0800
commit    eb4296c8f7561aaf8782479dd5cd7c9320b7fa6b (patch)
tree      c132c439562c408d7b69aadf17989209901c8c1b /core
parent    62889c419cfddb1cea2d260e9b530349d9f8eeda (diff)
parent    ab3cefde5349d0de85b23b49feef493ff0b2d1ed (diff)
Merge pull request #101 from colorant/yarn-client-scheduler
For SPARK-527, support spark-shell when running on YARN (synced to trunk and resubmitted here).

In the current YARN mode, the application runs inside the ApplicationMaster as a user program, so the whole SparkContext lives on the remote side. That approach cannot support applications that involve local interaction and must run where they are launched. This pull request therefore adds a YarnClientClusterScheduler and a matching backend. With this scheduler, the user application is launched locally, while the executors are launched by YARN on remote nodes through a thin ApplicationMaster that only starts the executors and monitors the driver actor's status; when the client application finishes, the AM can then finish the YARN application as well.

This enables spark-shell to run on YARN. It also lets other Spark applications run their SparkContext locally with the master URL "yarn-client", so that, for example, SparkPi can print its result on the local console instead of in the logs of the remote machine where the AM runs. The docs have been updated to show how to use this yarn-client mode.
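As a minimal sketch (not part of this patch), a driver program could use the new master URL like this; the object name, app name, and job logic are hypothetical, and the two-argument SparkContext(master, appName) constructor is assumed:

    import org.apache.spark.SparkContext

    // Hypothetical driver: the SparkContext below runs locally, while YARN
    // hosts only the executors and a thin ApplicationMaster.
    object YarnClientExample {
      def main(args: Array[String]) {
        val sc = new SparkContext("yarn-client", "YarnClientExample")
        val evens = sc.parallelize(1 to 1000).filter(_ % 2 == 0).count()
        // Because the driver is local, this prints to the local console
        // rather than to the remote ApplicationMaster's logs.
        println("Even count: " + evens)
        sc.stop()
      }
    }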
Diffstat (limited to 'core')
-rw-r--r--  core/src/main/scala/org/apache/spark/SparkContext.scala | 25
1 file changed, 25 insertions(+), 0 deletions(-)
diff --git a/core/src/main/scala/org/apache/spark/SparkContext.scala b/core/src/main/scala/org/apache/spark/SparkContext.scala
index 42b2985b50..3a80241daa 100644
--- a/core/src/main/scala/org/apache/spark/SparkContext.scala
+++ b/core/src/main/scala/org/apache/spark/SparkContext.scala
@@ -226,6 +226,31 @@ class SparkContext(
scheduler.initialize(backend)
scheduler
+ case "yarn-client" =>
+   val scheduler = try {
+     val clazz = Class.forName("org.apache.spark.scheduler.cluster.YarnClientClusterScheduler")
+     val cons = clazz.getConstructor(classOf[SparkContext])
+     cons.newInstance(this).asInstanceOf[ClusterScheduler]
+   } catch {
+     case th: Throwable =>
+       throw new SparkException("YARN mode not available?", th)
+   }
+
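+   // The matching client backend is loaded the same way, then paired with
+   // the scheduler.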
+   val backend = try {
+     val clazz = Class.forName("org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend")
+     val cons = clazz.getConstructor(classOf[ClusterScheduler], classOf[SparkContext])
+     cons.newInstance(scheduler, this).asInstanceOf[CoarseGrainedSchedulerBackend]
+   } catch {
+     case th: Throwable =>
+       throw new SparkException("YARN mode not available?", th)
+   }
+
+   scheduler.initialize(backend)
+   scheduler
+
case MESOS_REGEX(mesosUrl) =>
MesosNativeLibrary.load()
val scheduler = new ClusterScheduler(this)