author: jerryshao <sshao@hortonworks.com> 2017-02-24 09:28:59 -0800
committer: Marcelo Vanzin <vanzin@cloudera.com> 2017-02-24 09:28:59 -0800
commit: b0a8c16fecd4337f77bfbe4b45884254d7af52bd (patch)
tree: f9d5e4c4b73d6692c3aa49bca048518c2f52ee8f /core/src/main
parent: 4a5e38f5747148022988631cae0248ae1affadd3 (diff)
[SPARK-19707][CORE] Improve the invalid path check for sc.addJar
## What changes were proposed in this pull request?

Currently in Spark there are two issues when we add jars with an invalid path:

* If the jar path is an empty string (e.g. `--jars ",dummy.jar"`), Spark resolves it to the current directory path and adds it to the classpath / file server, which is unwanted. This happens when we submit a Spark application programmatically. Spark should defensively filter out such empty paths.
* If the jar path is invalid (the file doesn't exist), `addJar` doesn't check it and still adds it to the file server, so the exception is delayed until the job runs. This local path could be checked beforehand; there is no need to wait until the task runs. We have a similar check in `addFile`, but `addJar` lacks the same mechanism.

## How was this patch tested?

Added a unit test and performed local manual verification.

Author: jerryshao <sshao@hortonworks.com>

Closes #17038 from jerryshao/SPARK-19707.
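The empty-path filtering described in the first bullet can be sketched standalone. This is a simplified illustration, not the committed code: the real `Utils.resolveURIs` also maps each surviving entry through `Utils.resolveURI`, which is elided here so the example stays self-contained.

```scala
// Sketch of filtering empty entries out of a comma-separated path list,
// as produced by e.g. `--jars ",dummy.jar"` (simplified; URI resolution elided).
object ResolvePathsSketch {
  def resolvePaths(paths: String): String = {
    if (paths == null || paths.trim.isEmpty) {
      ""
    } else {
      // split(",") yields a leading "" for ",dummy.jar"; the filter drops it
      paths.split(",").filter(_.trim.nonEmpty).mkString(",")
    }
  }
}
```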
Diffstat (limited to 'core/src/main')
-rw-r--r--  core/src/main/scala/org/apache/spark/SparkContext.scala | 12
-rw-r--r--  core/src/main/scala/org/apache/spark/util/Utils.scala   |  2
2 files changed, 11 insertions, 3 deletions
diff --git a/core/src/main/scala/org/apache/spark/SparkContext.scala b/core/src/main/scala/org/apache/spark/SparkContext.scala
index 17194b9f06..0e36a30c93 100644
--- a/core/src/main/scala/org/apache/spark/SparkContext.scala
+++ b/core/src/main/scala/org/apache/spark/SparkContext.scala
@@ -1815,10 +1815,18 @@ class SparkContext(config: SparkConf) extends Logging {
// A JAR file which exists only on the driver node
case null | "file" =>
try {
+ val file = new File(uri.getPath)
+ if (!file.exists()) {
+ throw new FileNotFoundException(s"Jar ${file.getAbsolutePath} not found")
+ }
+ if (file.isDirectory) {
+ throw new IllegalArgumentException(
+ s"Directory ${file.getAbsoluteFile} is not allowed for addJar")
+ }
env.rpcEnv.fileServer.addJar(new File(uri.getPath))
} catch {
- case exc: FileNotFoundException =>
- logError(s"Jar not found at $path")
+ case NonFatal(e) =>
+ logError(s"Failed to add $path to Spark environment", e)
null
}
// A JAR file which exists locally on every worker node
diff --git a/core/src/main/scala/org/apache/spark/util/Utils.scala b/core/src/main/scala/org/apache/spark/util/Utils.scala
index 55382899a3..480240a93d 100644
--- a/core/src/main/scala/org/apache/spark/util/Utils.scala
+++ b/core/src/main/scala/org/apache/spark/util/Utils.scala
@@ -1989,7 +1989,7 @@ private[spark] object Utils extends Logging {
if (paths == null || paths.trim.isEmpty) {
""
} else {
- paths.split(",").map { p => Utils.resolveURI(p) }.mkString(",")
+ paths.split(",").filter(_.trim.nonEmpty).map { p => Utils.resolveURI(p) }.mkString(",")
}
}
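The driver-side validation added to `SparkContext.addJar` above can also be sketched in isolation. This is a simplified stand-in, assuming only local paths: the real code passes the validated `File` to `env.rpcEnv.fileServer.addJar`, which is replaced here by returning the absolute path.

```scala
import java.io.{File, FileNotFoundException}

// Sketch of the eager local-jar checks (existence and not-a-directory)
// performed before handing the file to the file server.
object AddJarSketch {
  def validateLocalJar(path: String): String = {
    val file = new File(path)
    if (!file.exists()) {
      throw new FileNotFoundException(s"Jar ${file.getAbsolutePath} not found")
    }
    if (file.isDirectory) {
      throw new IllegalArgumentException(
        s"Directory ${file.getAbsolutePath} is not allowed for addJar")
    }
    file.getAbsolutePath
  }
}
```

Failing fast here surfaces a bad `--jars` entry at submission time instead of as a delayed task failure.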