author     hyukjinkwon <gurwls223@gmail.com>  2017-01-10 13:22:35 +0000
committer  Sean Owen <sowen@cloudera.com>     2017-01-10 13:22:35 +0000
commit     2cfd41ac02193aaf121afcddcb6383f4d075ea1e
tree       d954e87718305ca2ac313ddbb397110228e9e83e /core
parent     4e27578faa67c7a71a9b938aafbaf79bdbf36831
[SPARK-19117][TESTS] Skip the tests using script transformation on Windows
## What changes were proposed in this pull request?
This PR proposes to skip the tests that use script transformation, which fail on Windows because the path to bash is hardcoded as `/bin/bash`.
```
SQLQuerySuite:
- script *** FAILED *** (553 milliseconds)
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 56.0 failed 1 times, most recent failure: Lost task 0.0 in stage 56.0 (TID 54, localhost, executor driver): java.io.IOException: Cannot run program "/bin/bash": CreateProcess error=2, The system cannot find the file specified
- Star Expansion - script transform *** FAILED *** (2 seconds, 375 milliseconds)
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 389.0 failed 1 times, most recent failure: Lost task 0.0 in stage 389.0 (TID 725, localhost, executor driver): java.io.IOException: Cannot run program "/bin/bash": CreateProcess error=2, The system cannot find the file specified
- test script transform for stdout *** FAILED *** (2 seconds, 813 milliseconds)
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 391.0 failed 1 times, most recent failure: Lost task 0.0 in stage 391.0 (TID 726, localhost, executor driver): java.io.IOException: Cannot run program "/bin/bash": CreateProcess error=2, The system cannot find the file specified
- test script transform for stderr *** FAILED *** (2 seconds, 407 milliseconds)
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 393.0 failed 1 times, most recent failure: Lost task 0.0 in stage 393.0 (TID 727, localhost, executor driver): java.io.IOException: Cannot run program "/bin/bash": CreateProcess error=2, The system cannot find the file specified
- test script transform data type *** FAILED *** (171 milliseconds)
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 395.0 failed 1 times, most recent failure: Lost task 0.0 in stage 395.0 (TID 728, localhost, executor driver): java.io.IOException: Cannot run program "/bin/bash": CreateProcess error=2, The system cannot find the file specified
```
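The fix guards these tests with a probe that checks whether the external command can actually be launched. A self-contained sketch of that probe, mirroring the `testCommandAvailable` helper added to `TestUtils` in the diff below (the wrapping object name `CommandCheck` is illustrative):

```scala
import scala.sys.process.{Process, ProcessLogger}
import scala.util.Try

object CommandCheck {
  // Launch the command, discard its output, and inspect the exit status.
  // If the binary cannot be launched at all (e.g. /bin/bash on Windows),
  // starting the process throws an IOException, which Try captures as a
  // Failure, so the probe returns false instead of crashing the suite.
  def testCommandAvailable(command: String): Boolean = {
    val attempt = Try(Process(command).run(ProcessLogger(_ => ())).exitValue())
    attempt.isSuccess && attempt.get == 0
  }

  def main(args: Array[String]): Unit = {
    println(testCommandAvailable("ls"))
    println(testCommandAvailable("some_nonexistent_command"))
  }
}
```

A suite can then wrap its body in `assume(testCommandAvailable("/bin/bash"))` so the test is skipped, not failed, on hosts without bash.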
```
HiveQuerySuite:
- transform *** FAILED *** (359 milliseconds)
Failed to execute query using catalyst:
Error: Job aborted due to stage failure: Task 0 in stage 1347.0 failed 1 times, most recent failure: Lost task 0.0 in stage 1347.0 (TID 2395, localhost, executor driver): java.io.IOException: Cannot run program "/bin/bash": CreateProcess error=2, The system cannot find the file specified
- schema-less transform *** FAILED *** (344 milliseconds)
Failed to execute query using catalyst:
Error: Job aborted due to stage failure: Task 0 in stage 1348.0 failed 1 times, most recent failure: Lost task 0.0 in stage 1348.0 (TID 2396, localhost, executor driver): java.io.IOException: Cannot run program "/bin/bash": CreateProcess error=2, The system cannot find the file specified
- transform with custom field delimiter *** FAILED *** (296 milliseconds)
Failed to execute query using catalyst:
Error: Job aborted due to stage failure: Task 0 in stage 1349.0 failed 1 times, most recent failure: Lost task 0.0 in stage 1349.0 (TID 2397, localhost, executor driver): java.io.IOException: Cannot run program "/bin/bash": CreateProcess error=2, The system cannot find the file specified
- transform with custom field delimiter2 *** FAILED *** (297 milliseconds)
Failed to execute query using catalyst:
Error: Job aborted due to stage failure: Task 0 in stage 1350.0 failed 1 times, most recent failure: Lost task 0.0 in stage 1350.0 (TID 2398, localhost, executor driver): java.io.IOException: Cannot run program "/bin/bash": CreateProcess error=2, The system cannot find the file specified
- transform with custom field delimiter3 *** FAILED *** (312 milliseconds)
Failed to execute query using catalyst:
Error: Job aborted due to stage failure: Task 0 in stage 1351.0 failed 1 times, most recent failure: Lost task 0.0 in stage 1351.0 (TID 2399, localhost, executor driver): java.io.IOException: Cannot run program "/bin/bash": CreateProcess error=2, The system cannot find the file specified
- transform with SerDe2 *** FAILED *** (437 milliseconds)
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 1355.0 failed 1 times, most recent failure: Lost task 0.0 in stage 1355.0 (TID 2403, localhost, executor driver): java.io.IOException: Cannot run program "/bin/bash": CreateProcess error=2, The system cannot find the file specified
```
```
LogicalPlanToSQLSuite:
- script transformation - schemaless *** FAILED *** (78 milliseconds)
...
Cause: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 1968.0 failed 1 times, most recent failure: Lost task 0.0 in stage 1968.0 (TID 3932, localhost, executor driver): java.io.IOException: Cannot run program "/bin/bash": CreateProcess error=2, The system cannot find the file specified
- script transformation - alias list *** FAILED *** (94 milliseconds)
...
Cause: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 1969.0 failed 1 times, most recent failure: Lost task 0.0 in stage 1969.0 (TID 3933, localhost, executor driver): java.io.IOException: Cannot run program "/bin/bash": CreateProcess error=2, The system cannot find the file specified
- script transformation - alias list with type *** FAILED *** (93 milliseconds)
...
Cause: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 1970.0 failed 1 times, most recent failure: Lost task 0.0 in stage 1970.0 (TID 3934, localhost, executor driver): java.io.IOException: Cannot run program "/bin/bash": CreateProcess error=2, The system cannot find the file specified
- script transformation - row format delimited clause with only one format property *** FAILED *** (78 milliseconds)
...
Cause: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 1971.0 failed 1 times, most recent failure: Lost task 0.0 in stage 1971.0 (TID 3935, localhost, executor driver): java.io.IOException: Cannot run program "/bin/bash": CreateProcess error=2, The system cannot find the file specified
- script transformation - row format delimited clause with multiple format properties *** FAILED *** (94 milliseconds)
...
Cause: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 1972.0 failed 1 times, most recent failure: Lost task 0.0 in stage 1972.0 (TID 3936, localhost, executor driver): java.io.IOException: Cannot run program "/bin/bash": CreateProcess error=2, The system cannot find the file specified
- script transformation - row format serde clauses with SERDEPROPERTIES *** FAILED *** (78 milliseconds)
...
Cause: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 1973.0 failed 1 times, most recent failure: Lost task 0.0 in stage 1973.0 (TID 3937, localhost, executor driver): java.io.IOException: Cannot run program "/bin/bash": CreateProcess error=2, The system cannot find the file specified
- script transformation - row format serde clauses without SERDEPROPERTIES *** FAILED *** (78 milliseconds)
...
Cause: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 1974.0 failed 1 times, most recent failure: Lost task 0.0 in stage 1974.0 (TID 3938, localhost, executor driver): java.io.IOException: Cannot run program "/bin/bash": CreateProcess error=2, The system cannot find the file specified
```
```
ScriptTransformationSuite:
- cat without SerDe *** FAILED *** (156 milliseconds)
...
Caused by: java.io.IOException: Cannot run program "/bin/bash": CreateProcess error=2, The system cannot find the file specified
- cat with LazySimpleSerDe *** FAILED *** (63 milliseconds)
...
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 2383.0 failed 1 times, most recent failure: Lost task 0.0 in stage 2383.0 (TID 4819, localhost, executor driver): java.io.IOException: Cannot run program "/bin/bash": CreateProcess error=2, The system cannot find the file specified
- script transformation should not swallow errors from upstream operators (no serde) *** FAILED *** (78 milliseconds)
...
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 2384.0 failed 1 times, most recent failure: Lost task 0.0 in stage 2384.0 (TID 4820, localhost, executor driver): java.io.IOException: Cannot run program "/bin/bash": CreateProcess error=2, The system cannot find the file specified
- script transformation should not swallow errors from upstream operators (with serde) *** FAILED *** (47 milliseconds)
...
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 2385.0 failed 1 times, most recent failure: Lost task 0.0 in stage 2385.0 (TID 4821, localhost, executor driver): java.io.IOException: Cannot run program "/bin/bash": CreateProcess error=2, The system cannot find the file specified
- SPARK-14400 script transformation should fail for bad script command *** FAILED *** (47 milliseconds)
"Job aborted due to stage failure: Task 0 in stage 2386.0 failed 1 times, most recent failure: Lost task 0.0 in stage 2386.0 (TID 4822, localhost, executor driver): java.io.IOException: Cannot run program "/bin/bash": CreateProcess error=2, The system cannot find the file specified
```
## How was this patch tested?
AppVeyor as below:
```
SQLQuerySuite:
- script !!! CANCELED !!! (63 milliseconds)
- Star Expansion - script transform !!! CANCELED !!! (0 milliseconds)
- test script transform for stdout !!! CANCELED !!! (0 milliseconds)
- test script transform for stderr !!! CANCELED !!! (0 milliseconds)
- test script transform data type !!! CANCELED !!! (0 milliseconds)
```
```
HiveQuerySuite:
- transform !!! CANCELED !!! (31 milliseconds)
- schema-less transform !!! CANCELED !!! (0 milliseconds)
- transform with custom field delimiter !!! CANCELED !!! (0 milliseconds)
- transform with custom field delimiter2 !!! CANCELED !!! (0 milliseconds)
- transform with custom field delimiter3 !!! CANCELED !!! (0 milliseconds)
- transform with SerDe2 !!! CANCELED !!! (0 milliseconds)
```
```
LogicalPlanToSQLSuite:
- script transformation - schemaless !!! CANCELED !!! (78 milliseconds)
- script transformation - alias list !!! CANCELED !!! (0 milliseconds)
- script transformation - alias list with type !!! CANCELED !!! (0 milliseconds)
- script transformation - row format delimited clause with only one format property !!! CANCELED !!! (15 milliseconds)
- script transformation - row format delimited clause with multiple format properties !!! CANCELED !!! (0 milliseconds)
- script transformation - row format serde clauses with SERDEPROPERTIES !!! CANCELED !!! (0 milliseconds)
- script transformation - row format serde clauses without SERDEPROPERTIES !!! CANCELED !!! (0 milliseconds)
```
```
ScriptTransformationSuite:
- cat without SerDe !!! CANCELED !!! (62 milliseconds)
- cat with LazySimpleSerDe !!! CANCELED !!! (0 milliseconds)
- script transformation should not swallow errors from upstream operators (no serde) !!! CANCELED !!! (0 milliseconds)
- script transformation should not swallow errors from upstream operators (with serde) !!! CANCELED !!! (0 milliseconds)
- SPARK-14400 script transformation should fail for bad script command !!! CANCELED !!! (0 milliseconds)
```
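The `!!! CANCELED !!!` results come from ScalaTest's `assume`, which cancels a test rather than failing it when an environment precondition is unmet. A rough stand-alone imitation of that semantics, using hypothetical names (`assumeThat`, `runTest`) rather than ScalaTest's actual implementation:

```scala
object CancelSketch {
  // A failed assumption cancels the test instead of failing it, so unmet
  // environment preconditions (like a missing /bin/bash) don't break the build.
  final case class TestCanceledException(msg: String) extends Exception(msg)

  def assumeThat(cond: Boolean, msg: String): Unit =
    if (!cond) throw TestCanceledException(msg)

  // Run a test body and render a ScalaTest-like result line.
  def runTest(name: String)(body: => Unit): String =
    try { body; s"- $name PASSED" }
    catch {
      case TestCanceledException(msg) => s"- $name !!! CANCELED !!! ($msg)"
      case _: Throwable => s"- $name *** FAILED ***"
    }

  def main(args: Array[String]): Unit = {
    println(runTest("script") {
      assumeThat(cond = false, msg = "bash is unavailable on this platform")
    })
  }
}
```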
Jenkins tests
Author: hyukjinkwon <gurwls223@gmail.com>
Closes #16501 from HyukjinKwon/windows-bash.
Diffstat (limited to 'core')
-rw-r--r--  core/src/main/scala/org/apache/spark/TestUtils.scala          | 11
-rw-r--r--  core/src/test/scala/org/apache/spark/rdd/PipedRDDSuite.scala  | 25
2 files changed, 19 insertions, 17 deletions
diff --git a/core/src/main/scala/org/apache/spark/TestUtils.scala b/core/src/main/scala/org/apache/spark/TestUtils.scala
index b5b201409a..fd0477541e 100644
--- a/core/src/main/scala/org/apache/spark/TestUtils.scala
+++ b/core/src/main/scala/org/apache/spark/TestUtils.scala
@@ -20,7 +20,6 @@ package org.apache.spark
 import java.io.{ByteArrayInputStream, File, FileInputStream, FileOutputStream}
 import java.net.{URI, URL}
 import java.nio.charset.StandardCharsets
-import java.nio.file.Paths
 import java.util.Arrays
 import java.util.concurrent.{CountDownLatch, TimeUnit}
 import java.util.jar.{JarEntry, JarOutputStream}
@@ -28,6 +27,8 @@ import java.util.jar.{JarEntry, JarOutputStream}
 import scala.collection.JavaConverters._
 import scala.collection.mutable
 import scala.collection.mutable.ArrayBuffer
+import scala.sys.process.{Process, ProcessLogger}
+import scala.util.Try
 
 import com.google.common.io.{ByteStreams, Files}
 import javax.tools.{JavaFileObject, SimpleJavaFileObject, ToolProvider}
@@ -185,6 +186,14 @@ private[spark] object TestUtils {
     assert(spillListener.numSpilledStages == 0, s"expected $identifier to not spill, but did")
   }
 
+  /**
+   * Test if a command is available.
+   */
+  def testCommandAvailable(command: String): Boolean = {
+    val attempt = Try(Process(command).run(ProcessLogger(_ => ())).exitValue())
+    attempt.isSuccess && attempt.get == 0
+  }
+
 }
diff --git a/core/src/test/scala/org/apache/spark/rdd/PipedRDDSuite.scala b/core/src/test/scala/org/apache/spark/rdd/PipedRDDSuite.scala
index 287ae6ff6e..1a0eb250e7 100644
--- a/core/src/test/scala/org/apache/spark/rdd/PipedRDDSuite.scala
+++ b/core/src/test/scala/org/apache/spark/rdd/PipedRDDSuite.scala
@@ -21,8 +21,6 @@ import java.io.File
 
 import scala.collection.Map
 import scala.io.Codec
-import scala.sys.process._
-import scala.util.Try
 
 import org.apache.hadoop.fs.Path
 import org.apache.hadoop.io.{LongWritable, Text}
@@ -39,7 +37,7 @@ class PipedRDDSuite extends SparkFunSuite with SharedSparkContext {
   }
 
   test("basic pipe") {
-    assume(testCommandAvailable("cat"))
+    assume(TestUtils.testCommandAvailable("cat"))
     val nums = sc.makeRDD(Array(1, 2, 3, 4), 2)
 
     val piped = nums.pipe(Seq("cat"))
@@ -53,7 +51,7 @@ class PipedRDDSuite extends SparkFunSuite with SharedSparkContext {
   }
 
   test("basic pipe with tokenization") {
-    assume(testCommandAvailable("wc"))
+    assume(TestUtils.testCommandAvailable("wc"))
     val nums = sc.makeRDD(Array(1, 2, 3, 4), 2)
 
     // verify that both RDD.pipe(command: String) and RDD.pipe(command: String, env) work good
@@ -66,7 +64,7 @@ class PipedRDDSuite extends SparkFunSuite with SharedSparkContext {
   }
 
   test("failure in iterating over pipe input") {
-    assume(testCommandAvailable("cat"))
+    assume(TestUtils.testCommandAvailable("cat"))
     val nums =
       sc.makeRDD(Array(1, 2, 3, 4), 2)
         .mapPartitionsWithIndex((index, iterator) => {
@@ -86,7 +84,7 @@ class PipedRDDSuite extends SparkFunSuite with SharedSparkContext {
   }
 
   test("advanced pipe") {
-    assume(testCommandAvailable("cat"))
+    assume(TestUtils.testCommandAvailable("cat"))
     val nums = sc.makeRDD(Array(1, 2, 3, 4), 2)
 
     val bl = sc.broadcast(List("0"))
@@ -147,7 +145,7 @@ class PipedRDDSuite extends SparkFunSuite with SharedSparkContext {
   }
 
   test("pipe with env variable") {
-    assume(testCommandAvailable(envCommand))
+    assume(TestUtils.testCommandAvailable(envCommand))
     val nums = sc.makeRDD(Array(1, 2, 3, 4), 2)
     val piped = nums.pipe(s"$envCommand MY_TEST_ENV", Map("MY_TEST_ENV" -> "LALALA"))
     val c = piped.collect()
@@ -159,7 +157,7 @@ class PipedRDDSuite extends SparkFunSuite with SharedSparkContext {
   }
 
   test("pipe with process which cannot be launched due to bad command") {
-    assume(!testCommandAvailable("some_nonexistent_command"))
+    assume(!TestUtils.testCommandAvailable("some_nonexistent_command"))
     val nums = sc.makeRDD(Array(1, 2, 3, 4), 2)
     val command = Seq("some_nonexistent_command")
     val piped = nums.pipe(command)
@@ -170,7 +168,7 @@ class PipedRDDSuite extends SparkFunSuite with SharedSparkContext {
   }
 
   test("pipe with process which is launched but fails with non-zero exit status") {
-    assume(testCommandAvailable("cat"))
+    assume(TestUtils.testCommandAvailable("cat"))
     val nums = sc.makeRDD(Array(1, 2, 3, 4), 2)
     val command = Seq("cat", "nonexistent_file")
    val piped = nums.pipe(command)
@@ -181,7 +179,7 @@ class PipedRDDSuite extends SparkFunSuite with SharedSparkContext {
  }

   test("basic pipe with separate working directory") {
-    assume(testCommandAvailable("cat"))
+    assume(TestUtils.testCommandAvailable("cat"))
     val nums = sc.makeRDD(Array(1, 2, 3, 4), 2)
     val piped = nums.pipe(Seq("cat"), separateWorkingDir = true)
     val c = piped.collect()
@@ -208,13 +206,8 @@ class PipedRDDSuite extends SparkFunSuite with SharedSparkContext {
     testExportInputFile("mapreduce_map_input_file")
   }
 
-  def testCommandAvailable(command: String): Boolean = {
-    val attempt = Try(Process(command).run(ProcessLogger(_ => ())).exitValue())
-    attempt.isSuccess && attempt.get == 0
-  }
-
   def testExportInputFile(varName: String) {
-    assume(testCommandAvailable(envCommand))
+    assume(TestUtils.testCommandAvailable(envCommand))
     val nums = new HadoopRDD(sc, new JobConf(), classOf[TextInputFormat],
       classOf[LongWritable], classOf[Text], 2) {
       override def getPartitions: Array[Partition] = Array(generateFakeHadoopPartition())
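The suites above fail on Windows in the first place because script transformation and `RDD.pipe` shell out to an external process. A minimal stand-alone analogue of that mechanism, using `scala.sys.process` (the object and method names are illustrative, and `cat` is assumed to be on the PATH):

```scala
import java.io.ByteArrayInputStream

import scala.sys.process._

object PipeSketch {
  // Minimal analogue of RDD.pipe: stream the elements of a partition to an
  // external command's stdin and capture its stdout as lines. This is why
  // the tests require the external binary (/bin/bash, cat, wc, ...) to
  // actually exist on the host running them.
  def pipeThrough(command: String, elems: Seq[String]): Seq[String] = {
    val stdin = new ByteArrayInputStream((elems.mkString("\n") + "\n").getBytes("UTF-8"))
    // `!!` throws an IOException if the binary cannot be launched, which is
    // exactly the failure mode seen in the logs above on Windows.
    val output = (Process(command) #< stdin).!!
    output.split("\n").toSeq
  }

  def main(args: Array[String]): Unit = {
    println(pipeThrough("cat", Seq("1", "2", "3", "4")))
  }
}
```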