author     hyukjinkwon <gurwls223@gmail.com>  2016-12-08 23:02:05 +0800
committer  Sean Owen <sowen@cloudera.com>     2016-12-08 23:02:05 +0800
commit     7f3c778fd0ab0298d71e756990051e7c073f9151 (patch)
tree       ef5042064036c43027566c7c9c3f8a5d42360242 /sql/core
parent     9bf8f3cd4f62f921c32fb50b8abf49576a80874f (diff)
[SPARK-18718][TESTS] Skip some test failures due to path length limitation and fix tests to pass on Windows
## What changes were proposed in this pull request?
Some tests fail on Windows due to incorrectly formatted paths and to the path length limitation, as shown below.
This PR proposes both to fix the failing tests below by correcting the paths they use:
- `InsertSuite`
```
Exception encountered when attempting to run a suite with class name: org.apache.spark.sql.sources.InsertSuite *** ABORTED *** (12 seconds, 547 milliseconds)
org.apache.spark.sql.AnalysisException: Path does not exist: file:/C:projectsspark arget mpspark-177945ef-9128-42b4-8c07-de31f78bbbd6;
at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$14.apply(DataSource.scala:382)
at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$14.apply(DataSource.scala:370)
at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
```
- `PathOptionSuite`
```
- path option also exist for write path *** FAILED *** (1 second, 93 milliseconds)
"C:[projectsspark arget mp]spark-5ab34a58-df8d-..." did not equal "C:[\projects\spark\target\tmp\]spark-5ab34a58-df8d-..." (PathOptionSuite.scala:93)
org.scalatest.exceptions.TestFailedException:
at org.scalatest.Assertions$class.newAssertionFailedException(Assertions.scala:500)
at org.scalatest.FunSuite.newAssertionFailedException(FunSuite.scala:1555)
...
```
- `UDFSuite`
```
- SPARK-8005 input_file_name *** FAILED *** (2 seconds, 234 milliseconds)
"file:///C:/projects/spark/target/tmp/spark-e4e5720a-2006-48f9-8b11-797bf59794bf/part-00001-26fb05e4-603d-471d-ae9d-b9549e0c7765.snappy.parquet" did not contain "C:\projects\spark\target\tmp\spark-e4e5720a-2006-48f9-8b11-797bf59794bf" (UDFSuite.scala:67)
org.scalatest.exceptions.TestFailedException:
at org.scalatest.Assertions$class.newAssertionFailedException(Assertions.scala:500)
at org.scalatest.FunSuite.newAssertionFailedException(FunSuite.scala:1555)
...
```
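The common cause of the three failures above is that a raw platform path is spliced into SQL text or compared against a `file:` URI: on Windows, `getCanonicalPath` returns a backslash-separated path, so backslash sequences get mangled (note the `arget mp` in the first log, where `\t` was swallowed) and `contains` checks against forward-slash URIs fail. Converting the `File` to a URI first yields a stable forward-slash form on every platform. A minimal sketch of the idea (the path below is illustrative):

```java
import java.io.File;

public class PathUriDemo {
    public static void main(String[] args) {
        // A raw platform path uses backslashes on Windows; spliced into SQL
        // text, sequences like "\t" can be interpreted as escapes.
        File dir = new File("/tmp/spark-demo");

        // toURI() always produces a well-formed file: URI with forward
        // slashes, regardless of the platform separator.
        String uri = dir.toURI().toString();
        String uriPath = dir.toURI().getPath();

        System.out.println(uri);      // e.g. file:/tmp/spark-demo (trailing slash if the directory exists)
        System.out.println(uriPath);  // e.g. /tmp/spark-demo
    }
}
```

This is why the fixes below replace `path.toString`/`getCanonicalPath` with `path.toURI.toString`/`dir.toURI.getPath`.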
and to skip the tests below, which fail on Windows due to the path length limitation:
- `SparkLauncherSuite`
```
Test org.apache.spark.launcher.SparkLauncherSuite.testChildProcLauncher failed: java.lang.AssertionError: expected:<0> but was:<1>, took 0.062 sec
at org.apache.spark.launcher.SparkLauncherSuite.testChildProcLauncher(SparkLauncherSuite.java:177)
...
```
The stderr from the process is `The filename or extension is too long`, which is the same error as the one below.
- `BroadcastJoinSuite`
```
04:09:40.882 ERROR org.apache.spark.deploy.worker.ExecutorRunner: Error running executor
java.io.IOException: Cannot run program "C:\Progra~1\Java\jdk1.8.0\bin\java" (in directory "C:\projects\spark\work\app-20161205040542-0000\51658"): CreateProcess error=206, The filename or extension is too long
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1048)
at org.apache.spark.deploy.worker.ExecutorRunner.org$apache$spark$deploy$worker$ExecutorRunner$$fetchAndRunExecutor(ExecutorRunner.scala:167)
at org.apache.spark.deploy.worker.ExecutorRunner$$anon$1.run(ExecutorRunner.scala:73)
Caused by: java.io.IOException: CreateProcess error=206, The filename or extension is too long
at java.lang.ProcessImpl.create(Native Method)
at java.lang.ProcessImpl.<init>(ProcessImpl.java:386)
at java.lang.ProcessImpl.start(ProcessImpl.java:137)
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1029)
... 2 more
04:09:40.929 ERROR org.apache.spark.deploy.worker.ExecutorRunner: Error running executor
(the same error message apparently repeats indefinitely)
...
```
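The skip itself relies on ScalaTest's `assume`, which cancels a test rather than failing it when the condition does not hold; the condition `!Utils.isWindows` boils down to checking the `os.name` system property. A minimal standalone sketch of that check (the helper name here is illustrative, not Spark's actual API):

```java
public class WindowsSkipDemo {
    // Illustrative re-implementation of an "is Windows" check based on
    // the os.name system property.
    static boolean isWindowsName(String osName) {
        return osName != null && osName.toLowerCase().startsWith("windows");
    }

    public static void main(String[] args) {
        String os = System.getProperty("os.name");
        if (isWindowsName(os)) {
            // In the suites below, assume(!Utils.isWindows) cancels the
            // test here instead of letting it fail.
            System.out.println("Skipping: path length limitation on " + os);
            return;
        }
        System.out.println("Running test on " + os);
    }
}
```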
## How was this patch tested?
Manually tested via AppVeyor.
**Before**
`InsertSuite`: https://ci.appveyor.com/project/spark-test/spark/build/148-InsertSuite-pr
`PathOptionSuite`: https://ci.appveyor.com/project/spark-test/spark/build/139-PathOptionSuite-pr
`UDFSuite`: https://ci.appveyor.com/project/spark-test/spark/build/143-UDFSuite-pr
`SparkLauncherSuite`: https://ci.appveyor.com/project/spark-test/spark/build/141-SparkLauncherSuite-pr
`BroadcastJoinSuite`: https://ci.appveyor.com/project/spark-test/spark/build/145-BroadcastJoinSuite-pr
**After**
`PathOptionSuite`: https://ci.appveyor.com/project/spark-test/spark/build/140-PathOptionSuite-pr
`SparkLauncherSuite`: https://ci.appveyor.com/project/spark-test/spark/build/142-SparkLauncherSuite-pr
`UDFSuite`: https://ci.appveyor.com/project/spark-test/spark/build/144-UDFSuite-pr
`InsertSuite`: https://ci.appveyor.com/project/spark-test/spark/build/147-InsertSuite-pr
`BroadcastJoinSuite`: https://ci.appveyor.com/project/spark-test/spark/build/149-BroadcastJoinSuite-pr
Author: hyukjinkwon <gurwls223@gmail.com>
Closes #16147 from HyukjinKwon/fix-tests.
Diffstat (limited to 'sql/core')
4 files changed, 21 insertions, 8 deletions
diff --git a/sql/core/src/test/scala/org/apache/spark/sql/UDFSuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/UDFSuite.scala
index 547d3c1abe..e8ccefa69a 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/UDFSuite.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/UDFSuite.scala
@@ -64,7 +64,7 @@ class UDFSuite extends QueryTest with SharedSQLContext {
       data.write.parquet(dir.getCanonicalPath)
       spark.read.parquet(dir.getCanonicalPath).createOrReplaceTempView("test_table")
       val answer = sql("select input_file_name() from test_table").head().getString(0)
-      assert(answer.contains(dir.getCanonicalPath))
+      assert(answer.contains(dir.toURI.getPath))
       assert(sql("select input_file_name() from test_table").distinct().collect().length >= 2)
       spark.catalog.dropTempView("test_table")
     }
diff --git a/sql/core/src/test/scala/org/apache/spark/sql/execution/joins/BroadcastJoinSuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/execution/joins/BroadcastJoinSuite.scala
index 83db81ea3f..7c4f763322 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/execution/joins/BroadcastJoinSuite.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/execution/joins/BroadcastJoinSuite.scala
@@ -28,6 +28,7 @@ import org.apache.spark.sql.functions._
 import org.apache.spark.sql.internal.SQLConf
 import org.apache.spark.sql.test.SQLTestUtils
 import org.apache.spark.sql.types.{LongType, ShortType}
+import org.apache.spark.util.Utils
 
 /**
  * Test various broadcast join operators.
@@ -85,31 +86,39 @@ class BroadcastJoinSuite extends QueryTest with SQLTestUtils {
     plan
   }
 
+  // This tests here are failed on Windows due to the failure of initiating executors
+  // by the path length limitation. See SPARK-18718.
   test("unsafe broadcast hash join updates peak execution memory") {
+    assume(!Utils.isWindows)
     testBroadcastJoinPeak[BroadcastHashJoinExec]("unsafe broadcast hash join", "inner")
   }
 
   test("unsafe broadcast hash outer join updates peak execution memory") {
+    assume(!Utils.isWindows)
     testBroadcastJoinPeak[BroadcastHashJoinExec]("unsafe broadcast hash outer join", "left_outer")
   }
 
   test("unsafe broadcast left semi join updates peak execution memory") {
+    assume(!Utils.isWindows)
     testBroadcastJoinPeak[BroadcastHashJoinExec]("unsafe broadcast left semi join", "leftsemi")
   }
 
   test("broadcast hint isn't bothered by authBroadcastJoinThreshold set to low values") {
+    assume(!Utils.isWindows)
     withSQLConf(SQLConf.AUTO_BROADCASTJOIN_THRESHOLD.key -> "0") {
       testBroadcastJoin[BroadcastHashJoinExec]("inner", true)
     }
   }
 
   test("broadcast hint isn't bothered by a disabled authBroadcastJoinThreshold") {
+    assume(!Utils.isWindows)
     withSQLConf(SQLConf.AUTO_BROADCASTJOIN_THRESHOLD.key -> "-1") {
       testBroadcastJoin[BroadcastHashJoinExec]("inner", true)
     }
   }
 
   test("broadcast hint isn't propagated after a join") {
+    assume(!Utils.isWindows)
     withSQLConf(SQLConf.AUTO_BROADCASTJOIN_THRESHOLD.key -> "-1") {
       val df1 = spark.createDataFrame(Seq((1, "4"), (2, "2"))).toDF("key", "value")
       val df2 = spark.createDataFrame(Seq((1, "1"), (2, "2"))).toDF("key", "value")
@@ -137,6 +146,7 @@ class BroadcastJoinSuite extends QueryTest with SQLTestUtils {
   }
 
   test("broadcast hint is propagated correctly") {
+    assume(!Utils.isWindows)
     withSQLConf(SQLConf.AUTO_BROADCASTJOIN_THRESHOLD.key -> "-1") {
       val df2 = spark.createDataFrame(Seq((1, "1"), (2, "2"), (3, "2"))).toDF("key", "value")
       val broadcasted = broadcast(df2)
@@ -157,6 +167,7 @@ class BroadcastJoinSuite extends QueryTest with SQLTestUtils {
   }
 
   test("join key rewritten") {
+    assume(!Utils.isWindows)
     val l = Literal(1L)
     val i = Literal(2)
     val s = Literal.create(3, ShortType)
diff --git a/sql/core/src/test/scala/org/apache/spark/sql/sources/InsertSuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/sources/InsertSuite.scala
index 4a85b5975e..13284ba649 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/sources/InsertSuite.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/sources/InsertSuite.scala
@@ -20,7 +20,6 @@ package org.apache.spark.sql.sources
 
 import java.io.File
 
 import org.apache.spark.sql.{AnalysisException, Row}
-import org.apache.spark.sql.catalyst.TableIdentifier
 import org.apache.spark.sql.test.SharedSQLContext
 import org.apache.spark.util.Utils
@@ -38,7 +37,7 @@ class InsertSuite extends DataSourceTest with SharedSQLContext {
         |CREATE TEMPORARY TABLE jsonTable (a int, b string)
         |USING org.apache.spark.sql.json.DefaultSource
         |OPTIONS (
-        |  path '${path.toString}'
+        |  path '${path.toURI.toString}'
         |)
       """.stripMargin)
   }
diff --git a/sql/core/src/test/scala/org/apache/spark/sql/sources/PathOptionSuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/sources/PathOptionSuite.scala
index bef47aacd3..faf9afc49a 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/sources/PathOptionSuite.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/sources/PathOptionSuite.scala
@@ -17,6 +17,8 @@
 
 package org.apache.spark.sql.sources
 
+import org.apache.hadoop.fs.Path
+
 import org.apache.spark.sql.{DataFrame, SaveMode, SparkSession, SQLContext}
 import org.apache.spark.sql.catalyst.TableIdentifier
 import org.apache.spark.sql.execution.datasources.LogicalRelation
@@ -53,8 +55,8 @@ class TestOptionsRelation(val options: Map[String, String])(@transient val sessi
   // We can't get the relation directly for write path, here we put the path option in schema
   // metadata, so that we can test it later.
   override def schema: StructType = {
-    val metadataWithPath = pathOption.map {
-      path => new MetadataBuilder().putString("path", path).build()
+    val metadataWithPath = pathOption.map { path =>
+      new MetadataBuilder().putString("path", path).build()
     }
     new StructType().add("i", IntegerType, true, metadataWithPath.getOrElse(Metadata.empty))
   }
@@ -82,15 +84,16 @@ class PathOptionSuite extends DataSourceTest with SharedSQLContext {
 
   test("path option also exist for write path") {
     withTable("src") {
-      withTempPath { path =>
+      withTempPath { p =>
+        val path = new Path(p.getAbsolutePath).toString
         sql(
           s"""
             |CREATE TABLE src
             |USING ${classOf[TestOptionsSource].getCanonicalName}
-            |OPTIONS (PATH '${path.getAbsolutePath}')
+            |OPTIONS (PATH '$path')
             |AS SELECT 1
           """.stripMargin)
-        assert(spark.table("src").schema.head.metadata.getString("path") == path.getAbsolutePath)
+        assert(spark.table("src").schema.head.metadata.getString("path") == path)
       }
     }
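In `PathOptionSuite`, wrapping the temp path in `org.apache.hadoop.fs.Path` before interpolation normalizes Windows backslashes to forward slashes, so both sides of the assertion compare the same form. The relevant part of that normalization can be sketched without the Hadoop dependency (a rough illustrative approximation, not Hadoop's full path parsing):

```java
public class PathNormalizeDemo {
    // Rough approximation of the separator normalization that
    // org.apache.hadoop.fs.Path applies to Windows-style paths.
    static String normalize(String path) {
        return path.replace('\\', '/');
    }

    public static void main(String[] args) {
        String windowsStyle = "C:\\projects\\spark\\target\\tmp\\spark-5ab34a58";
        System.out.println(normalize(windowsStyle));
        // -> C:/projects/spark/target/tmp/spark-5ab34a58
    }
}
```

With both the stored metadata value and the expected value routed through the same normalization, the comparison in the test no longer depends on the platform separator.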