aboutsummaryrefslogtreecommitdiff
path: root/sql
diff options
context:
space:
mode:
authorhyukjinkwon <gurwls223@gmail.com>2016-12-14 19:24:24 +0000
committerSean Owen <sowen@cloudera.com>2016-12-14 19:24:24 +0000
commitc6b8eb71a9638c9a8ce02d11d5fe26f4c5be531e (patch)
tree08466a3d1154edbc97d6be688e30c15130b05c8f /sql
parentba4aab9b85688141d3d0c185165ec7a402c9fbba (diff)
downloadspark-c6b8eb71a9638c9a8ce02d11d5fe26f4c5be531e.tar.gz
spark-c6b8eb71a9638c9a8ce02d11d5fe26f4c5be531e.tar.bz2
spark-c6b8eb71a9638c9a8ce02d11d5fe26f4c5be531e.zip
[SPARK-18842][TESTS][LAUNCHER] De-duplicate paths in classpaths in commands for local-cluster mode to work around the path length limitation on Windows
## What changes were proposed in this pull request? Currently, some tests are failing and hanging on Windows due to this problem. For the reason given in SPARK-18718, some tests using `local-cluster` mode were disabled on Windows due to the length limitation on the paths given to classpaths. The limitation seems to be roughly 32K (see the [blog in MS](https://blogs.msdn.microsoft.com/oldnewthing/20031210-00/?p=41553/) and [another reference](https://support.thoughtworks.com/hc/en-us/articles/213248526-Getting-around-maximum-command-line-length-is-32767-characters-on-Windows)) but in `local-cluster` mode, executors were being launched as processes with a command such as [here](https://gist.github.com/HyukjinKwon/5bc81061c250d4af5a180869b59d42ea) in (only) tests. This length is roughly 40K due to the classpaths given to the `java` command. However, duplicates seem to make up almost half of them. So, if we deduplicate the paths, the command seems to be reduced to roughly 20K, [here](https://gist.github.com/HyukjinKwon/dad0c8db897e5e094684a2dc6a417790). We may need to revisit this as more paths are added in the future, but for now this seems better than disabling all the tests, with minimised changes. Therefore, this PR proposes to deduplicate the paths in classpaths when launching executors as processes in `local-cluster` mode. ## How was this patch tested? Existing tests in `ShuffleSuite` and `BroadcastJoinSuite`, manually via AppVeyor. Author: hyukjinkwon <gurwls223@gmail.com> Closes #16266 from HyukjinKwon/disable-local-cluster-tests.
Diffstat (limited to 'sql')
-rw-r--r--sql/core/src/test/scala/org/apache/spark/sql/execution/joins/BroadcastJoinSuite.scala10
1 file changed, 0 insertions, 10 deletions
diff --git a/sql/core/src/test/scala/org/apache/spark/sql/execution/joins/BroadcastJoinSuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/execution/joins/BroadcastJoinSuite.scala
index 07839359a0..119d6e25df 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/execution/joins/BroadcastJoinSuite.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/execution/joins/BroadcastJoinSuite.scala
@@ -86,39 +86,31 @@ class BroadcastJoinSuite extends QueryTest with SQLTestUtils {
plan
}
- // The tests here are failed on Windows due to the failure of initiating executors
- // by the path length limitation. See SPARK-18718.
test("unsafe broadcast hash join updates peak execution memory") {
- assume(!Utils.isWindows)
testBroadcastJoinPeak[BroadcastHashJoinExec]("unsafe broadcast hash join", "inner")
}
test("unsafe broadcast hash outer join updates peak execution memory") {
- assume(!Utils.isWindows)
testBroadcastJoinPeak[BroadcastHashJoinExec]("unsafe broadcast hash outer join", "left_outer")
}
test("unsafe broadcast left semi join updates peak execution memory") {
- assume(!Utils.isWindows)
testBroadcastJoinPeak[BroadcastHashJoinExec]("unsafe broadcast left semi join", "leftsemi")
}
test("broadcast hint isn't bothered by authBroadcastJoinThreshold set to low values") {
- assume(!Utils.isWindows)
withSQLConf(SQLConf.AUTO_BROADCASTJOIN_THRESHOLD.key -> "0") {
testBroadcastJoin[BroadcastHashJoinExec]("inner", true)
}
}
test("broadcast hint isn't bothered by a disabled authBroadcastJoinThreshold") {
- assume(!Utils.isWindows)
withSQLConf(SQLConf.AUTO_BROADCASTJOIN_THRESHOLD.key -> "-1") {
testBroadcastJoin[BroadcastHashJoinExec]("inner", true)
}
}
test("broadcast hint isn't propagated after a join") {
- assume(!Utils.isWindows)
withSQLConf(SQLConf.AUTO_BROADCASTJOIN_THRESHOLD.key -> "-1") {
val df1 = spark.createDataFrame(Seq((1, "4"), (2, "2"))).toDF("key", "value")
val df2 = spark.createDataFrame(Seq((1, "1"), (2, "2"))).toDF("key", "value")
@@ -146,7 +138,6 @@ class BroadcastJoinSuite extends QueryTest with SQLTestUtils {
}
test("broadcast hint is propagated correctly") {
- assume(!Utils.isWindows)
withSQLConf(SQLConf.AUTO_BROADCASTJOIN_THRESHOLD.key -> "-1") {
val df2 = spark.createDataFrame(Seq((1, "1"), (2, "2"), (3, "2"))).toDF("key", "value")
val broadcasted = broadcast(df2)
@@ -167,7 +158,6 @@ class BroadcastJoinSuite extends QueryTest with SQLTestUtils {
}
test("join key rewritten") {
- assume(!Utils.isWindows)
val l = Literal(1L)
val i = Literal(2)
val s = Literal.create(3, ShortType)