[SPARK-7989] [SPARK-10651] [CORE] [TESTS] Increase timeout to fix flaky tests - spark

author	zsxwing <zsxwing@gmail.com>	2015-09-21 11:39:04 -0700
committer	Xiangrui Meng <meng@databricks.com>	2015-09-21 11:39:04 -0700
commit	ebbf85f07bb8de0d566f1ae4b41f26421180bebe (patch)
tree	63e9adc9220c2920970bf8ad0b78d1487c7ba0e2 /docs/running-on-yarn.md
parent	20a61dbd9b57957fcc5b58ef8935533914172b07 (diff)

[SPARK-7989] [SPARK-10651] [CORE] [TESTS] Increase timeout to fix flaky tests

I noticed only one block manager registered with master in an unsuccessful build (https://amplab.cs.berkeley.edu/jenkins/job/Spark-Master-SBT/AMPLAB_JENKINS_BUILD_PROFILE=hadoop2.2,label=spark-test/3534/)

```
15/09/16 13:02:30.981 pool-1-thread-1-ScalaTest-running-BroadcastSuite INFO SparkContext: Running Spark version 1.6.0-SNAPSHOT
...
15/09/16 13:02:38.133 sparkDriver-akka.actor.default-dispatcher-19 INFO BlockManagerMasterEndpoint: Registering block manager localhost:48196 with 530.3 MB RAM, BlockManagerId(0, localhost, 48196)
```

In addition, the first block manager needed 7+ seconds to start. But the test expected 2 block managers so it failed. However, there was no exception in this log file.

So I checked a successful build (https://amplab.cs.berkeley.edu/jenkins/job/Spark-Master-SBT/3536/AMPLAB_JENKINS_BUILD_PROFILE=hadoop2.2,label=spark-test/) and it needed 4-5 seconds to set up the local cluster:

```
15/09/16 18:11:27.738 sparkWorker1-akka.actor.default-dispatcher-5 INFO Worker: Running Spark version 1.6.0-SNAPSHOT
...
15/09/16 18:11:30.838 sparkDriver-akka.actor.default-dispatcher-20 INFO BlockManagerMasterEndpoint: Registering block manager localhost:54202 with 530.3 MB RAM, BlockManagerId(1, localhost, 54202)
15/09/16 18:11:32.112 sparkDriver-akka.actor.default-dispatcher-20 INFO BlockManagerMasterEndpoint: Registering block manager localhost:32955 with 530.3 MB RAM, BlockManagerId(0, localhost, 32955)
```

In this build, the first block manager needed only 3+ seconds to start.

Comparing these two builds, I guess it's possible that the local cluster in `BroadcastSuite` cannot be ready in 10 seconds if the Jenkins worker is busy. So I just increased the timeout to 60 seconds to see if this can fix the issue.

Author: zsxwing <zsxwing@gmail.com>

Closes #8813 from zsxwing/fix-BroadcastSuite.
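
To illustrate the failure mode described above, here is a minimal Python sketch of a poll-until-ready helper with a deadline. It is not the code changed by this commit: `wait_until` and `FakeCluster` are hypothetical names invented for illustration, and the delays are scaled down to fractions of a second so the example runs quickly. It only shows why a wait that is shorter than the slowest block manager's startup time reports "not ready" without raising any exception from the cluster itself.

```python
import time


def wait_until(condition, timeout_s, poll_interval_s=0.01):
    """Poll `condition` until it returns True or `timeout_s` elapses.

    Returns True if the condition was met in time, False otherwise.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval_s)


class FakeCluster:
    """Hypothetical stand-in for a local cluster (assumption, not Spark code).

    Each entry in `startup_delays_s` is how long one block manager takes
    to register after the cluster is created.
    """

    def __init__(self, startup_delays_s):
        self._start = time.monotonic()
        self._delays = list(startup_delays_s)

    def registered_block_managers(self):
        elapsed = time.monotonic() - self._start
        return sum(1 for delay in self._delays if delay <= elapsed)


def cluster_ready_within(timeout_s, startup_delays_s, expected=2):
    """Start a fake cluster and wait for `expected` block managers."""
    cluster = FakeCluster(startup_delays_s)
    return wait_until(
        lambda: cluster.registered_block_managers() >= expected,
        timeout_s,
    )
```

With a slow second block manager, for example `cluster_ready_within(0.1, [0.05, 0.3])`, the wait gives up with only one block manager registered, while a more generous `cluster_ready_within(5.0, [0.05, 0.3])` succeeds. A longer timeout costs nothing on a fast machine because the helper returns as soon as the condition holds.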

Diffstat (limited to 'docs/running-on-yarn.md')

0 files changed, 0 insertions, 0 deletions

