author     Marcelo Vanzin <vanzin@cloudera.com>    2015-03-18 09:18:28 -0400
committer  Sean Owen <sowen@cloudera.com>          2015-03-18 09:18:28 -0400
commit     981fbafa2a878e86abeefe1d77cca01fd848f9f6 (patch)
tree       dfbe80ea4bf639ee9d646d97af003469b8e8e4b7 /yarn/src/test
parent     9d112a958ee2facad179344dd367a6d1ccbc9614 (diff)
[SPARK-6325] [core,yarn] Do not change target executor count when killing executors.
The dynamic execution code has two ways to reduce the number of executors: first, by
lowering the total number of executors it wants, asking for an absolute number that is
lower than the previous one; second, by explicitly killing idle executors.
YarnAllocator was mixing those up and lowering the target number of executors
when a kill was issued. Instead, trust that the frontend knows what it's doing, and kill
executors without touching the other accounting. That means that if the frontend
kills an executor without lowering the target, it will get a new executor shortly.
The one situation where both actions (lower the target and kill executor) need to
happen together is when user code explicitly calls `SparkContext.killExecutors`.
In that case, issue two calls to the backend to achieve the goal.
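The distinction can be sketched with a toy allocator model (the class and method names here are illustrative, not Spark's actual internals): a plain kill leaves the target untouched, so the allocator asks for a replacement, while the `SparkContext.killExecutors` path lowers the target first and then kills.

```scala
// Toy model of the fixed accounting; hypothetical names, not Spark's real API.
class ToyAllocator(var targetNumExecutors: Int) {
  var running: Set[String] = Set.empty

  // Frontend sets an absolute target; kills are accounted separately.
  def requestTotalExecutors(n: Int): Unit = { targetNumExecutors = n }

  // Killing an executor no longer lowers the target (the bug being fixed).
  def killExecutor(id: String): Unit = { running -= id }

  // Executors still needed to reach the target.
  def pendingAllocate: Int = math.max(0, targetNumExecutors - running.size)
}

object ToyAllocatorDemo {
  def main(args: Array[String]): Unit = {
    // Scenario A: frontend kills an idle executor without lowering the target,
    // so the allocator will request a replacement.
    val a = new ToyAllocator(2)
    a.running = Set("exec-1", "exec-2")
    a.killExecutor("exec-2")
    assert(a.pendingAllocate == 1)

    // Scenario B: user-level killExecutors issues two calls to the backend:
    // lower the target, then kill. No replacement is requested.
    val b = new ToyAllocator(2)
    b.running = Set("exec-1", "exec-2")
    b.requestTotalExecutors(1) // first call: lower the target
    b.killExecutor("exec-2")   // second call: kill the executor
    assert(b.pendingAllocate == 0)
  }
}
```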
I also did some minor cleanup in related code:
- avoid sending a request for executors when target is unchanged, to avoid log
spam in the AM
- avoid printing misleading log messages in the AM when there are no requests
to cancel
- fix a slow memory leak and a misleading error message on the driver caused by
failing to completely unregister the executor.
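The first cleanup item amounts to a guard that only forwards a request when the target actually changed. A minimal sketch of that idea (hypothetical names, not the actual Spark code):

```scala
// Illustrative "skip unchanged target" guard; not Spark's actual implementation.
class ToyFrontend {
  private var lastTarget: Int = -1
  var requestsSent: Int = 0

  // Returns true if a request was actually sent to the AM.
  // Repeating the same target is a no-op, avoiding redundant
  // requests and the resulting log spam on the AM side.
  def requestTotalExecutors(target: Int): Boolean = {
    if (target == lastTarget) {
      false
    } else {
      lastTarget = target
      requestsSent += 1
      true
    }
  }
}
```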
Author: Marcelo Vanzin <vanzin@cloudera.com>
Closes #5018 from vanzin/SPARK-6325 and squashes the following commits:
2e782a3 [Marcelo Vanzin] Avoid redundant logging on the AM side.
a3567cd [Marcelo Vanzin] Add parentheses.
a363926 [Marcelo Vanzin] Update logic.
a158101 [Marcelo Vanzin] [SPARK-6325] [core,yarn] Disallow reducing executor count past running count.
Diffstat (limited to 'yarn/src/test')
-rw-r--r-- | yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnAllocatorSuite.scala | 22 |
1 file changed, 22 insertions, 0 deletions
diff --git a/yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnAllocatorSuite.scala b/yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnAllocatorSuite.scala
index 3c224f1488..c09b01bafc 100644
--- a/yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnAllocatorSuite.scala
+++ b/yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnAllocatorSuite.scala
@@ -206,6 +206,28 @@ class YarnAllocatorSuite extends FunSuite with Matchers with BeforeAndAfterEach
     handler.getNumExecutorsRunning should be (2)
   }
 
+  test("kill executors") {
+    val handler = createAllocator(4)
+    handler.updateResourceRequests()
+    handler.getNumExecutorsRunning should be (0)
+    handler.getNumPendingAllocate should be (4)
+
+    val container1 = createContainer("host1")
+    val container2 = createContainer("host2")
+    handler.handleAllocatedContainers(Array(container1, container2))
+
+    handler.requestTotalExecutors(1)
+    handler.executorIdToContainer.keys.foreach { id => handler.killExecutor(id) }
+
+    val statuses = Seq(container1, container2).map { c =>
+      ContainerStatus.newInstance(c.getId(), ContainerState.COMPLETE, "Finished", 0)
+    }
+    handler.updateResourceRequests()
+    handler.processCompletedContainers(statuses.toSeq)
+    handler.getNumExecutorsRunning should be (0)
+    handler.getNumPendingAllocate should be (1)
+  }
+
   test("memory exceeded diagnostic regexes") {
     val diagnostics =
       "Container [pid=12465,containerID=container_1412887393566_0003_01_000002] is running " +