author | zsxwing <zsxwing@gmail.com> | 2015-08-24 23:34:50 -0700 |
---|---|---|
committer | Tathagata Das <tathagata.das1565@gmail.com> | 2015-08-24 23:34:50 -0700 |
commit | f023aa2fcc1d1dbb82aee568be0a8f2457c309ae (patch) | |
tree | c3074560d77c11d77522af81d9711cf53f0831a4 /sql | |
parent | d9c25dec87e6da7d66a47ff94e7eefa008081b9d (diff) | |
download | spark-f023aa2fcc1d1dbb82aee568be0a8f2457c309ae.tar.gz spark-f023aa2fcc1d1dbb82aee568be0a8f2457c309ae.tar.bz2 spark-f023aa2fcc1d1dbb82aee568be0a8f2457c309ae.zip |
[SPARK-10137] [STREAMING] Avoid to restart receivers if scheduleReceivers returns balanced results
This PR fixes the following cases for `ReceiverSchedulingPolicy`.
1) Assume there are 4 executors: host1, host2, host3, host4, and 5 receivers: r1, r2, r3, r4, r5. Then `ReceiverSchedulingPolicy.scheduleReceivers` will return (r1 -> host1, r2 -> host2, r3 -> host3, r4 -> host4, r5 -> host1).
Assume r1 starts first on `host1`, as `scheduleReceivers` suggested, and tries to register with ReceiverTracker. The previous `ReceiverSchedulingPolicy.rescheduleReceiver` returns (host2, host3, host4) according to the current executor weights (host1 -> 1.0, host2 -> 0.5, host3 -> 0.5, host4 -> 0.5), so ReceiverTracker rejects `r1`. This is unexpected, since r1 is starting exactly where `scheduleReceivers` suggested.
This case can be fixed by ignoring the information in `receiverTrackingInfoMap` about the receiver that is being rescheduled.
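The idea behind this fix can be illustrated with a minimal sketch. It is not the code in this patch: `IgnoreSelfSketch`, `TrackingInfo`, `weights` and `weightsIgnoringSelf` are hypothetical names, and the weight model (each receiver spreads a total weight of 1.0 evenly over its hosts) is a simplifying assumption, so the numbers differ from the weights quoted above.

```scala
object IgnoreSelfSketch {
  // Hypothetical, simplified stand-in for receiverTrackingInfoMap:
  // receiver id -> hosts the receiver is scheduled on or running on.
  type TrackingInfo = Map[Int, Seq[String]]

  // Assumed weight model: each receiver spreads 1.0 evenly over its hosts.
  def weights(executors: Seq[String], info: TrackingInfo): Map[String, Double] = {
    val base: Map[String, Double] = executors.map(_ -> 0.0).toMap
    info.values.foldLeft(base) { (acc, hosts) =>
      hosts.foldLeft(acc) { (a, h) =>
        a.updated(h, a.getOrElse(h, 0.0) + 1.0 / hosts.size)
      }
    }
  }

  // Drop the rescheduling receiver's own entry before computing weights,
  // so its own placement does not count against the host it was given.
  def weightsIgnoringSelf(
      receiverId: Int,
      executors: Seq[String],
      info: TrackingInfo): Map[String, Double] =
    weights(executors, info - receiverId)
}
```

With the 4 executors and 5 receivers from case 1, counting r1's own entry makes `host1` look more loaded than it is from r1's point of view. Once that entry is dropped, `host1` is no heavier than the other hosts in this model, so r1 is no longer pushed away from it.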
2) Assume there are 3 executors (host1, host2, host3), each with 3 cores, and 3 receivers: r1, r2, r3. Assume r1 is running on host1. When r2 restarts, the previous `ReceiverSchedulingPolicy.rescheduleReceiver` always returns (host1, host2, host3), so TaskScheduler may schedule r2 on host1. The same applies to r3. In the end, all 3 receivers may be running on host1 while host2 and host3 are idle.
This issue can be fixed by returning only the executors that have the minimum weight, rather than always returning at least 3 executors.
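A minimal sketch of the "minimum weight only" selection, under the assumption that executor weights are already available as a map from host to weight. `MinWeightSketch` and `leastLoaded` are hypothetical names used for illustration, not the API changed by this patch.

```scala
object MinWeightSketch {
  // Return only the executors whose weight equals the minimum weight.
  // Uses exact Double comparison, which is enough for this illustration.
  def leastLoaded(weights: Map[String, Double]): Seq[String] =
    if (weights.isEmpty) Seq.empty
    else {
      val min = weights.values.min
      // Sort the result so the output order is deterministic.
      weights.collect { case (host, w) if w == min => host }.toSeq.sorted
    }
}
```

For case 2, with r1 running on host1, the weights in this model would be (host1 -> 1.0, host2 -> 0.0, host3 -> 0.0). `leastLoaded` then yields only host2 and host3, so r2 cannot be placed on host1.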
Author: zsxwing <zsxwing@gmail.com>
Closes #8340 from zsxwing/fix-receiver-scheduling.
Diffstat (limited to 'sql')
0 files changed, 0 insertions, 0 deletions