author Tal Sliwowicz <tal.s@taboola.com> 2014-10-23 10:51:06 -0700
committer Andrew Or <andrewor14@gmail.com> 2014-10-23 10:53:53 -0700
commit 6b485225271a3c616c4fa1231c20090a95c86f32
tree 8bc5d548c5bdac2217ace4938485e576334982fd /project
parent f799700eec4a5e33db9b2d6a4bee60a50fd5a099
[SPARK-4006] In long running contexts, we encountered the situation of double register without a remove in between.

The cause for that is unknown, and is assumed to be a temporary network issue. However, since the second register is with a BlockManagerId on a different port, blockManagerInfo.contains() returns false, while blockManagerIdByExecutor returns Some. This inconsistency is caught in a conditional statement that does System.exit(1), which is a huge robustness issue for us.

The fix: simply remove the old id from both maps during register when this happens. We are mimicking the behavior of expireDeadHosts(), by doing local cleanup of the maps before trying to add new ones.

Also added some logging for register and unregister.

This is just like https://github.com/apache/spark/pull/2854 except it's on master.

Author: Tal Sliwowicz <tal.s@taboola.com>

Closes #2886 from tsliwowicz/master-block-mgr-removal and squashes the following commits:

094d508 [Tal Sliwowicz] some more white space change undone
41a2217 [Tal Sliwowicz] some more whitspaces change undone
7bcfc3d [Tal Sliwowicz] whitspaces fix
df9d98f [Tal Sliwowicz] Code review comments fixed
f48bce9 [Tal Sliwowicz] In long running contexts, we encountered the situation of double register without a remove in between. The cause for that is unknown, and assumed a temp network issue.
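
The register-time cleanup described in the commit message can be illustrated with a small, language-neutral sketch. This is not the Spark code (which is Scala and is not shown on this page); it is a simplified Python model under stated assumptions. The two dictionaries are named after blockManagerInfo and blockManagerIdByExecutor from the message, while the `Registry` class, its method names, and the placeholder info value are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class BlockManagerId:
    """Hypothetical stand-in: an id is distinguished by executor, host and port."""
    executor_id: str
    host: str
    port: int


class Registry:
    """Simplified model of the two maps named in the commit message."""

    def __init__(self) -> None:
        # Models blockManagerInfo: id -> per-manager info (placeholder dict here).
        self.block_manager_info: Dict[BlockManagerId, dict] = {}
        # Models blockManagerIdByExecutor: executor id -> current id.
        self.block_manager_id_by_executor: Dict[str, BlockManagerId] = {}

    def remove(self, bm_id: BlockManagerId) -> None:
        """Local cleanup: drop the id from both maps."""
        self.block_manager_info.pop(bm_id, None)
        if self.block_manager_id_by_executor.get(bm_id.executor_id) == bm_id:
            del self.block_manager_id_by_executor[bm_id.executor_id]

    def register(self, bm_id: BlockManagerId) -> Optional[BlockManagerId]:
        """Register bm_id; return the stale id that was evicted, if any."""
        evicted = None
        if bm_id not in self.block_manager_info:
            old = self.block_manager_id_by_executor.get(bm_id.executor_id)
            if old is not None:
                # Double register without a remove in between (for example the
                # same executor on a different port): clean up the old id
                # instead of treating the inconsistency as fatal.
                self.remove(old)
                evicted = old
            self.block_manager_id_by_executor[bm_id.executor_id] = bm_id
            self.block_manager_info[bm_id] = {"registered": True}
        return evicted
```

In this model, registering the same executor twice on different ports leaves exactly one entry in each map, and the second call returns the id it replaced, which is a natural place to emit the kind of log line the commit mentions.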
Diffstat (limited to 'project')
0 files changed, 0 insertions, 0 deletions