author: Herman van Hovell <hvanhovell@databricks.com> 2017-03-15 10:46:05 +0100
committer: Herman van Hovell <hvanhovell@databricks.com> 2017-03-15 10:46:05 +0100
commit: 9ff85be3bd6bf3a782c0e52fa9c2598d79f310bb
tree: 7824feab4c6da92cec27d0e1c3e5f2aba5c9fffd /python/pyspark/rdd.py
parent: ee36bc1c9043ead3c3ba4fba7e68c6c47ad7ae7a
[SPARK-19889][SQL] Make TaskContext callbacks thread safe
## What changes were proposed in this pull request?

It is sometimes useful to use multiple threads within a task to parallelize work. These threads might register completion or failure listeners to clean up when the task completes or fails. We currently cannot register such a callback and be sure that it will get called, because the context might already be in the process of invoking its callbacks when the callback gets registered.

This PR improves this by making sure that you cannot add a completion/failure listener from one thread while the context is being marked as completed/failed in another thread. This is done by synchronizing these methods on the task context itself.

Failure listeners were already called only once. Completion listeners now follow the same pattern; this lifts the idempotency requirement for completion listeners and makes them easier to implement. In some cases we can (accidentally) add a completion/failure listener after the fact; these listeners will be called immediately in order to make sure we can safely clean up after a task.

As a result of this change we could make the `failure` and `completed` flags non-volatile. The `isCompleted()` method now uses synchronization to ensure that updates are visible across threads.

## How was this patch tested?

Added tests to `TaskContextSuite` that cover adding listeners to a completed/failed context.

Author: Herman van Hovell <hvanhovell@databricks.com>

Closes #17244 from hvanhovell/SPARK-19889.
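
The listener behaviour described in the commit message can be illustrated with a small Python sketch. This is not Spark's code: `SketchTaskContext` and its method names are hypothetical, a re-entrant lock stands in for synchronizing on the context object, and only completion listeners are modelled. The order in which listeners run here is simply registration order, chosen for the sketch.

```python
import threading


class SketchTaskContext:
    """Hypothetical stand-in for a task context; not Spark's implementation."""

    def __init__(self):
        # Re-entrant so that a listener may itself register another
        # listener from the same thread while callbacks are running.
        self._lock = threading.RLock()
        self._completed = False  # plain flag; only touched under the lock
        self._on_complete = []

    def is_completed(self):
        # Reading under the lock makes updates visible across threads.
        with self._lock:
            return self._completed

    def add_completion_listener(self, listener):
        with self._lock:
            if self._completed:
                # Registered after the fact: run immediately so that
                # cleanup still happens.
                listener(self)
            else:
                self._on_complete.append(listener)
        return self

    def mark_completed(self):
        with self._lock:
            if self._completed:
                return  # listeners run only once
            self._completed = True
            for listener in self._on_complete:
                listener(self)


ctx = SketchTaskContext()
calls = []
ctx.add_completion_listener(lambda c: calls.append("early"))
ctx.mark_completed()
ctx.mark_completed()  # second call is a no-op
ctx.add_completion_listener(lambda c: calls.append("late"))
print(calls)  # ['early', 'late']
```

Because registration and completion take the same lock, a listener is either stored before the callbacks start running or invoked immediately afterwards; it cannot be appended to the list and then missed.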
Diffstat (limited to 'python/pyspark/rdd.py')
0 files changed, 0 insertions, 0 deletions