path: root/python/pyspark/__init__.py
author: Bryan Cutler <cutlerb@gmail.com> 2017-03-03 16:43:45 -0800
committer: Joseph K. Bradley <joseph@databricks.com> 2017-03-03 16:43:45 -0800
commit: 44281ca81d4eda02b627ba21841108438b7d1c27
tree: 4125cfa2e8dd98e247ae7240d88f3845ce871734 /python/pyspark/__init__.py
parent: 2a7921a813ecd847fd933ffef10edc64684e9df7
[SPARK-19348][PYTHON] PySpark keyword_only decorator is not thread-safe
## What changes were proposed in this pull request?

The `keyword_only` decorator in PySpark is not thread-safe. It writes kwargs to a static class variable in the decorator, which is then retrieved later in the class method as `_input_kwargs`. If multiple threads are constructing the same class with different kwargs, it becomes a race condition to read from the static class variable before it's overwritten. See [SPARK-19348](https://issues.apache.org/jira/browse/SPARK-19348) for reproduction code.

This change will write the kwargs to a member variable so that multiple threads can operate on separate instances without the race condition. It does not protect against multiple threads operating on a single instance, but that is better left to the user to synchronize.

## How was this patch tested?

Added new unit tests for using the keyword_only decorator and a regression test that verifies `_input_kwargs` can be overwritten from different class instances.

Author: Bryan Cutler <cutlerb@gmail.com>

Closes #16782 from BryanCutler/pyspark-keyword_only-threadsafe-SPARK-19348.
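The shared-state problem described above can be illustrated without threads. The following is a minimal sketch, not the reproduction code from the JIRA ticket: `keyword_only_old` is a hypothetical name for the pre-patch wrapper (copied from the removed lines of the diff), and `Params` is a hypothetical stand-in for a PySpark class.

```python
from functools import wraps


def keyword_only_old(func):
    """Pre-patch behavior: kwargs are stored on the wrapper function itself."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        if len(args) > 1:
            raise TypeError("Method %s forces keyword arguments." % func.__name__)
        # One storage slot per decorated method, shared by every instance.
        wrapper._input_kwargs = kwargs
        return func(*args, **kwargs)
    return wrapper


class Params(object):  # hypothetical class, for illustration only
    @keyword_only_old
    def __init__(self, x=None):
        pass


a = Params(x=1)
b = Params(x=2)

# Both constructions wrote to the same function attribute, so the kwargs
# passed for `a` are no longer recoverable once `b` has been constructed.
shared = Params.__init__._input_kwargs
print(shared)  # {'x': 2}
```

With two threads, the second write can land between the first thread's write and its later read of `_input_kwargs`, which is the race this change removes.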
Diffstat (limited to 'python/pyspark/__init__.py')
-rw-r--r-- python/pyspark/__init__.py | 10
1 file changed, 6 insertions, 4 deletions
diff --git a/python/pyspark/__init__.py b/python/pyspark/__init__.py
index 9331e74eed..14c51a306e 100644
--- a/python/pyspark/__init__.py
+++ b/python/pyspark/__init__.py
@@ -93,13 +93,15 @@ def keyword_only(func):
     """
     A decorator that forces keyword arguments in the wrapped method
     and saves actual input keyword arguments in `_input_kwargs`.
+
+    .. note:: Should only be used to wrap a method where first arg is `self`
     """
     @wraps(func)
-    def wrapper(*args, **kwargs):
-        if len(args) > 1:
+    def wrapper(self, *args, **kwargs):
+        if len(args) > 0:
             raise TypeError("Method %s forces keyword arguments." % func.__name__)
-        wrapper._input_kwargs = kwargs
-        return func(*args, **kwargs)
+        self._input_kwargs = kwargs
+        return func(self, **kwargs)
     return wrapper
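As a rough usage sketch of the patched decorator, the snippet below constructs instances from several threads and checks that each instance keeps its own kwargs. The decorator body is taken from the added lines of the diff; the `Params` class, the `build` helper, and the thread count are assumptions made for illustration and are not part of the patch or its unit tests.

```python
import threading
from functools import wraps


def keyword_only(func):
    """Patched behavior: kwargs are stored on the instance (`self`)."""
    @wraps(func)
    def wrapper(self, *args, **kwargs):
        if len(args) > 0:
            raise TypeError("Method %s forces keyword arguments." % func.__name__)
        self._input_kwargs = kwargs
        return func(self, **kwargs)
    return wrapper


class Params(object):  # hypothetical class, for illustration only
    @keyword_only
    def __init__(self, x=None):
        # Read the kwargs back from the instance, not from shared state.
        self.x = self._input_kwargs.get("x")


results = {}


def build(i):
    # Each thread constructs its own instance with distinct kwargs.
    results[i] = Params(x=i)


threads = [threading.Thread(target=build, args=(i,)) for i in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Every instance should still hold exactly the kwargs it was built with.
mismatches = [i for i, p in results.items() if p._input_kwargs != {"x": i}]

# Positional arguments are still rejected.
try:
    Params(1)
    positional_rejected = False
except TypeError:
    positional_rejected = True
```

As the commit message notes, this only isolates separate instances from each other; calling a decorated method on one instance from several threads still needs external synchronization.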