path: root/python/pyspark/sql/dataframe.py
author: Liang-Chi Hsieh <viirya@appier.com> 2015-09-21 23:46:00 -0700
committer: Reynold Xin <rxin@databricks.com> 2015-09-21 23:46:00 -0700
commit: 1fcefef06950e2f03477282368ca835bbf40ff24
tree: 9fd7829e91af85b3f682f677721ca00b1c96eb30 /python/pyspark/sql/dataframe.py
parent: 781b21ba2a873ed29394c8dbc74fc700e3e0d17e
[SPARK-10446][SQL] Support to specify join type when calling join with usingColumns

JIRA: https://issues.apache.org/jira/browse/SPARK-10446

Currently the method `join(right: DataFrame, usingColumns: Seq[String])` only supports inner join. It is more convenient to have it support other join types.

Author: Liang-Chi Hsieh <viirya@appier.com>

Closes #8600 from viirya/usingcolumns_df.
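
To illustrate why a join type matters when joining on shared column names, here is a toy model in plain Python. It is not Spark code: `join_using`, the sample rows, and the restriction to `inner` and `left_outer` are all assumptions made for this sketch.

```python
def join_using(left, right, key, how="inner"):
    """Toy join of two lists of dicts on one shared column name (hypothetical helper)."""
    if how not in ("inner", "left_outer"):
        raise ValueError("this sketch only models inner and left_outer")

    # Columns that exist only on the right side; used to pad unmatched left rows.
    right_columns = [c for row in right for c in row if c != key]

    # Index the right side by the join key.
    by_key = {}
    for row in right:
        by_key.setdefault(row[key], []).append(row)

    joined = []
    for lrow in left:
        matches = by_key.get(lrow[key], [])
        for rrow in matches:
            # The shared key column appears once in the merged row.
            joined.append({**lrow, **rrow})
        if not matches and how == "left_outer":
            # Keep the left row and fill the right-side columns with None.
            joined.append({**lrow, **{c: None for c in right_columns}})
    return joined


people = [{"name": "Alice", "age": 2}, {"name": "Bob", "age": 5}]
heights = [{"name": "Bob", "height": 85}]

print(join_using(people, heights, "name"))                # inner: only Bob
print(join_using(people, heights, "name", "left_outer"))  # Alice kept, height is None
```

With an inner join the unmatched row for Alice is dropped; with `left_outer` it is retained with a missing height, which is the kind of result the previous inner-only method could not produce.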
Diffstat (limited to 'python/pyspark/sql/dataframe.py')
-rw-r--r-- python/pyspark/sql/dataframe.py | 6
1 files changed, 5 insertions, 1 deletions
diff --git a/python/pyspark/sql/dataframe.py b/python/pyspark/sql/dataframe.py
index fb995fa3a7..80f8d8a0eb 100644
--- a/python/pyspark/sql/dataframe.py
+++ b/python/pyspark/sql/dataframe.py
@@ -567,7 +567,11 @@ class DataFrame(object):
         if on is None or len(on) == 0:
             jdf = self._jdf.join(other._jdf)
         elif isinstance(on[0], basestring):
-            jdf = self._jdf.join(other._jdf, self._jseq(on))
+            if how is None:
+                jdf = self._jdf.join(other._jdf, self._jseq(on), "inner")
+            else:
+                assert isinstance(how, basestring), "how should be basestring"
+                jdf = self._jdf.join(other._jdf, self._jseq(on), how)
         else:
             assert isinstance(on[0], Column), "on should be Column or list of Column"
             if len(on) > 1:
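
The string-column branch added in this hunk can be exercised without a Spark installation using a minimal sketch like the one below. `FakeJavaDataFrame` and `join_using_columns` are hypothetical names introduced only for illustration, a plain list stands in for `self._jseq(on)`, and `str` stands in for Python 2's `basestring`.

```python
class FakeJavaDataFrame:
    """Hypothetical stand-in for the JVM DataFrame; it only records join() arguments."""

    def __init__(self):
        self.calls = []

    def join(self, other, *args):
        self.calls.append(args)
        return self


def join_using_columns(jdf, other_jdf, on, how=None):
    """Mirror the patched branch: pass "inner" when no join type is given."""
    if isinstance(on, str):
        on = [on]
    if how is None:
        return jdf.join(other_jdf, list(on), "inner")
    assert isinstance(how, str), "how should be a string"
    return jdf.join(other_jdf, list(on), how)


left, right = FakeJavaDataFrame(), FakeJavaDataFrame()
join_using_columns(left, right, ["name"])                # falls back to "inner"
join_using_columns(left, right, ["name"], "left_outer")  # join type is forwarded
print(left.calls)
```

The point of the sketch is that the column-name path now always calls the three-argument form of `join`, so omitting `how` behaves as before while an explicit join type is passed through unchanged.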