author: asokadiggs <asoka.diggs@intel.com> 2015-09-29 17:45:18 -0400
committer: Sean Owen <sowen@cloudera.com> 2015-09-29 17:45:18 -0400
commit: c1ad373f26053e1906fce7681c03d130a642bf33
tree: 22e7a2411c6422c19e5daf258d39cdac40f22bd8
parent: 7d399c9daa6769ab234890c551e1b3456e0e6e85
[SPARK-10782] [PYTHON] Update dropDuplicates documentation
Documentation for dropDuplicates() and drop_duplicates() is one and the same. Resolved the error in the example for drop_duplicates using the same approach used for groupby and groupBy, by indicating that dropDuplicates and drop_duplicates are aliases.

Author: asokadiggs <asoka.diggs@intel.com>

Closes #8930 from asokadiggs/jira-10782.
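
The aliasing described above can be illustrated with a minimal sketch. It assumes the alias is made by binding a second name to the same method in the class body, and it uses a hypothetical `TinyFrame` class with rows stored as dicts rather than the real pyspark `DataFrame`. It shows why a single docstring ends up serving both names, and therefore why that docstring has to say which name is the alias.

```python
class TinyFrame:
    """Hypothetical stand-in for a DataFrame (not the pyspark class)."""

    def __init__(self, rows):
        # Rows are plain dicts mapping column name -> hashable value.
        self.rows = list(rows)

    def dropDuplicates(self, subset=None):
        """Return a new TinyFrame with duplicate rows removed,
        optionally only considering certain columns.

        drop_duplicates is an alias for dropDuplicates.
        """
        seen = set()
        kept = []
        for row in self.rows:
            # Compare on the chosen columns, or on all columns if none given.
            cols = subset if subset is not None else sorted(row)
            key = tuple(row[c] for c in cols)
            if key not in seen:
                seen.add(key)
                kept.append(row)
        return TinyFrame(kept)

    # Assumed alias style: a second name bound to the same function object,
    # so both names share one implementation and one docstring.
    drop_duplicates = dropDuplicates


df = TinyFrame([
    {"name": "Alice", "age": 5, "height": 80},
    {"name": "Alice", "age": 5, "height": 80},
    {"name": "Alice", "age": 10, "height": 80},
])
deduped_all = df.drop_duplicates()
deduped_subset = df.dropDuplicates(["name", "height"])
```

Because both names point at one function, any doctest in the shared docstring can only call one of them; a one-line note naming the alias covers the other without duplicating the example.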
Diffstat (limited to 'python')
-rw-r--r-- python/pyspark/sql/dataframe.py | 2
1 files changed, 2 insertions, 0 deletions
diff --git a/python/pyspark/sql/dataframe.py b/python/pyspark/sql/dataframe.py
index b09422aade..033b31983f 100644
--- a/python/pyspark/sql/dataframe.py
+++ b/python/pyspark/sql/dataframe.py
@@ -931,6 +931,8 @@ class DataFrame(object):
"""Return a new :class:`DataFrame` with duplicate rows removed,
optionally only considering certain columns.
+ :func:`drop_duplicates` is an alias for :func:`dropDuplicates`.
+
>>> from pyspark.sql import Row
>>> df = sc.parallelize([ \
Row(name='Alice', age=5, height=80), \