[SPARK-10782] [PYTHON] Update dropDuplicates documentation

Documentation for dropDuplicates() and drop_duplicates() is one and the same. Resolved the error in the example for drop_duplicates using the same approach used for groupby and groupBy, by indicating that dropDuplicates and drop_duplicates are aliases.

Author: asokadiggs <asoka.diggs@intel.com>

Closes #8930 from asokadiggs/jira-10782.
author: asokadiggs <asoka.diggs@intel.com> 2015-09-29 17:45:18 -0400
committer: Sean Owen <sowen@cloudera.com> 2015-09-29 17:45:18 -0400
commit: c1ad373f26053e1906fce7681c03d130a642bf33 (patch)
tree: 22e7a2411c6422c19e5daf258d39cdac40f22bd8 /python
parent: 7d399c9daa6769ab234890c551e1b3456e0e6e85 (diff)
download: spark-c1ad373f26053e1906fce7681c03d130a642bf33.tar.gz
spark-c1ad373f26053e1906fce7681c03d130a642bf33.tar.bz2
spark-c1ad373f26053e1906fce7681c03d130a642bf33.zip
1 files changed, 2 insertions, 0 deletions
diff --git a/python/pyspark/sql/dataframe.py b/python/pyspark/sql/dataframe.py
index b09422aade..033b31983f 100644
--- a/python/pyspark/sql/dataframe.py
+++ b/python/pyspark/sql/dataframe.py
@@ -931,6 +931,8 @@ class DataFrame(object):
         """Return a new :class:`DataFrame` with duplicate rows removed,
         optionally only considering certain columns.
 
+        :func:`drop_duplicates` is an alias for :func:`dropDuplicates`.
+
         >>> from pyspark.sql import Row
         >>> df = sc.parallelize([ \
             Row(name='Alice', age=5, height=80), \
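
The aliasing described in the commit message can be illustrated with a minimal sketch. It assumes the alias is created by binding a second class attribute to the same function, which is why both names end up with one shared docstring and why the docstring has to say that one name is an alias for the other. `TinyFrame` is a hypothetical stand-in written for this example, not part of PySpark, and its keep-the-first-occurrence behaviour is an assumption of the sketch rather than a statement about Spark.

```python
class TinyFrame(object):
    """Hypothetical stand-in for a DataFrame; rows are plain dicts."""

    def __init__(self, rows):
        self.rows = list(rows)

    def dropDuplicates(self, subset=None):
        """Return a new TinyFrame with duplicate rows removed,
        optionally only considering certain columns.

        drop_duplicates is an alias for dropDuplicates.
        """
        seen = set()
        kept = []
        for row in self.rows:
            # Compare on the requested columns, or on every column.
            cols = subset if subset is not None else sorted(row)
            key = tuple((c, row[c]) for c in cols)
            if key not in seen:
                seen.add(key)
                kept.append(row)  # assumption: keep the first occurrence
        return TinyFrame(kept)

    # Alias: both names refer to the same function object, so they
    # also share the single docstring written above.
    drop_duplicates = dropDuplicates


df = TinyFrame([
    {"name": "Alice", "age": 5, "height": 80},
    {"name": "Alice", "age": 5, "height": 80},
    {"name": "Alice", "age": 10, "height": 80},
])

all_columns = df.drop_duplicates()                  # same as df.dropDuplicates()
by_subset = df.dropDuplicates(["name", "height"])   # only these columns compared
same_doc = TinyFrame.drop_duplicates.__doc__ == TinyFrame.dropDuplicates.__doc__
```

Because there is only one docstring, any example in it necessarily calls one of the two names; noting the alias in the text keeps the rendered documentation for the other name accurate.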