path: root/python/pyspark/sql/functions.py
author: anabranch <wac.chambers@gmail.com> 2017-01-08 17:53:53 -0800
committer: Reynold Xin <rxin@databricks.com> 2017-01-08 17:53:53 -0800
commit: 1f6ded6455d07ec8828fc9662ddffe55cbba4238
tree: 0f6545dfdefad8987c26226454a9becf73ade03a /python/pyspark/sql/functions.py
parent: 4351e62207957bec663108a571cff2bfaaa9e7d5
[SPARK-19127][DOCS] Update Rank Function Documentation
## What changes were proposed in this pull request?

- [X] Fix inconsistencies in function reference for dense rank and dense
- [X] Make all languages equivalent in their reference to `dense_rank` and `rank`.

## How was this patch tested?

N/A for docs.

Please review http://spark.apache.org/contributing.html before opening a pull request.

Author: anabranch <wac.chambers@gmail.com>

Closes #16505 from anabranch/SPARK-19127.
Diffstat (limited to 'python/pyspark/sql/functions.py')
-rw-r--r-- python/pyspark/sql/functions.py 16
1 files changed, 10 insertions, 6 deletions
diff --git a/python/pyspark/sql/functions.py b/python/pyspark/sql/functions.py
index d8abafcde3..7fe901a4fb 100644
--- a/python/pyspark/sql/functions.py
+++ b/python/pyspark/sql/functions.py
@@ -157,17 +157,21 @@ _window_functions = {
'dense_rank':
"""returns the rank of rows within a window partition, without any gaps.
- The difference between rank and denseRank is that denseRank leaves no gaps in ranking
- sequence when there are ties. That is, if you were ranking a competition using denseRank
+ The difference between rank and dense_rank is that dense_rank leaves no gaps in ranking
+ sequence when there are ties. That is, if you were ranking a competition using dense_rank
and had three people tie for second place, you would say that all three were in second
- place and that the next person came in third.""",
+ place and that the next person came in third. Rank would give me sequential numbers, making
+ the person that came in third place (after the ties) would register as coming in fifth.
+
+ This is equivalent to the DENSE_RANK function in SQL.""",
'rank':
"""returns the rank of rows within a window partition.
- The difference between rank and denseRank is that denseRank leaves no gaps in ranking
- sequence when there are ties. That is, if you were ranking a competition using denseRank
+ The difference between rank and dense_rank is that dense_rank leaves no gaps in ranking
+ sequence when there are ties. That is, if you were ranking a competition using dense_rank
and had three people tie for second place, you would say that all three were in second
- place and that the next person came in third.
+ place and that the next person came in third. Rank would give me sequential numbers, making
+ the person that came in third place (after the ties) would register as coming in fifth.
This is equivalent to the RANK function in SQL.""",
'cume_dist':
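
The competition example in the updated docstrings can be checked with a small sketch. This is plain Python, not PySpark: the helper names `rank` and `dense_rank` below are hypothetical stand-ins that only mirror the semantics the docstrings describe, and the scores are made-up illustration data. It is not how Spark implements its window functions.

```python
def rank(scores):
    """Rank with gaps: tied scores share a rank, and the next rank skips ahead."""
    ordered = sorted(scores, reverse=True)
    # A score's rank is the 1-based position of its first occurrence
    # in the descending ordering, so ties all receive the same rank.
    return [ordered.index(s) + 1 for s in scores]


def dense_rank(scores):
    """Rank without gaps: tied scores share a rank, and the next rank follows on."""
    distinct = sorted(set(scores), reverse=True)
    # Ranking over distinct values only means no rank number is skipped.
    return [distinct.index(s) + 1 for s in scores]


# Hypothetical competition: one winner, three people tied for second, one more.
scores = [100, 90, 90, 90, 80]
print(rank(scores))        # [1, 2, 2, 2, 5]
print(dense_rank(scores))  # [1, 2, 2, 2, 3]
```

With three competitors tied for second, the last competitor is third under `dense_rank` and fifth under `rank`, which is the distinction the new docstring text is trying to convey.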