path: root/python/pyspark/sql/group.py
author hyukjinkwon <gurwls223@gmail.com> 2016-07-06 10:45:51 -0700
committer Reynold Xin <rxin@databricks.com> 2016-07-06 10:45:51 -0700
commit 4e14199ff740ea186eb2cec2e5cf901b58c5f90e
tree cfd7850c821e764c2243615a8fd8642d73323da1 /python/pyspark/sql/group.py
parent b1310425b30cbd711e4834d65a0accb3c5a8403a
[MINOR][PYSPARK][DOC] Fix wrongly formatted examples in PySpark documentation
## What changes were proposed in this pull request?

This PR fixes wrongly formatted examples in PySpark documentation as below:

- **`SparkSession`**

  - **Before**

    ![2016-07-06 11 34 41](https://cloud.githubusercontent.com/assets/6477701/16605847/ae939526-436d-11e6-8ab8-6ad578362425.png)

  - **After**

    ![2016-07-06 11 33 56](https://cloud.githubusercontent.com/assets/6477701/16605845/ace9ee78-436d-11e6-8923-b76d4fc3e7c3.png)

- **`Builder`**

  - **Before**

    ![2016-07-06 11 34 44](https://cloud.githubusercontent.com/assets/6477701/16605844/aba60dbc-436d-11e6-990a-c87bc0281c6b.png)

  - **After**

    ![2016-07-06 1 26 37](https://cloud.githubusercontent.com/assets/6477701/16607562/586704c0-437d-11e6-9483-e0af93d8f74e.png)

This PR also fixes several similar instances across the documentation in `sql` PySpark module.

## How was this patch tested?

N/A

Author: hyukjinkwon <gurwls223@gmail.com>

Closes #14063 from HyukjinKwon/minor-pyspark-builder.
Diffstat (limited to 'python/pyspark/sql/group.py')
-rw-r--r-- python/pyspark/sql/group.py 2
1 files changed, 2 insertions, 0 deletions
diff --git a/python/pyspark/sql/group.py b/python/pyspark/sql/group.py
index a423206554..f2092f9c63 100644
--- a/python/pyspark/sql/group.py
+++ b/python/pyspark/sql/group.py
@@ -179,10 +179,12 @@ class GroupedData(object):
:param values: List of values that will be translated to columns in the output DataFrame.
# Compute the sum of earnings for each year by course with each course as a separate column
+
>>> df4.groupBy("year").pivot("course", ["dotNET", "Java"]).sum("earnings").collect()
[Row(year=2012, dotNET=15000, Java=20000), Row(year=2013, dotNET=48000, Java=30000)]
# Or without specifying column values (less efficient)
+
>>> df4.groupBy("year").pivot("course").sum("earnings").collect()
[Row(year=2012, Java=20000, dotNET=15000), Row(year=2013, Java=30000, dotNET=48000)]
"""
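
The two docstring examples above differ only in whether the pivot values are listed up front. As a rough illustration of why the second form is described as less efficient, here is a minimal pure-Python sketch of the same aggregation. It does not use Spark; `pivot_sum` and the sample `rows` are hypothetical, chosen only so that the totals match the output shown in the docstring (the actual contents of `df4` are not part of this diff), and the sorted column order for discovered values is an assumption based on that output.

```python
from collections import defaultdict

# Hypothetical sample rows: the real df4 is not shown in this diff.
# They are chosen so the sums match the docstring output.
rows = [
    {"course": "dotNET", "year": 2012, "earnings": 10000},
    {"course": "Java", "year": 2012, "earnings": 20000},
    {"course": "dotNET", "year": 2012, "earnings": 5000},
    {"course": "dotNET", "year": 2013, "earnings": 48000},
    {"course": "Java", "year": 2013, "earnings": 30000},
]


def pivot_sum(rows, group_key, pivot_key, value_key, values=None):
    """Illustrative stand-in for groupBy(group_key).pivot(pivot_key, values).sum(value_key)."""
    if values is None:
        # Without an explicit list, the distinct pivot values have to be
        # discovered with an extra pass over the data before aggregating.
        values = sorted({row[pivot_key] for row in rows})
    # One output column per pivot value; None marks "no matching rows".
    totals = defaultdict(lambda: dict.fromkeys(values))
    for row in rows:
        cell = totals[row[group_key]]
        if row[pivot_key] in values:
            cell[row[pivot_key]] = (cell[row[pivot_key]] or 0) + row[value_key]
    return [{group_key: key, **totals[key]} for key in sorted(totals)]


# Explicit values: columns appear in the order given.
with_values = pivot_sum(rows, "year", "course", "earnings", ["dotNET", "Java"])

# No values: an extra pass finds them, and columns follow sorted order.
without_values = pivot_sum(rows, "year", "course", "earnings")
```

With these sample rows, `with_values` holds the same year-by-course totals as the first doctest and `without_values` the same totals as the second, with the `Java` column first.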