author | JihongMa <linlin200605@gmail.com> | 2015-11-18 13:03:37 -0800 |
---|---|---|
committer | Xiangrui Meng <meng@databricks.com> | 2015-11-18 13:03:37 -0800 |
commit | 09ad9533d5760652de59fa4830c24cb8667958ac (patch) | |
tree | 6e6023e1d2df2ccf565f9df1bf26e82904a70363 /python/pyspark | |
parent | 7c5b641808740ba5eed05ba8204cdbaf3fc579f5 (diff) | |
[SPARK-11720][SQL][ML] Handle edge cases when count = 0 or 1 for Stats function
Return Double.NaN for mean/average when count == 0 for all numeric types that are converted to Double; the Decimal type continues to return null.
Author: JihongMa <linlin200605@gmail.com>
Closes #9705 from JihongMA/SPARK-11720.
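The rule stated in the commit message can be sketched in plain Python. This is only an illustration of the described behavior, not Spark's implementation: `sketch_mean` and its `is_decimal` flag are hypothetical names, `float("nan")` stands in for `Double.NaN`, and `None` stands in for SQL null.

```python
import math
from decimal import Decimal


def sketch_mean(values, is_decimal=False):
    """Hypothetical helper illustrating the rule in the commit message.

    An empty input cannot reveal its own type, so this sketch takes the
    column type as an explicit flag (an assumption of the sketch).
    """
    if len(values) == 0:
        # count == 0: Decimal keeps returning null, other numeric types give NaN.
        return None if is_decimal else float("nan")
    if is_decimal:
        return sum(values, Decimal(0)) / len(values)
    # Non-Decimal numeric types are converted to a double before averaging.
    return sum(float(v) for v in values) / len(values)


print(sketch_mean([2, 5]))                # mean of the two ages in the docstring example
print(math.isnan(sketch_mean([])))        # empty numeric column
print(sketch_mean([], is_decimal=True))   # empty Decimal column
```

The empty-input branch is the only part that reflects the commit message; the rest is ordinary averaging.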
Diffstat (limited to 'python/pyspark')
-rw-r--r-- | python/pyspark/sql/dataframe.py | 2 |
1 files changed, 1 insertions, 1 deletions
diff --git a/python/pyspark/sql/dataframe.py b/python/pyspark/sql/dataframe.py
index ad6ad0235a..0dd75ba7ca 100644
--- a/python/pyspark/sql/dataframe.py
+++ b/python/pyspark/sql/dataframe.py
@@ -761,7 +761,7 @@ class DataFrame(object):
         +-------+------------------+-----+
         |  count|                 2|    2|
         |   mean|               3.5| null|
-        | stddev|2.1213203435596424|  NaN|
+        | stddev|2.1213203435596424| null|
         |    min|                 2|Alice|
         |    max|                 5|  Bob|
         +-------+------------------+-----+
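For readers checking the docstring values, the sketch below reproduces the `stddev` row in plain Python. It assumes the figure for the numeric column is a sample standard deviation (n - 1 denominator), which is consistent with the two ages 2 and 5 whose mean is 3.5. `sketch_sample_stddev` is a hypothetical helper, not a PySpark function, and `None` stands in for the null shown for the string column.

```python
import math


def sketch_sample_stddev(values):
    """Hypothetical helper: sample standard deviation with an n - 1 denominator."""
    # A string column has no numeric spread; None stands in for the null
    # that the updated docstring shows in the name column.
    if any(isinstance(v, str) for v in values):
        return None
    n = len(values)
    if n < 2:
        # This sketch's own choice: the denominator n - 1 would be zero or
        # negative. The diff does not show what Spark returns in this case.
        return float("nan")
    mean = sum(values) / n
    squared_deviations = sum((v - mean) ** 2 for v in values)
    return math.sqrt(squared_deviations / (n - 1))


ages = [2, 5]
names = ["Alice", "Bob"]
print(sketch_sample_stddev(ages))   # should be close to the 2.1213203435596424 in the docstring
print(sketch_sample_stddev(names))  # None, matching the null in the changed line
```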