[SPARK-11489][SQL] Only include common first order statistics in GroupedData - spark

author	Reynold Xin <rxin@databricks.com>	2015-11-03 16:27:56 -0800
committer	Reynold Xin <rxin@databricks.com>	2015-11-03 16:27:56 -0800
commit	5051262d4ca6a2c529c9b1ba86d54cce60a7af17 (patch)
tree	f7c89be1ccc400a803aaa136926b84405a7e43e1 /ec2/spark-ec2
parent	53e9cee3e4e845d1f875c487215c0f22503347b1 (diff)

We added a bunch of higher order statistics such as skewness and kurtosis to GroupedData. I don't think they are common enough to justify being listed, since users can always use the normal statistics aggregate functions.

That is to say, after this change, we won't support

```scala
df.groupBy("key").kurtosis("colA", "colB")
```

However, we will still support

```scala
df.groupBy("key").agg(kurtosis(col("colA")), kurtosis(col("colB")))
```

Author: Reynold Xin <rxin@databricks.com>

Closes #9446 from rxin/SPARK-11489.

Diffstat (limited to 'ec2/spark-ec2')

0 files changed, 0 insertions, 0 deletions

