Add/increase severity of warning in documentation of groupBy() - spark

author	Aaron Davidson <aaron@databricks.com>	2014-07-14 23:38:12 -0700
committer	Patrick Wendell <pwendell@gmail.com>	2014-07-14 23:38:24 -0700
commit	0e2727959a4c2eac41bb6ec70054a1e467637099 (patch)
tree	95d4339c8e7561ae0425e309bb1087fbf0005499 /sql/hive
parent	2ec7d7ab751be67a86a048eed85bd9fd36dfaf83 (diff)

Add/increase severity of warning in documentation of groupBy()

groupBy()/groupByKey() is notorious for being a very convenient API that can lead to poor performance when used incorrectly. This PR just makes it clear that users should be cautious not to rely on this API when they really want a different (more performant) one, such as reduceByKey().

(Note that one source of confusion is the name; this groupBy() is not the same as a SQL GROUP-BY, which is used for aggregation and is more similar in nature to Spark's reduceByKey().)

Author: Aaron Davidson <aaron@databricks.com>

Closes #1380 from aarondav/warning and squashes the following commits:

f60da39 [Aaron Davidson] Give better advice
d0afb68 [Aaron Davidson] Add/increase severity of warning in documentation of groupBy()

(cherry picked from commit a2aa7bebae31e1e7ec23d31aaa436283743b283b)
Signed-off-by: Patrick Wendell <pwendell@gmail.com>
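
The contrast between grouping and reducing that the commit message describes can be illustrated without Spark. The following is a minimal sketch in plain Python, not Spark code and not the implementation behind either API: `group_by_key` and `reduce_by_key` are hypothetical local helpers that only mimic the shape of the results. Grouping keeps every value for a key before anything is aggregated, while reducing keeps a single running result per key.

```python
from collections import defaultdict


def group_by_key(pairs):
    """Hypothetical stand-in for groupByKey(): collect every value per key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)  # all values for a key are held at once
    return dict(groups)


def reduce_by_key(pairs, func):
    """Hypothetical stand-in for reduceByKey(): fold values per key as they arrive."""
    totals = {}
    for key, value in pairs:
        # Only one running result per key is ever stored.
        totals[key] = func(totals[key], value) if key in totals else value
    return totals


pairs = [("a", 1), ("b", 2), ("a", 3), ("b", 4), ("a", 5)]

# Aggregating via grouping: build the full per-key lists first, then sum them.
grouped = group_by_key(pairs)
sums_via_group = {key: sum(values) for key, values in grouped.items()}

# Aggregating via reducing: combine values directly, no intermediate lists.
sums_via_reduce = reduce_by_key(pairs, lambda x, y: x + y)
```

Both paths produce the same per-key sums; the difference is the intermediate state. In a distributed setting that intermediate state is what has to be moved and held per key, which is the reason the warning steers aggregation use cases toward reduceByKey().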

Diffstat (limited to 'sql/hive')

0 files changed, 0 insertions, 0 deletions

