author Aaron Davidson <aaron@databricks.com> 2014-07-14 23:38:12 -0700
committer Patrick Wendell <pwendell@gmail.com> 2014-07-14 23:38:24 -0700
commit 0e2727959a4c2eac41bb6ec70054a1e467637099
tree 95d4339c8e7561ae0425e309bb1087fbf0005499 /sql/hive
parent 2ec7d7ab751be67a86a048eed85bd9fd36dfaf83
Add/increase severity of warning in documentation of groupBy()
groupBy()/groupByKey() is notorious for being a very convenient API that can lead to poor performance when used incorrectly. This PR just makes it clear that users should be cautious not to rely on this API when they really want a different (more performant) one, such as reduceByKey().

(Note that one source of confusion is the name; this groupBy() is not the same as a SQL GROUP-BY, which is used for aggregation and is more similar in nature to Spark's reduceByKey().)

Author: Aaron Davidson <aaron@databricks.com>

Closes #1380 from aarondav/warning and squashes the following commits:

f60da39 [Aaron Davidson] Give better advice
d0afb68 [Aaron Davidson] Add/increase severity of warning in documentation of groupBy()

(cherry picked from commit a2aa7bebae31e1e7ec23d31aaa436283743b283b)
Signed-off-by: Patrick Wendell <pwendell@gmail.com>
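
To illustrate why the message steers users toward reduceByKey() for aggregation, here is a rough sketch in plain Python. It is a toy model, not Spark code: lists stand in for partitions, the function names `group_then_sum` and `combine_then_sum` are hypothetical, and the "shuffle" is just a count of records that would have to move between partitions. It assumes the usual behaviour that reduceByKey() combines values within each partition before shuffling, while groupByKey() ships every record.

```python
from collections import defaultdict

# Toy model only: plain lists stand in for RDD partitions.
# These helper names are hypothetical and are not part of any Spark API.

def group_then_sum(partitions):
    """Model of groupByKey() followed by a per-key sum."""
    # Every (key, value) record crosses the shuffle boundary unchanged.
    shuffled = [pair for part in partitions for pair in part]
    groups = defaultdict(list)
    for key, value in shuffled:
        groups[key].append(value)
    totals = {key: sum(values) for key, values in groups.items()}
    return totals, len(shuffled)

def combine_then_sum(partitions):
    """Model of reduceByKey(_ + _): combine inside each partition first."""
    shuffled = []
    for part in partitions:
        local = defaultdict(int)
        for key, value in part:
            local[key] += value
        # At most one record per key leaves each partition.
        shuffled.extend(local.items())
    totals = defaultdict(int)
    for key, value in shuffled:
        totals[key] += value
    return dict(totals), len(shuffled)

partitions = [
    [("a", 1), ("b", 1), ("a", 1), ("a", 1)],
    [("b", 1), ("b", 1), ("a", 1), ("b", 1)],
]

grouped_totals, grouped_shuffled = group_then_sum(partitions)
reduced_totals, reduced_shuffled = combine_then_sum(partitions)

print(grouped_totals == reduced_totals)    # True: both give the same sums
print(grouped_shuffled, reduced_shuffled)  # 8 4
```

Both paths produce the same per-key totals, but in this model the grouping path moves all 8 records while the combining path moves only 4. The gap widens as the number of records per key grows, which is the kind of cost the strengthened warning is about.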
Diffstat (limited to 'sql/hive')
0 files changed, 0 insertions, 0 deletions