[SPARK-13897][SQL] RelationalGroupedDataset and KeyValueGroupedDataset

## What changes were proposed in this pull request?

Previously, Dataset.groupBy returned a GroupedData, and Dataset.groupByKey returned a GroupedDataset. The names are very similar and do not convey the real difference between the two. This pull request renames GroupedData to RelationalGroupedDataset and GroupedDataset to KeyValueGroupedDataset.

Assume we are grouping by some keys (K). groupByKey is a key-value style group by, in which the schema of the returned dataset is a tuple of just two fields: key and value. groupBy, on the other hand, is a relational style group by, in which the schema of the returned dataset is flattened and contains |K| + |V| fields.

This pull request also removes the experimental tag from RelationalGroupedDataset. It has been with DataFrame since 1.3, and we have enough confidence now to stabilize it.

## How was this patch tested?

This is a rename to improve API understandability. It should be covered by all existing tests.

Author: Reynold Xin <rxin@databricks.com>

Closes #11841 from rxin/SPARK-13897.
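The difference between the two grouping styles is easiest to see in the result schemas. The following is a minimal sketch, not code from this patch: it assumes a Spark 2.x build with `SparkSession` available, and the `Sale` case class, the `GroupingStyles` object, and the sample rows are hypothetical names introduced only for illustration.

```scala
import org.apache.spark.sql.{DataFrame, Dataset, SparkSession}

// Hypothetical record type, used only for this illustration.
case class Sale(region: String, product: String, amount: Long)

object GroupingStyles {
  // Relational style: groupBy yields a RelationalGroupedDataset. Aggregating it
  // produces a flat row with one column per grouping key plus the aggregate.
  def relationalTotals(sales: Dataset[Sale]): DataFrame =
    sales.groupBy("region", "product").sum("amount")

  // Key-value style: groupByKey yields a KeyValueGroupedDataset. The result
  // pairs the whole key (here a tuple) with a single value.
  def keyValueCounts(sales: Dataset[Sale]): Dataset[((String, String), Long)] = {
    import sales.sparkSession.implicits._ // encoder for the tuple key
    sales.groupByKey(s => (s.region, s.product)).count()
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("grouping-styles")
      .getOrCreate()
    import spark.implicits._

    val sales = Seq(
      Sale("east", "widget", 10L),
      Sale("east", "widget", 5L),
      Sale("west", "gadget", 7L)
    ).toDS()

    // Two key columns plus one aggregate column.
    relationalTotals(sales).printSchema()
    // Two top-level fields: the key and the value.
    keyValueCounts(sales).printSchema()

    spark.stop()
  }
}
```

With two grouping keys, the relational result has three top-level columns, while the key-value result keeps exactly two fields however many components the key has.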
author: Reynold Xin <rxin@databricks.com> 2016-03-19 11:23:14 -0700
committer: Reynold Xin <rxin@databricks.com> 2016-03-19 11:23:14 -0700
commit: dcaa016610ac2c11d7dd01803f3515b02ab32e64 (patch)
tree: 7d03000193cdcc5100fd7198e143680b2e5882e5 /project/MimaExcludes.scala
parent: 2082a49569cb5d900e318af9da1027821dfe93bc (diff)
download: spark-dcaa016610ac2c11d7dd01803f3515b02ab32e64.tar.gz
spark-dcaa016610ac2c11d7dd01803f3515b02ab32e64.tar.bz2
spark-dcaa016610ac2c11d7dd01803f3515b02ab32e64.zip
1 file changed, 1 insertion, 0 deletions
diff --git a/project/MimaExcludes.scala b/project/MimaExcludes.scala
index b38eec34a0..9a091bf6d3 100644
--- a/project/MimaExcludes.scala
+++ b/project/MimaExcludes.scala
@@ -315,6 +315,7 @@ object MimaExcludes {
         ProblemFilters.exclude[MissingClassProblem]("org.apache.spark.sql.DataFrame"),
         ProblemFilters.exclude[MissingClassProblem]("org.apache.spark.sql.DataFrame$"),
         ProblemFilters.exclude[MissingClassProblem]("org.apache.spark.sql.LegacyFunctions"),
+        ProblemFilters.exclude[MissingClassProblem]("org.apache.spark.sql.GroupedDataset"),
 
         ProblemFilters.exclude[IncompatibleMethTypeProblem]("org.apache.spark.mllib.evaluation.MultilabelMetrics.this"),
         ProblemFilters.exclude[IncompatibleResultTypeProblem]("org.apache.spark.ml.classification.LogisticRegressionSummary.predictions"),