author | Sean Zhong <seanzhong@databricks.com> | 2016-08-25 16:36:16 -0700
---|---|---
committer | Yin Huai <yhuai@databricks.com> | 2016-08-25 16:36:16 -0700
commit | d96d1515638da20b594f7bfe3cfdb50088f25a04 (patch) |
tree | 69e7803b4f49d0ed03073795843eb95d8f63529f /sbin/spark-config.sh |
parent | 9b5a1d1d53bc4412de3cbc86dc819b0c213229a8 (diff) |
[SPARK-17187][SQL] Supports using arbitrary Java object as internal aggregation buffer object
## What changes were proposed in this pull request?
This PR introduces an abstract class `TypedImperativeAggregate` so that an aggregation function extending `TypedImperativeAggregate` can use an **arbitrary** user-defined Java object as its intermediate aggregation buffer.
**This has several advantages:**
1. It can support a larger class of aggregation functions. For example, it makes it much easier to implement an aggregation function such as `percentile_approx`, which has a complex aggregation buffer definition.
2. It avoids serializing and deserializing the buffer on every call to `update` or `merge`; conversion between the domain-specific aggregation object and Spark SQL's internal storage format happens only when the buffer must actually be stored.
3. It makes it easier to integrate with existing monoid libraries such as Algebird, supporting more aggregation functions with high performance.
Please see `org.apache.spark.sql.TypedImperativeAggregateSuite.TypedMaxAggregate` for an example of how to define a `TypedImperativeAggregate` aggregation function.
Please see the Javadoc of `TypedImperativeAggregate` and JIRA ticket SPARK-17187 for more information.
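To illustrate the shape of the contract without pulling in Spark's catalyst internals, here is a minimal standalone sketch in plain Scala. The class and method names mirror the idea described above (buffer creation, update, merge, eval, plus serialization hooks invoked only at storage boundaries), but they are simplified assumptions for illustration, not Spark's exact API; the real class operates on `InternalRow` inputs.

```scala
import java.nio.ByteBuffer

// Hypothetical, simplified stand-in for the abstract class described above.
// The buffer type T can be any Java/Scala object.
abstract class TypedAggregateSketch[T] {
  def createAggregationBuffer(): T          // fresh buffer for a new group
  def update(buffer: T, input: Int): T      // fold one input value into the buffer
  def merge(buffer: T, other: T): T         // combine two partial buffers
  def eval(buffer: T): Any                  // produce the final result
  def serialize(buffer: T): Array[Byte]     // called only when the buffer must be stored
  def deserialize(bytes: Array[Byte]): T
}

// A typed max over Ints, analogous in spirit to the PR's TypedMaxAggregate test.
class TypedMax extends TypedAggregateSketch[Option[Int]] {
  override def createAggregationBuffer(): Option[Int] = None

  override def update(buffer: Option[Int], input: Int): Option[Int] =
    Some(buffer.fold(input)(b => math.max(b, input)))

  override def merge(buffer: Option[Int], other: Option[Int]): Option[Int] =
    (buffer, other) match {
      case (Some(a), Some(b)) => Some(math.max(a, b))
      case _                  => buffer.orElse(other)
    }

  override def eval(buffer: Option[Int]): Any = buffer.getOrElse(null)

  // Serialization happens only at storage boundaries, not per update/merge call.
  override def serialize(buffer: Option[Int]): Array[Byte] = {
    val bb = ByteBuffer.allocate(8)
    bb.putInt(if (buffer.isDefined) 1 else 0)   // presence flag
    bb.putInt(buffer.getOrElse(0))              // payload (ignored when absent)
    bb.array()
  }

  override def deserialize(bytes: Array[Byte]): Option[Int] = {
    val bb = ByteBuffer.wrap(bytes)
    if (bb.getInt() == 1) Some(bb.getInt()) else None
  }
}
```

Note how `update` and `merge` work directly on the typed buffer object; `serialize`/`deserialize` are kept separate so the engine can defer format conversion, which is the cost advantage listed in point 2 above.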
## How was this patch tested?
Unit tests.
Author: Sean Zhong <seanzhong@databricks.com>
Author: Yin Huai <yhuai@databricks.com>
Closes #14753 from clockfly/object_aggregation_buffer_try_2.