path: root/python/pyspark/ml/regression.py
author: Wenchen Fan <wenchen@databricks.com> 2016-04-15 12:10:00 +0800
committer: Wenchen Fan <wenchen@databricks.com> 2016-04-15 12:10:00 +0800
commit: 297ba3f1b49cc37d9891a529142c553e0a5e2d62 (patch)
tree: 2a61d490100de8b609a15fb52561524dddaca0e8 /python/pyspark/ml/regression.py
parent: b5c60bcdca3bcace607b204a6c196a5386e8a896 (diff)
[SPARK-14275][SQL] Reimplement TypedAggregateExpression to DeclarativeAggregate
## What changes were proposed in this pull request?

`ExpressionEncoder` is just a container for serialization and deserialization expressions. We can use these expressions to build `TypedAggregateExpression` directly, so that it fits into `DeclarativeAggregate`, which is more efficient.

One trick: each buffer serializer expression references the result object of the serialization and function call. To avoid recalculating this result object, we can serialize the buffer object to a single struct field, so that a special `Expression` can evaluate the result object only once.

## How was this patch tested?

Existing tests.

Author: Wenchen Fan <wenchen@databricks.com>

Closes #12067 from cloud-fan/typed_udaf.
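
The "evaluate the result object only once" trick described in the commit message can be illustrated with a toy model. This is a minimal sketch in plain Python, not Spark's Catalyst code: `Buffer`, `CountingUpdate`, `add_value`, `update_per_field`, and `update_as_struct` are hypothetical names made up for the illustration. It only shows why reading every buffer field from one shared result is cheaper than letting each field expression recompute the result.

```python
from typing import NamedTuple


class Buffer(NamedTuple):
    """Hypothetical aggregation buffer with two fields."""
    total: int
    count: int


class CountingUpdate:
    """Wraps an update function and counts how often it is evaluated."""

    def __init__(self, fn):
        self.fn = fn
        self.calls = 0

    def __call__(self, buf, value):
        self.calls += 1
        return self.fn(buf, value)


def add_value(buf, value):
    # Hypothetical user function: fold one input value into the buffer.
    return Buffer(buf.total + value, buf.count + 1)


def update_per_field(update, buf, value):
    # Naive approach: every field expression re-evaluates the function call.
    return tuple(getattr(update(buf, value), name) for name in Buffer._fields)


def update_as_struct(update, buf, value):
    # Struct approach: evaluate the result once, then read each field from it.
    result = update(buf, value)
    return tuple(getattr(result, name) for name in Buffer._fields)


naive = CountingUpdate(add_value)
naive_fields = update_per_field(naive, Buffer(0, 0), 5)

once = CountingUpdate(add_value)
struct_fields = update_as_struct(once, Buffer(0, 0), 5)
```

Both helpers produce the same field values, but in this sketch `naive.calls` grows with the number of buffer fields while `once.calls` stays at one per update, which is the saving the single struct field is meant to capture.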
Diffstat (limited to 'python/pyspark/ml/regression.py')
0 files changed, 0 insertions, 0 deletions