author | Yin Huai <yhuai@databricks.com> | 2015-04-11 19:26:15 -0700 |
---|---|---|
committer | Michael Armbrust <michael@databricks.com> | 2015-04-11 19:26:15 -0700 |
commit | 6d4e854ffbd7dee9a3cd7b44a00fd9c0e551f5b8 (patch) | |
tree | 8b0c79447539078108b8ff10e8195f5286d90b96 /examples | |
parent | d2383fb5ffafd6b3a56b1ee6e0e035594473e2c8 (diff) | |
[SPARK-6367][SQL] Use the proper data type for those expressions that are hijacking existing data types.
This PR adds internal user-defined types (UDTs) for expressions that are hijacking existing data types.
The following UDTs are added:
* `HyperLogLogUDT` (`BinaryType` as the SQL type) for `ApproxCountDistinctPartition`.
* `OpenHashSetUDT` (`ArrayType` as the SQL type) for `CollectHashSet`, `NewSet`, `AddItemToSet`, and `CombineSets`.
I am also adding more unit tests for aggregation with code generation enabled.
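The idea behind the two UDTs listed above can be illustrated with a minimal, self-contained sketch. This is not the code in this patch: `DataType`, `BinaryType`, `ArrayType`, and `UserDefinedType` below are local stand-ins rather than Spark's classes, and `ToySketch`, `ToySketchUDT`, and `IntSetUDT` are hypothetical names chosen for illustration. The sketch assumes only that a UDT names the SQL type used for storage and converts between that representation and the user-facing object.

```scala
object UdtSketch {
  // Local stand-ins for a SQL type system; these are NOT Spark's classes.
  sealed trait DataType
  case object BinaryType extends DataType
  case object IntegerType extends DataType
  final case class ArrayType(elementType: DataType) extends DataType

  // A UDT pairs a user-facing class with the SQL type used to store it.
  abstract class UserDefinedType[T] extends DataType {
    def sqlType: DataType
    def serialize(obj: T): Any
    def deserialize(datum: Any): T
  }

  // Hypothetical sketch object whose state is a byte array of registers.
  final class ToySketch(val registers: Array[Byte])

  // Declares its own type, stored as binary, instead of posing as BinaryType.
  case object ToySketchUDT extends UserDefinedType[ToySketch] {
    override def sqlType: DataType = BinaryType
    override def serialize(obj: ToySketch): Any = obj.registers.clone()
    override def deserialize(datum: Any): ToySketch = datum match {
      case bytes: Array[Byte] => new ToySketch(bytes.clone())
      case other =>
        throw new IllegalArgumentException(s"expected Array[Byte], got: $other")
    }
  }

  // Hypothetical set-valued type, stored as an array of integers.
  case object IntSetUDT extends UserDefinedType[Set[Int]] {
    override def sqlType: DataType = ArrayType(IntegerType)
    override def serialize(obj: Set[Int]): Any = obj.toSeq.sorted
    override def deserialize(datum: Any): Set[Int] = datum match {
      case items: Seq[_] => items.map(_.asInstanceOf[Int]).toSet
      case other =>
        throw new IllegalArgumentException(s"expected Seq, got: $other")
    }
  }

  def main(args: Array[String]): Unit = {
    // Round-trip both types through their storage representation.
    val sketch = new ToySketch(Array[Byte](1, 0, 3))
    val restored = ToySketchUDT.deserialize(ToySketchUDT.serialize(sketch))
    println(restored.registers.sameElements(sketch.registers))

    val set = Set(3, 1, 2)
    println(IntSetUDT.deserialize(IntSetUDT.serialize(set)) == set)
  }
}
```

With a dedicated type object, an expression can report something like `ToySketchUDT` as its data type, so code that inspects types can tell an intermediate sketch apart from an ordinary binary column.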
JIRA: https://issues.apache.org/jira/browse/SPARK-6367
Author: Yin Huai <yhuai@databricks.com>
Closes #5094 from yhuai/expressionType and squashes the following commits:
8bcd11a [Yin Huai] Return types.
61a1d66 [Yin Huai] Merge remote-tracking branch 'upstream/master' into expressionType
e8b4599 [Yin Huai] Merge remote-tracking branch 'upstream/master' into expressionType
2753156 [Yin Huai] Ignore aggregations having sum functions for now.
b5eb259 [Yin Huai] Case object for HyperLogLog type.
00ebdbd [Yin Huai] deserialize/serialize.
54b87ae [Yin Huai] Add UDTs for expressions that return HyperLogLog and OpenHashSet.
Diffstat (limited to 'examples')
0 files changed, 0 insertions, 0 deletions