author | Yin Huai <yhuai@databricks.com> | 2015-08-03 00:23:08 -0700 |
---|---|---|
committer | Reynold Xin <rxin@databricks.com> | 2015-08-03 00:23:08 -0700 |
commit | 1ebd41b141a95ec264bd2dd50f0fe24cd459035d (patch) | |
tree | 34452a2922a4bcd00d600ad8fa0b79df09f243bb /launcher | |
parent | 98d6d9c7a996f5456eb2653bb96985a1a05f4ce1 (diff) | |
[SPARK-9240] [SQL] Hybrid aggregate operator using unsafe row
This PR adds a base aggregation iterator, `AggregationIterator`, which is used to create `SortBasedAggregationIterator` (for sort-based aggregation) and `UnsafeHybridAggregationIterator` (which first tries hash-based aggregation and falls back to sort-based aggregation, using an external sorter, if we cannot allocate memory for the map). With these two iterators, we no longer need the existing iterators, so I am removing them. Also, we can use a single physical `Aggregate` operator, which internally determines which iterator to use.
https://issues.apache.org/jira/browse/SPARK-9240
Author: Yin Huai <yhuai@databricks.com>
Closes #7813 from yhuai/AggregateOperator and squashes the following commits:
e317e2b [Yin Huai] Remove unnecessary change.
74d93c5 [Yin Huai] Merge remote-tracking branch 'upstream/master' into AggregateOperator
ba6afbc [Yin Huai] Add a little bit more comments.
c9cf3b6 [Yin Huai] update
0f1b06f [Yin Huai] Remove unnecessary code.
21fd15f [Yin Huai] Remove unnecessary change.
964f88b [Yin Huai] Implement fallback strategy.
b1ea5cf [Yin Huai] wip
7fcbd87 [Yin Huai] Add a flag to control what iterator to use.
533d5b2 [Yin Huai] Prepare for fallback!
33b7022 [Yin Huai] wip
bd9282b [Yin Huai] UDAFs now support UnsafeRow.
f52ee53 [Yin Huai] wip
3171f44 [Yin Huai] wip
d2c45a0 [Yin Huai] wip
f60cc83 [Yin Huai] Also check input schema.
af32210 [Yin Huai] Check iter.hasNext before we create an iterator because the constructor of the iterator will read at least one row from a non-empty input iter.
299008c [Yin Huai] First round cleanup.
3915bac [Yin Huai] Create a base iterator class for aggregation iterators and add the initial version of the hybrid iterator.
Diffstat (limited to 'launcher')
0 files changed, 0 insertions, 0 deletions