author: Herman van Hovell <hvanhovell@databricks.com> 2016-10-05 16:05:30 -0700
committer: Yin Huai <yhuai@databricks.com> 2016-10-05 16:05:30 -0700
commit: 5fd54b994e2078dbf0794932b4e0ffa9a9eda0c3
tree: c3578544cb4d4b4431a5debd0ce6ea7ff4334e0b /dev/run-tests.py
parent: 221b418b1c9db7b04c600b6300d18b034a4f444e
[SPARK-17758][SQL] Last returns wrong result in case of empty partition
## What changes were proposed in this pull request?
The result of the `Last` function can be wrong when the last partition processed is empty. It can return `null` instead of the expected value. For example, this can happen when we process partitions in the following order:
```
- Partition 1 [Row1, Row2]
- Partition 2 [Row3]
- Partition 3 []
```
In this case the `Last` function currently returns `null` instead of the value of `Row3`.

This PR fixes this by adding a `valueSet` flag to the `Last` function.

## How was this patch tested?
Until now, we only used end-to-end tests for `DeclarativeAggregateFunction`s. I have added an evaluator for these functions so we can test them in catalyst. I have added a `LastTestSuite` to test the `Last` aggregate function.

Author: Herman van Hovell <hvanhovell@databricks.com>

Closes #15348 from hvanhovell/SPARK-17758.
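The failure mode described above can be illustrated with a small stand-alone model. This is only a sketch in plain Python, not Spark's Scala implementation: `Buffer`, `update`, `merge_naive`, `merge_with_flag`, and `aggregate` are hypothetical names, and the assumption that the pre-fix merge simply takes the right-hand buffer's value is a simplification made for illustration. Only the `value_set` field mirrors the `valueSet` flag named in the commit message.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Optional


@dataclass(frozen=True)
class Buffer:
    """Per-partition aggregation buffer (hypothetical model, not Spark's)."""
    last: Optional[Any] = None
    value_set: bool = False  # True once at least one row has been seen


def update(buf: Buffer, row: Any) -> Buffer:
    # Each input row overwrites the stored value and marks the buffer as set.
    return Buffer(last=row, value_set=True)


def merge_naive(left: Buffer, right: Buffer) -> Buffer:
    # Assumed pre-fix behaviour: always take the right-hand value,
    # even when the right-hand partition contained no rows.
    return Buffer(last=right.last, value_set=left.value_set or right.value_set)


def merge_with_flag(left: Buffer, right: Buffer) -> Buffer:
    # With the flag: an empty right-hand buffer leaves the left one untouched.
    return right if right.value_set else left


def aggregate(partitions: List[List[Any]],
              merge: Callable[[Buffer, Buffer], Buffer]) -> Optional[Any]:
    result = Buffer()
    for partition in partitions:
        buf = Buffer()
        for row in partition:
            buf = update(buf, row)
        result = merge(result, buf)  # partitions are merged in processing order
    return result.last


# The partition order from the commit message; the last partition is empty.
partitions = [["Row1", "Row2"], ["Row3"], []]
naive_result = aggregate(partitions, merge_naive)
fixed_result = aggregate(partitions, merge_with_flag)
print(naive_result)  # None
print(fixed_result)  # Row3
```

In this model the flag is what lets the merge step distinguish "no rows seen" from "the last row was null", which a bare nullable value cannot express.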
Diffstat (limited to 'dev/run-tests.py')
0 files changed, 0 insertions, 0 deletions