author | Liang-Chi Hsieh <simonh@tw.ibm.com> | 2016-04-01 14:02:32 -0700 |
---|---|---|
committer | Davies Liu <davies.liu@gmail.com> | 2016-04-01 14:02:32 -0700 |
commit | 3e991dbc310a4a33eec7f3909adce50bf8268d04 (patch) | |
tree | fde4fe5795f815d06f5be22020618d52411ec0ae /sql/core/src/test/scala/org | |
parent | 1b829ce13990b40fd8d7c9efcc2ae55c4dbc861c (diff) | |
[SPARK-13674] [SQL] Add wholestage codegen support to Sample
JIRA: https://issues.apache.org/jira/browse/SPARK-13674
## What changes were proposed in this pull request?
The Sample operator does not yet support whole-stage codegen. This PR adds that support.
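For readers unfamiliar with the two sampling modes the benchmark exercises, here is a minimal plain-Scala sketch of their usual semantics. It is not the code this PR generates: it assumes that sampling without replacement keeps each row independently with probability `fraction` (Bernoulli), and that sampling with replacement emits each row a Poisson-distributed number of times with mean `fraction`. The object and method names (`SampleSketch`, `bernoulliSample`, `poissonCount`, `poissonSample`) are hypothetical.

```scala
import scala.util.Random

// Illustrative only: hypothetical helpers, not Spark's operator or its generated code.
object SampleSketch {
  // Without replacement (assumed semantics): keep each row independently
  // with probability `fraction`.
  def bernoulliSample[T](rows: Iterator[T], fraction: Double, rng: Random): Iterator[T] =
    rows.filter(_ => rng.nextDouble() < fraction)

  // Draw a Poisson(mean) count with Knuth's multiplication method.
  // Suitable for small means such as the 0.01 used in the benchmark.
  def poissonCount(mean: Double, rng: Random): Int = {
    val limit = math.exp(-mean)
    var k = 0
    var p = rng.nextDouble()
    while (p > limit) {
      k += 1
      p *= rng.nextDouble()
    }
    k
  }

  // With replacement (assumed semantics): emit each row a Poisson-distributed
  // number of times, so a row may appear zero, one, or several times.
  def poissonSample[T](rows: Iterator[T], fraction: Double, rng: Random): Iterator[T] =
    rows.flatMap(row => Iterator.fill(poissonCount(fraction, rng))(row))
}
```

Under these assumptions both modes return roughly `fraction * N` rows on average, but the with-replacement path does more work per input row, which would be consistent with it being the slower case in the benchmark.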
## How was this patch tested?
A test is added to `BenchmarkWholeStageCodegen`. In addition, all existing tests should pass.
Author: Liang-Chi Hsieh <simonh@tw.ibm.com>
Author: Liang-Chi Hsieh <viirya@gmail.com>
Closes #11517 from viirya/add-wholestage-sample.
Diffstat (limited to 'sql/core/src/test/scala/org')
-rw-r--r-- | sql/core/src/test/scala/org/apache/spark/sql/execution/BenchmarkWholeStageCodegen.scala | 25 |
1 files changed, 25 insertions, 0 deletions
diff --git a/sql/core/src/test/scala/org/apache/spark/sql/execution/BenchmarkWholeStageCodegen.scala b/sql/core/src/test/scala/org/apache/spark/sql/execution/BenchmarkWholeStageCodegen.scala
index 003d3e062e..55906793c0 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/execution/BenchmarkWholeStageCodegen.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/execution/BenchmarkWholeStageCodegen.scala
@@ -85,6 +85,31 @@ class BenchmarkWholeStageCodegen extends SparkFunSuite {
     */
   }
 
+  ignore("range/sample/sum") {
+    val N = 500 << 20
+    runBenchmark("range/sample/sum", N) {
+      sqlContext.range(N).sample(true, 0.01).groupBy().sum().collect()
+    }
+    /*
+    Westmere E56xx/L56xx/X56xx (Nehalem-C)
+    range/sample/sum:                   Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
+    -------------------------------------------------------------------------------------------
+    range/sample/sum codegen=false         53888 / 56592          9.7         102.8       1.0X
+    range/sample/sum codegen=true          41614 / 42607         12.6          79.4       1.3X
+    */
+
+    runBenchmark("range/sample/sum", N) {
+      sqlContext.range(N).sample(false, 0.01).groupBy().sum().collect()
+    }
+    /*
+    Westmere E56xx/L56xx/X56xx (Nehalem-C)
+    range/sample/sum:                   Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
+    -------------------------------------------------------------------------------------------
+    range/sample/sum codegen=false         12982 / 13384         40.4          24.8       1.0X
+    range/sample/sum codegen=true           7074 / 7383          74.1          13.5       1.8X
+    */
+  }
+
   ignore("stat functions") {
     val N = 100L << 20
 
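The Rate, Per Row, and Relative columns in the benchmark comments can be checked by hand. The sketch below assumes they are derived from the best time and from the row count `N = 500 << 20` (524,288,000 rows). Under that assumption the arithmetic matches the reported figures to the displayed precision. The helper names (`BenchmarkColumns`, `rateMPerSec`, `perRowNs`, `relative`) are hypothetical and not part of the benchmark harness.

```scala
// Illustrative arithmetic only: hypothetical helpers, not Spark's benchmark utility.
object BenchmarkColumns {
  // Row count used by the benchmark: 500 << 20 = 524,288,000.
  val N: Long = 500L << 20

  // Throughput in millions of rows per second, from a time in milliseconds.
  def rateMPerSec(n: Long, ms: Double): Double = n / (ms / 1000.0) / 1e6

  // Average cost per row in nanoseconds, from a time in milliseconds.
  def perRowNs(n: Long, ms: Double): Double = ms * 1e6 / n

  // Speedup of a run relative to the codegen=false baseline.
  def relative(baselineMs: Double, ms: Double): Double = baselineMs / ms

  def main(args: Array[String]): Unit = {
    // (label, best time in ms, baseline best time in ms), taken from the comments in the diff.
    val runs = Seq(
      ("sample(true, 0.01)  codegen=false", 53888.0, 53888.0),
      ("sample(true, 0.01)  codegen=true", 41614.0, 53888.0),
      ("sample(false, 0.01) codegen=false", 12982.0, 12982.0),
      ("sample(false, 0.01) codegen=true", 7074.0, 12982.0)
    )
    for ((label, bestMs, baselineMs) <- runs) {
      val rate = rateMPerSec(N, bestMs)
      val perRow = perRowNs(N, bestMs)
      val speedup = relative(baselineMs, bestMs)
      println(f"$label%-36s $rate%6.1f M/s $perRow%7.1f ns/row $speedup%4.1fX")
    }
  }
}
```

For example, the with-replacement baseline takes 53888 ms for 524,288,000 rows, which is about 102.8 ns per row, and the codegen run is 53888 / 41614, or about 1.3 times faster.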