author: Liang-Chi Hsieh <viirya@gmail.com> 2016-03-01 08:43:02 -0800
committer: Davies Liu <davies.liu@gmail.com> 2016-03-01 08:43:02 -0800
commit: c43899a04e4de18e238a1761bf4fe9f54e182320
tree: 34c22b64f5034e14f7fe33b2975469ee0e09d2f5 /mllib/src/test
parent: 12a2a57e1af21da0aa4275971365d76a8fc84a43
[SPARK-13511] [SQL] Add wholestage codegen for limit
JIRA: https://issues.apache.org/jira/browse/SPARK-13511

## What changes were proposed in this pull request?

The current limit operator doesn't support wholestage codegen. This PR adds support for it.

In the `doConsume` of `GlobalLimit` and `LocalLimit`, we use a count term to count the processed rows. Once the row count reaches the limit, we set `stopEarly`, a variable newly added to `BufferedRowIterator` in this PR, to `true` to indicate that we want to stop processing the remaining rows. Then, when the wholestage codegen framework checks `shouldStop()`, it stops the processing of the row iterator.

Before this change, the executed plan for the query `sqlContext.range(N).limit(100).groupBy().sum()` is:

    TungstenAggregate(key=[], functions=[(sum(id#5L),mode=Final,isDistinct=false)], output=[sum(id)#6L])
    +- TungstenAggregate(key=[], functions=[(sum(id#5L),mode=Partial,isDistinct=false)], output=[sum#9L])
       +- GlobalLimit 100
          +- Exchange SinglePartition, None
             +- LocalLimit 100
                +- Range 0, 1, 1, 524288000, [id#5L]

After adding wholestage codegen support:

    WholeStageCodegen
    :  +- TungstenAggregate(key=[], functions=[(sum(id#40L),mode=Final,isDistinct=false)], output=[sum(id)#41L])
    :     +- TungstenAggregate(key=[], functions=[(sum(id#40L),mode=Partial,isDistinct=false)], output=[sum#44L])
    :        +- GlobalLimit 100
    :           +- INPUT
    +- Exchange SinglePartition, None
       +- WholeStageCodegen
          :  +- LocalLimit 100
          :     +- Range 0, 1, 1, 524288000, [id#40L]

## How was this patch tested?

A test is added to BenchmarkWholeStageCodegen.

Author: Liang-Chi Hsieh <viirya@gmail.com>

Closes #11391 from viirya/wholestage-limit.
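
The count-and-stop mechanism described in the commit message can be illustrated with a small standalone sketch. This is not the code from the patch and does not use any Spark classes: `LimitSketch`, `RowBuffer`, `RangeWithLimit`, `rowsRead`, and `sumWithLimit` are hypothetical names chosen for illustration, and only `stopEarly`, `shouldStop()`, and the idea of a count term come from the message. The sketch assumes the producer loop checks `shouldStop()` before each row and that the limit step sets the flag on the first row past the limit.

```java
import java.util.ArrayList;
import java.util.List;

class LimitSketch {
    /** Hypothetical stand-in for a buffered row iterator; NOT Spark's BufferedRowIterator. */
    abstract static class RowBuffer {
        protected final List<Long> output = new ArrayList<>();
        // Flag named after the one in the commit message; set once the limit is hit.
        protected boolean stopEarly = false;

        // The producer loop polls this to decide whether to keep reading rows.
        protected boolean shouldStop() {
            return stopEarly;
        }

        protected abstract void processNext();
    }

    /** A range producer fused with a limit step, loosely mimicking one generated stage. */
    static class RangeWithLimit extends RowBuffer {
        private final long end;
        private final int limit;
        private long next = 0;
        private int count = 0;   // the "count term" from the commit message
        long rowsRead = 0;       // illustration only: how many rows the producer touched

        RangeWithLimit(long end, int limit) {
            this.end = end;
            this.limit = limit;
        }

        @Override
        protected void processNext() {
            // Producer: stop as soon as the downstream limit asks for it.
            while (next < end && !shouldStop()) {
                long row = next++;
                rowsRead++;
                consume(row);
            }
        }

        // Limit step: pass rows through until the count reaches the limit, then raise the flag.
        private void consume(long row) {
            if (count < limit) {
                count++;
                output.add(row);
            } else {
                stopEarly = true;
            }
        }
    }

    /** Sums the first `limit` values of the range [0, n). */
    static long sumWithLimit(long n, int limit) {
        RangeWithLimit stage = new RangeWithLimit(n, limit);
        stage.processNext();
        long sum = 0;
        for (long v : stage.output) {
            sum += v;
        }
        return sum;
    }

    public static void main(String[] args) {
        RangeWithLimit stage = new RangeWithLimit(524288000L, 100);
        stage.processNext();
        System.out.println("rows emitted: " + stage.output.size());
        System.out.println("rows read:    " + stage.rowsRead);
        System.out.println("sum:          " + sumWithLimit(524288000L, 100));
    }
}
```

In this sketch the producer reads 101 rows rather than all 524288000, because the flag is raised on the first row past the limit and the loop exits at its next `shouldStop()` check. Whether the real generated code reads exactly one extra row depends on details the commit message does not state.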
Diffstat (limited to 'mllib/src/test')
0 files changed, 0 insertions, 0 deletions