author | Davies Liu <davies@databricks.com> | 2016-02-08 14:09:14 -0800 |
---|---|---|
committer | Davies Liu <davies.liu@gmail.com> | 2016-02-08 14:09:14 -0800 |
commit | ff0af0ddfa4d198b203c3a39f8532cfbd4f4e027 (patch) | |
tree | bed882aeeb85eeb67562b1d2c58390d257896bca /project/MimaExcludes.scala | |
parent | 37bc203c8dd5022cb11d53b697c28a737ee85bcc (diff) | |
[SPARK-13095] [SQL] improve performance for broadcast join with dimension table
This PR improves the performance of broadcast joins with dimension tables, which are common in data warehouses.
If the join key fits in a long, we use a special API, `get(Long)`, to get the rows from the HashedRelation.
If the HashedRelation has only unique keys, we use a special API, `getValue(Long)` or `getValue(InternalRow)`.
If the keys fit in a long and are also dense, we use an array of UnsafeRow instead of a hash map.
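
The three cases above can be illustrated with a minimal, self-contained sketch. This is not the code from this patch: `DimJoinSketch`, `LongKeyedRelation`, `MapRelation`, `DenseRelation`, and `build` are hypothetical names, `Row` is a plain `String` standing in for UnsafeRow, and the density threshold (key range at most twice the number of keys) is an assumption chosen only for illustration.

```scala
// Hypothetical, simplified stand-ins; not Spark's actual HashedRelation classes.
object DimJoinSketch {
  type Row = String // stand-in for UnsafeRow

  trait LongKeyedRelation {
    // All rows matching the key, or null when there is no match.
    def get(key: Long): Seq[Row]
    // The single row for the key (first match if keys repeat), or null.
    def getValue(key: Long): Row
  }

  // General case: a hash map from long key to the rows that share it.
  final class MapRelation(rows: Seq[(Long, Row)]) extends LongKeyedRelation {
    private val map: Map[Long, Seq[Row]] =
      rows.groupBy(_._1).map { case (k, kvs) => k -> kvs.map(_._2) }
    def get(key: Long): Seq[Row] = map.get(key).orNull
    def getValue(key: Long): Row = map.get(key).map(_.head).orNull
  }

  // Unique, dense keys: a plain array indexed by (key - minKey), no hashing.
  final class DenseRelation(minKey: Long, values: Array[Row])
      extends LongKeyedRelation {
    def getValue(key: Long): Row = {
      if (key < minKey) null
      else {
        val idx = key - minKey
        // idx < 0 can only happen on overflow, which also means "out of range".
        if (idx >= 0 && idx < values.length) values(idx.toInt) else null
      }
    }
    def get(key: Long): Seq[Row] = {
      val v = getValue(key)
      if (v == null) null else Seq(v)
    }
  }

  // Pick the array layout only when keys are unique and dense enough.
  def build(rows: Seq[(Long, Row)]): LongKeyedRelation = {
    if (rows.isEmpty) new MapRelation(rows)
    else {
      val keys = rows.map(_._1)
      val unique = keys.distinct.size == keys.size
      val range = BigInt(keys.max) - BigInt(keys.min) + 1
      // Assumed threshold: the key range is at most 2x the number of keys.
      if (unique && range <= BigInt(keys.size) * 2) {
        val arr = new Array[Row](range.toInt)
        rows.foreach { case (k, r) => arr((k - keys.min).toInt) = r }
        new DenseRelation(keys.min, arr)
      } else new MapRelation(rows)
    }
  }
}
```

For example, `DimJoinSketch.build(Seq(10L -> "a", 11L -> "b", 13L -> "c"))` yields the array-backed relation, where `getValue(11L)` is a single array read, while a build side with duplicate or widely spread keys falls back to the hash map.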
TODO: will do cleanup
Author: Davies Liu <davies@databricks.com>
Closes #11065 from davies/gen_dim.
Diffstat (limited to 'project/MimaExcludes.scala')
0 files changed, 0 insertions, 0 deletions