path: root/sql/catalyst/pom.xml
author: Davies Liu <davies@databricks.com> 2016-03-28 13:07:32 -0700
committer: Davies Liu <davies.liu@gmail.com> 2016-03-28 13:07:32 -0700
commit: d7b58f1461f71ee3c028360eef0ffedd17d6a076
tree: 58ddca8bb29534ecb77446e6706f33d885e01bd4 /sql/catalyst/pom.xml
parent: 600c0b69cab4767e8e5a6f4284777d8b9d4bd40e
[SPARK-14052] [SQL] build a BytesToBytesMap directly in HashedRelation
## What changes were proposed in this pull request?

Currently, for keys that cannot fit within a long, we build a hash map for UnsafeHashedRelation, which is converted to a BytesToBytesMap after serialization and deserialization. We should build a BytesToBytesMap directly for better memory efficiency.

In order to do that, BytesToBytesMap should support multiple (K, V) pairs with the same K. `Location.putNewKey()` is renamed to `Location.append()`, which can append multiple values for the same key (same Location). `Location.nextValue()` is added to find the next value for the same key.

## How was this patch tested?

Existing tests. Added a benchmark for broadcast hash join with duplicated keys.

Author: Davies Liu <davies@databricks.com>

Closes #11870 from davies/map2.
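
To illustrate the append-and-iterate idea described above, here is a minimal, self-contained sketch. It is not Spark's `BytesToBytesMap`: the class `ToyMultiMap`, its `String` keys and values, the `lookup()` and `getAll()` helpers, and the newest-first ordering are all assumptions made for this example. Only the method names `append()` and `nextValue()` are borrowed from the description.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy append-only multimap. Illustrative only; NOT Spark's BytesToBytesMap. */
public class ToyMultiMap {
    // Flat record storage: values.get(i) is a value, and next.get(i) is the index
    // of the value previously appended for the same key (-1 at the end of the chain).
    private final List<String> values = new ArrayList<>();
    private final List<Integer> next = new ArrayList<>();
    // Maps each key to the index of its most recently appended record.
    private final Map<String, Integer> head = new HashMap<>();

    /** Cursor over the chain of values stored for one key (hypothetical API). */
    public final class Location {
        private final String key;
        private int current; // index of the record the cursor points at, or -1

        private Location(String key) {
            this.key = key;
            this.current = head.getOrDefault(key, -1);
        }

        public boolean isDefined() { return current >= 0; }

        public String getValue() { return values.get(current); }

        /** Appends a value for this key and links it to the existing chain. */
        public void append(String value) {
            Integer previousHead = head.get(key);
            values.add(value);
            next.add(previousHead == null ? -1 : previousHead);
            current = values.size() - 1;
            head.put(key, current);
        }

        /** Moves to the next value for the same key; false when the chain ends. */
        public boolean nextValue() {
            if (current < 0) return false;
            current = next.get(current);
            return current >= 0;
        }
    }

    public Location lookup(String key) { return new Location(key); }

    /** Collects all values for a key, newest first in this sketch. */
    public List<String> getAll(String key) {
        List<String> out = new ArrayList<>();
        Location loc = lookup(key);
        if (loc.isDefined()) {
            do { out.add(loc.getValue()); } while (loc.nextValue());
        }
        return out;
    }

    public static void main(String[] args) {
        ToyMultiMap map = new ToyMultiMap();
        map.lookup("k1").append("a");
        map.lookup("k1").append("b"); // duplicate key: second value is chained
        map.lookup("k2").append("c");
        System.out.println(map.getAll("k1"));      // prints [b, a]
        System.out.println(map.getAll("k2"));      // prints [c]
        System.out.println(map.getAll("missing")); // prints []
    }
}
```

Each key keeps a single slot in the sketch, and duplicates cost only one extra record plus a link. This is the kind of saving a join with duplicated keys would benefit from, though the real map stores raw bytes in memory pages and its layout may differ.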
Diffstat (limited to 'sql/catalyst/pom.xml')
0 files changed, 0 insertions, 0 deletions