author | Wenchen Fan <wenchen@databricks.com> | 2016-01-25 17:58:11 -0800 |
---|---|---|
committer | Reynold Xin <rxin@databricks.com> | 2016-01-25 17:58:11 -0800 |
commit | 109061f7ad27225669cbe609ec38756b31d4e1b9 (patch) | |
tree | 766857a292ba2988e3afb151867f7b4342291238 /.gitignore | |
parent | be375fcbd200fb0e210b8edcfceb5a1bcdbba94b (diff) | |
[SPARK-12936][SQL] Initial bloom filter implementation
This PR adds an initial implementation of a bloom filter to the newly added sketch module. The implementation is based on the [`BloomFilter` class in Guava](https://code.google.com/p/guava-libraries/source/browse/guava/src/com/google/common/hash/BloomFilter.java).
Some differences from the design doc:
* expose `bitSize` instead of `sizeInBytes` to the user.
* always require the `expectedInsertions` parameter when creating a bloom filter.
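For illustration only, here is a minimal sketch of how a Guava-style bloom filter can derive its bit size from `expectedInsertions`. It is not the code added by this PR: the class name `SimpleBloomFilter`, the `fpp` (target false positive probability) parameter, the restriction to `long` items, and the `mix64` hash are all assumptions made for the example. The sizing formulas are the standard ones for bloom filters.

```java
import java.util.BitSet;

/** Hypothetical, simplified bloom filter; NOT the class introduced by this commit. */
public final class SimpleBloomFilter {
    private final BitSet bits;
    private final int bitSize;
    private final int numHashFunctions;

    private SimpleBloomFilter(int bitSize, int numHashFunctions) {
        this.bits = new BitSet(bitSize);
        this.bitSize = bitSize;
        this.numHashFunctions = numHashFunctions;
    }

    /** expectedInsertions is mandatory; fpp is an assumed extra parameter in (0, 1). */
    public static SimpleBloomFilter create(long expectedInsertions, double fpp) {
        if (expectedInsertions <= 0) {
            throw new IllegalArgumentException("expectedInsertions must be positive");
        }
        if (!(fpp > 0.0 && fpp < 1.0)) {
            throw new IllegalArgumentException("fpp must be in (0, 1)");
        }
        double ln2 = Math.log(2);
        // Standard sizing: m = -n * ln(p) / (ln 2)^2 bits.
        long m = (long) Math.ceil(-expectedInsertions * Math.log(fpp) / (ln2 * ln2));
        int bitSize = (int) Math.min(Integer.MAX_VALUE, Math.max(1L, m));
        // Standard hash count: k = (m / n) * ln 2, at least 1.
        int k = Math.max(1, (int) Math.round((double) bitSize / expectedInsertions * ln2));
        return new SimpleBloomFilter(bitSize, k);
    }

    /** Size of the filter in bits (exposed instead of a size in bytes). */
    public int bitSize() {
        return bitSize;
    }

    public void put(long item) {
        long h = mix64(item);
        int h1 = (int) h;
        int h2 = (int) (h >>> 32);
        // Double hashing: the i-th index is derived from h1 + i * h2.
        for (int i = 1; i <= numHashFunctions; i++) {
            bits.set(Math.floorMod(h1 + i * h2, bitSize));
        }
    }

    /** May return a false positive, but never a false negative. */
    public boolean mightContain(long item) {
        long h = mix64(item);
        int h1 = (int) h;
        int h2 = (int) (h >>> 32);
        for (int i = 1; i <= numHashFunctions; i++) {
            if (!bits.get(Math.floorMod(h1 + i * h2, bitSize))) {
                return false;
            }
        }
        return true;
    }

    // Assumed stand-in hash (a 64-bit finalizer-style mixer); a real
    // implementation would typically use a stronger hash function.
    private static long mix64(long z) {
        z = (z ^ (z >>> 30)) * 0xBF58476D1CE4E5B9L;
        z = (z ^ (z >>> 27)) * 0x94D049BB133111EBL;
        return z ^ (z >>> 31);
    }
}
```

In this sketch, `SimpleBloomFilter.create(1000, 0.03)` sizes the bit array from the two arguments, and `bitSize()` reports that size in bits rather than bytes, mirroring the first difference listed above.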
Author: Wenchen Fan <wenchen@databricks.com>
Closes #10883 from cloud-fan/bloom-filter.