path: root/sql
author: Xiangrui Meng <meng@databricks.com> 2014-07-31 12:55:00 -0700
committer: Xiangrui Meng <meng@databricks.com> 2014-07-31 12:55:00 -0700
commit: dc0865bc7e119fe507061c27069c17523b87dfea
tree: 481dfc65f65273dda1fbfae7e22c780aee7f7168 /sql
parent: e5749a1342327263dc6b94ba470e392fbea703fa
[SPARK-2511][MLLIB] add HashingTF and IDF
This is roughly the TF-IDF implementation used in the Databricks Cloud Demo: http://databricks.com/cloud/.

Both `HashingTF` and `IDF` are implemented as transformers, similar to scikit-learn.

Author: Xiangrui Meng <meng@databricks.com>

Closes #1671 from mengxr/tfidf and squashes the following commits:

7d65888 [Xiangrui Meng] use JavaConverters._
5fe9ec4 [Xiangrui Meng] fix unit test
6e214ec [Xiangrui Meng] add apache header
cfd9aed [Xiangrui Meng] add Java-friendly methods move classes to mllib.feature
3814440 [Xiangrui Meng] add HashingTF and IDF
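
For readers unfamiliar with the transformer pattern mentioned in the commit message, the following is a minimal, dependency-free Python sketch of the general idea. It is not the code added by this commit and does not use Spark. The class names mirror `HashingTF` and `IDF` for readability only; the method names (`index_of`, `fit`, `transform`), the CRC32 hash, the default bucket count, and the smoothed formula `log((m + 1) / (df + 1))` are all assumptions chosen for illustration.

```python
import math
import zlib
from collections import Counter


class HashingTF:
    """Illustrative hashing-trick term-frequency transformer (not Spark's class)."""

    def __init__(self, num_features=1 << 20):
        # Number of hash buckets; an assumed default for this sketch.
        self.num_features = num_features

    def index_of(self, term):
        # CRC32 is used because it is stable across processes, unlike the
        # built-in hash() on strings. The hash choice is an assumption.
        return zlib.crc32(term.encode("utf-8")) % self.num_features

    def transform(self, document):
        # Sparse vector as {bucket index: term count}; colliding terms share a bucket.
        return dict(Counter(self.index_of(term) for term in document))


class IDF:
    """Illustrative inverse-document-frequency transformer (not Spark's class)."""

    def fit(self, tf_vectors):
        vectors = list(tf_vectors)
        self.num_docs = len(vectors)
        doc_freq = Counter()
        for vector in vectors:
            # Each document counts once per bucket it contains.
            doc_freq.update(vector.keys())
        # Assumed smoothing: idf = log((m + 1) / (df + 1)).
        self.idf = {
            index: math.log((self.num_docs + 1) / (df + 1))
            for index, df in doc_freq.items()
        }
        return self

    def transform(self, tf_vector):
        # Buckets never seen during fit are treated as df = 0.
        unseen = math.log(self.num_docs + 1)
        return {i: tf * self.idf.get(i, unseen) for i, tf in tf_vector.items()}


docs = [["spark", "mllib", "spark"], ["spark", "idf"]]
hashing_tf = HashingTF()
tf_vectors = [hashing_tf.transform(doc) for doc in docs]
idf_model = IDF().fit(tf_vectors)
tfidf_vectors = [idf_model.transform(vector) for vector in tf_vectors]
```

In this sketch a term that occurs in every document, such as "spark", receives an IDF weight of zero, so it drops out of the weighted vectors. Hashing avoids building a vocabulary up front, at the cost of occasional bucket collisions.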
Diffstat (limited to 'sql')
0 files changed, 0 insertions, 0 deletions