path: root/pom.xml
author: Jacky Li <jacky.likun@huawei.com> 2015-02-01 20:07:25 -0800
committer: Xiangrui Meng <meng@databricks.com> 2015-02-01 20:07:25 -0800
commit: 859f7249a614c86fc1691cc3116463f85f33f153
tree: 7f16495e4023248f5620b5454f582070b4bdf68f /pom.xml
parent: d85cd4eb1479f8d37dab360530dc2c71216b4a8d
[SPARK-4001][MLlib] adding parallel FP-Growth algorithm for frequent pattern mining in MLlib
Apriori is the classic algorithm for frequent item set mining in a transactional data set. It would be useful to add the Apriori algorithm to MLlib in Spark. This PR adds an implementation of it.

There is one point where I am not sure whether the approach is the most efficient. To filter out the eligible frequent item sets, I currently use a cartesian operation on two RDDs to calculate the degree of support of each item set. I am not sure whether it would be better to use a broadcast variable to achieve the same result.

I will add an example of using this algorithm if required.

Author: Jacky Li <jacky.likun@huawei.com>
Author: Jacky Li <jackylk@users.noreply.github.com>
Author: Xiangrui Meng <meng@databricks.com>

Closes #2847 from jackylk/apriori and squashes the following commits:

bee3093 [Jacky Li] Merge pull request #1 from mengxr/SPARK-4001
7e69725 [Xiangrui Meng] simplify FPTree and update FPGrowth
ec21f7d [Jacky Li] fix scalastyle
93f3280 [Jacky Li] create FPTree class
d110ab2 [Jacky Li] change test case to use MLlibTestSparkContext
a6c5081 [Jacky Li] Add Parallel FPGrowth algorithm
eb3e4ca [Jacky Li] add FPGrowth
03df2b6 [Jacky Li] refactory according to comments
7b77ad7 [Jacky Li] fix scalastyle check
f68a0bd [Jacky Li] add 2 apriori implemenation and fp-growth implementation
889b33f [Jacky Li] modify per scalastyle check
da2cba7 [Jacky Li] adding apriori algorithm for frequent item set mining in Spark
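
The commit message's question about support counting (a cartesian operation versus a broadcast variable) can be illustrated with a small sketch. This is plain Python over in-memory collections, not the Spark code from this PR; the function names `support_by_cartesian`, `support_by_broadcast`, and `frequent`, and the sample data, are hypothetical and chosen only for illustration. The first function pairs every candidate with every transaction, as a cartesian product would; the second keeps the candidates in one shared read-only set and makes a single pass over the transactions, which is roughly the role a broadcast variable would play.

```python
from itertools import product


def support_by_cartesian(transactions, candidates):
    """Count support by pairing every candidate with every transaction."""
    counts = {c: 0 for c in candidates}
    # product() enumerates all (candidate, transaction) pairs,
    # mimicking a cartesian operation between two collections.
    for candidate, transaction in product(candidates, transactions):
        if candidate <= transaction:  # candidate is a subset of the transaction
            counts[candidate] += 1
    return counts


def support_by_broadcast(transactions, candidates):
    """Count support in one pass over transactions with shared candidates."""
    shared = frozenset(candidates)  # stand-in for a broadcast variable (assumption)
    counts = {c: 0 for c in shared}
    for transaction in transactions:
        for candidate in shared:
            if candidate <= transaction:
                counts[candidate] += 1
    return counts


def frequent(counts, min_count):
    """Keep only the item sets whose support reaches min_count."""
    return {c: n for c, n in counts.items() if n >= min_count}


# Hypothetical sample data: four transactions and four candidate item sets.
transactions = [frozenset(t) for t in (["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "c"])]
candidates = [frozenset(c) for c in (["a", "b"], ["a", "c"], ["b", "c"], ["a", "b", "c"])]

cartesian_counts = support_by_cartesian(transactions, candidates)
broadcast_counts = support_by_broadcast(transactions, candidates)
frequent_sets = frequent(broadcast_counts, min_count=2)
```

Both functions produce the same counts; they differ only in how the pairs are generated. In a distributed setting the cartesian form materializes every candidate-transaction pair across the cluster, while the broadcast form ships the candidate set once to each worker. Which is cheaper depends on whether the candidate set fits in memory, and the sketch does not measure that.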
Diffstat (limited to 'pom.xml')
0 files changed, 0 insertions, 0 deletions