diff options
author | zhangjiajin <zhangjiajin@huawei.com> | 2015-07-30 08:14:09 -0700 |
---|---|---|
committer | Xiangrui Meng <meng@databricks.com> | 2015-07-30 08:14:09 -0700 |
commit | d212a314227dec26c0dbec8ed3422d0ec8f818f9 (patch) | |
tree | 32775371b13cab56481318e6133bb6e136e63ad0 /streaming | |
parent | c5815930be46a89469440b7c61b59764fb67a54c (diff) | |
download | spark-d212a314227dec26c0dbec8ed3422d0ec8f818f9.tar.gz spark-d212a314227dec26c0dbec8ed3422d0ec8f818f9.tar.bz2 spark-d212a314227dec26c0dbec8ed3422d0ec8f818f9.zip |
[SPARK-8998] [MLLIB] Distribute PrefixSpan computation for large projected databases
Continuation of work by zhangjiajin
Closes #7412
Author: zhangjiajin <zhangjiajin@huawei.com>
Author: Feynman Liang <fliang@databricks.com>
Author: zhang jiajin <zhangjiajin@huawei.com>
Closes #7783 from feynmanliang/SPARK-8998-improve-distributed and squashes the following commits:
a61943d [Feynman Liang] Collect small patterns to local
4ddf479 [Feynman Liang] Parallelize freqItemCounts
ad23aa9 [zhang jiajin] Merge pull request #1 from feynmanliang/SPARK-8998-collectBeforeLocal
87fa021 [Feynman Liang] Improve extend prefix readability
c2caa5c [Feynman Liang] Readability improvements and comments
1235cfc [Feynman Liang] Use Iterable[Array[_]] over Array[Array[_]] for database
da0091b [Feynman Liang] Use lists for prefixes to reuse data
cb2a4fc [Feynman Liang] Inline code for readability
01c9ae9 [Feynman Liang] Add getters
6e149fa [Feynman Liang] Fix splitPrefixSuffixPairs
64271b3 [zhangjiajin] Modified codes according to comments.
d2250b7 [zhangjiajin] remove minPatternsBeforeLocalProcessing, add maxSuffixesBeforeLocalProcessing.
b07e20c [zhangjiajin] Merge branch 'master' of https://github.com/apache/spark into CollectEnoughPrefixes
095aa3a [zhangjiajin] Modified the code according to the review comments.
baa2885 [zhangjiajin] Modified the code according to the review comments.
6560c69 [zhangjiajin] Add feature: Collect enough frequent prefixes before projection in PrefixeSpan
a8fde87 [zhangjiajin] Merge branch 'master' of https://github.com/apache/spark
4dd1c8a [zhangjiajin] initialize file before rebase.
078d410 [zhangjiajin] fix a scala style error.
22b0ef4 [zhangjiajin] Add feature: Collect enough frequent prefixes before projection in PrefixSpan.
ca9c4c8 [zhangjiajin] Modified the code according to the review comments.
574e56c [zhangjiajin] Add new object LocalPrefixSpan, and do some optimization.
ba5df34 [zhangjiajin] Fix a Scala style error.
4c60fb3 [zhangjiajin] Fix some Scala style errors.
1dd33ad [zhangjiajin] Modified the code according to the review comments.
89bc368 [zhangjiajin] Fixed a Scala style error.
a2eb14c [zhang jiajin] Delete PrefixspanSuite.scala
951fd42 [zhang jiajin] Delete Prefixspan.scala
575995f [zhangjiajin] Modified the code according to the review comments.
91fd7e6 [zhangjiajin] Add new algorithm PrefixSpan and test file.
Diffstat (limited to 'streaming')
0 files changed, 0 insertions, 0 deletions