path: root/python/pyspark/mllib/fpm.py
author: oraviv <oraviv@paypal.com> 2016-07-13 14:47:08 +0100
committer: Sean Owen <sowen@cloudera.com> 2016-07-13 14:47:08 +0100
commit: ea06e4ef34c860219a9aeec81816ef53ada96253
tree: 32fe745a7941c76a6044d12933dac5c6a4772cdf /python/pyspark/mllib/fpm.py
parent: 51ade51a9fd64fc2fe651c505a286e6f29f59d40
[SPARK-16469] enhanced simulate multiply
## What changes were proposed in this pull request?

We have a use case that multiplies very large sparse matrices: distributed block matrices of about 1000x1000. The simulate multiply step scales as O(n^4) (n being 1000) and takes about 1.5 hours. We modified it slightly to use a classical hashmap, and it now runs in about 30 seconds, in O(n^2).

## How was this patch tested?

We added a performance test and verified the reduced run time.

Author: oraviv <oraviv@paypal.com>

Closes #14068 from uzadude/master.
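
The idea behind the speedup can be illustrated with a minimal Python sketch. This is not the code from the patch: the function names, the representation of blocks as `(row, col)` index pairs, and the returned mapping are all assumptions made for illustration. It assumes that simulating a multiply means working out, for each block of the left matrix, which result blocks it contributes to. The naive version scans every right-hand block for each left-hand block; the hashed version first indexes the right-hand blocks by row, so each lookup is a dictionary access.

```python
from collections import defaultdict


def simulate_multiply_naive(left_blocks, right_blocks):
    """Hypothetical baseline: for each left block (i, k), scan all right blocks.

    With N blocks on each side this performs N * N comparisons.
    """
    destinations = {}
    for (i, k) in left_blocks:
        destinations[(i, k)] = {
            (i, j) for (row, j) in right_blocks if row == k
        }
    return destinations


def simulate_multiply_hashed(left_blocks, right_blocks):
    """Hypothetical hashmap variant: index right blocks by row once.

    Building the index is one pass over the right blocks; afterwards each
    left block only visits the right blocks it actually pairs with.
    """
    cols_by_row = defaultdict(set)
    for (k, j) in right_blocks:
        cols_by_row[k].add(j)

    return {
        (i, k): {(i, j) for j in cols_by_row.get(k, ())}
        for (i, k) in left_blocks
    }


# Small sparse example: only non-empty blocks are listed.
left = [(0, 0), (0, 1), (1, 1)]
right = [(0, 0), (1, 2), (1, 3)]

naive = simulate_multiply_naive(left, right)
hashed = simulate_multiply_hashed(left, right)
```

Both functions return the same mapping. If n is read as the number of blocks per side, each matrix has up to n^2 blocks, so the pairwise scan costs on the order of n^4 comparisons, while the indexed version costs one pass over the blocks plus the size of the output, which stays small for sparse inputs.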
Diffstat (limited to 'python/pyspark/mllib/fpm.py')
0 files changed, 0 insertions, 0 deletions