author | Reza Zadeh <reza@databricks.com> | 2015-04-06 13:15:01 -0700 |
---|---|---|
committer | Xiangrui Meng <meng@databricks.com> | 2015-04-06 13:15:01 -0700 |
commit | 30363ede8635f2548e444697dbcf60a795b61a84 (patch) | |
tree | b3ee41a5b9dd3dcceec93c89f5db3897cab62d39 /docker/spark-test/README.md | |
parent | 9fe41252198df71f4629843d363db8c83f36440c (diff) | |
[MLlib] [SPARK-6713] Iterators in columnSimilarities for mapPartitionsWithIndex
Use iterators in columnSimilarities so that mapPartitionsWithIndex can spill to disk. This matters for a large, dense column: Spark can spill the pairs to disk as they are produced, instead of the code building all the pairs in memory before handing them to Spark.
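As a rough illustration of the difference (a simplified sketch, not the code from this patch), the snippet below contrasts an eager pair builder with a lazy one in plain Scala, without Spark. The names `PairEmission`, `pairsBuffered`, and `pairsLazy` are hypothetical, and the real columnSimilarities logic also samples and scales entries, which is omitted here.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical helper object, for illustration only.
object PairEmission {
  // Eager version: materializes every (i, j) pair of a row in memory
  // before returning anything to the caller.
  def pairsBuffered(row: Array[Double]): Seq[((Int, Int), Double)] = {
    val buf = new ArrayBuffer[((Int, Int), Double)]()
    for (i <- row.indices; j <- (i + 1) until row.length) {
      buf += (((i, j), row(i) * row(j)))
    }
    buf.toSeq
  }

  // Lazy version: each pair is computed only when the consumer pulls it,
  // so the full set of pairs never has to exist in memory at once.
  def pairsLazy(row: Array[Double]): Iterator[((Int, Int), Double)] =
    for {
      i <- row.indices.iterator
      j <- ((i + 1) until row.length).iterator
    } yield ((i, j), row(i) * row(j))
}
```

Inside a function passed to mapPartitionsWithIndex, the lazy form would presumably be applied as `iter.flatMap(row => PairEmission.pairsLazy(row))`, which lets the consumer pull pairs incrementally rather than receiving one fully built collection per row.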
A follow-up PR will update the documentation.
Author: Reza Zadeh <reza@databricks.com>
Closes #5364 from rezazadeh/optmemsim and squashes the following commits:
47c90ba [Reza Zadeh] Iterators in columnSimilarities for flatMap
Diffstat (limited to 'docker/spark-test/README.md')
0 files changed, 0 insertions, 0 deletions