author: DB Tsai <dbtsai@alpinenow.com>, 2014-12-02 11:40:43 +0800
committer: Xiangrui Meng <meng@databricks.com>, 2014-12-02 11:40:43 +0800
commit: 64f3175bf976f5a28e691cedc7a4b333709e0c58
tree: 5e9f414bb51f79f7de184909c82fbc7c90e5d2ae
parent: b0a46d899541ec17db090aac6f9ea1b287ee9331
[SPARK-4611][MLlib] Implement the efficient vector norm
The vector norm in breeze is implemented with `activeIterator`, which is known to be very slow. This PR implements an efficient vector norm, and with this API, `Normalizer` and `k-means` see a big performance improvement.

Here is the benchmark against the mnist8m dataset.

a) `Normalizer`
Before
DenseVector: 68.25secs
SparseVector: 17.01secs

With this PR
DenseVector: 12.71secs
SparseVector: 2.73secs

b) `k-means`
Before
DenseVector: 83.46secs
SparseVector: 61.60secs

With this PR
DenseVector: 70.04secs
SparseVector: 59.05secs

Author: DB Tsai <dbtsai@alpinenow.com>

Closes #3462 from dbtsai/norm and squashes the following commits:

63c7165 [DB Tsai] typo
0c3637f [DB Tsai] add import org.apache.spark.SparkContext._ back
6fa616c [DB Tsai] address feedback
9b7cb56 [DB Tsai] move norm to static method
0b632e6 [DB Tsai] kmeans
dbed124 [DB Tsai] style
c1a877c [DB Tsai] first commit
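
To illustrate the idea in the commit message above, here is a minimal sketch of a p-norm computed with plain `while` loops over the stored values of a vector, which avoids iterator overhead. It is not the code merged in this commit: the object name `NormSketch` and the `norm(values, p)` signature are hypothetical and chosen only for this example.

```scala
// Illustrative sketch only; names are hypothetical, not the API added by this commit.
object NormSketch {

  /**
   * Computes the p-norm of the given stored values.
   * `values` may be all entries of a dense vector, or only the
   * explicitly stored (non-zero) entries of a sparse vector.
   */
  def norm(values: Array[Double], p: Double): Double = {
    require(p >= 1.0, s"p must be >= 1.0, got $p")
    val n = values.length
    var i = 0
    if (p == 1.0) {
      // L1 norm: sum of absolute values
      var sum = 0.0
      while (i < n) { sum += math.abs(values(i)); i += 1 }
      sum
    } else if (p == 2.0) {
      // L2 norm: square root of the sum of squares
      var sum = 0.0
      while (i < n) { sum += values(i) * values(i); i += 1 }
      math.sqrt(sum)
    } else if (p == Double.PositiveInfinity) {
      // Infinity norm: largest absolute value
      var max = 0.0
      while (i < n) {
        val v = math.abs(values(i))
        if (v > max) max = v
        i += 1
      }
      max
    } else {
      // General case: (sum of |x_i|^p)^(1/p)
      var sum = 0.0
      while (i < n) { sum += math.pow(math.abs(values(i)), p); i += 1 }
      math.pow(sum, 1.0 / p)
    }
  }
}
```

Because zero entries contribute nothing to any of these sums or to the maximum, passing only the stored values of a sparse vector gives the same result as passing the full dense array.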
Diffstat (limited to 'sbin/spark-config.sh')
0 files changed, 0 insertions, 0 deletions