[SPARK-7140] [MLLIB] only scan the first 16 entries in Vector.hashCode - spark

author	Xiangrui Meng <meng@databricks.com>	2015-04-28 09:59:36 -0700
committer	Xiangrui Meng <meng@databricks.com>	2015-04-28 09:59:36 -0700
commit	b14cd2364932e504695bcc49486ffb4518fdf33d (patch)
tree	b2ddae86f122b2feba34f46f41bddc7e8cbc66d0 /conf/spark-env.sh.template
parent	6a827d5d1ec520f129e42c3818fe7d0d870dcbef (diff)
download	spark-b14cd2364932e504695bcc49486ffb4518fdf33d.tar.gz spark-b14cd2364932e504695bcc49486ffb4518fdf33d.tar.bz2 spark-b14cd2364932e504695bcc49486ffb4518fdf33d.zip

[SPARK-7140] [MLLIB] only scan the first 16 entries in Vector.hashCode

The Python SerDe calls `Object.hashCode`, which is very expensive for Vectors. It is not necessary to scan the whole vector, especially for large ones. In this PR, we only scan the first 16 nonzeros. srowen

Author: Xiangrui Meng <meng@databricks.com>

Closes #5697 from mengxr/SPARK-7140 and squashes the following commits:

2abc86d [Xiangrui Meng] typo
8fb7d74 [Xiangrui Meng] update impl
1ebad60 [Xiangrui Meng] only scan the first 16 nonzeros in Vector.hashCode
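
The idea of a bounded hash can be sketched as follows. This is a Python illustration only, not the Scala code from this commit: the function name `bounded_vector_hash`, the constant `MAX_HASH_NNZ`, and the mixing formula (multiply by 31, add index and value hash) are assumptions chosen for the example. The point it shows is that the hash stops consuming entries once 16 nonzeros have been seen, so its cost no longer grows with the vector size.

```python
from typing import Iterable, Tuple

# Assumed cap, mirroring the "first 16 nonzeros" in the commit message.
MAX_HASH_NNZ = 16


def bounded_vector_hash(size: int, active: Iterable[Tuple[int, float]]) -> int:
    """Hash a vector from its size and at most MAX_HASH_NNZ nonzero entries.

    `active` yields (index, value) pairs in index order. Explicit zeros are
    skipped so that dense and sparse encodings of the same vector agree.
    """
    result = 31 + size
    nnz = 0
    for index, value in active:
        if nnz >= MAX_HASH_NNZ:
            break  # stop scanning: later entries do not affect the hash
        if value != 0.0:
            # Keep the running value in 32 bits, like a JVM Int would be.
            result = (31 * result + index) & 0xFFFFFFFF
            result = (31 * result + hash(value)) & 0xFFFFFFFF
            nnz += 1
    return result


# Usage: a dense vector and its sparse encoding hash to the same value.
dense = [0.0, 1.5, 0.0, 2.0]
sparse = [(1, 1.5), (3, 2.0)]
same = bounded_vector_hash(4, enumerate(dense)) == bounded_vector_hash(4, sparse)
```

The trade-off is that two vectors differing only after the 16th nonzero collide. That is acceptable for a hash code, since equality is still decided by a full comparison.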

Diffstat (limited to 'conf/spark-env.sh.template')

0 files changed, 0 insertions, 0 deletions

