author | zero323 <matthew.szymkiewicz@gmail.com> | 2015-10-16 15:53:26 -0700 |
---|---|---|
committer | Joseph K. Bradley <joseph@databricks.com> | 2015-10-16 15:53:26 -0700 |
commit | 8ac71d62d976bbfd0159cac6816dd8fa580ae1cb (patch) | |
tree | 18740525525d04abee55c3173956dd5970365a23 /python/pyspark/mllib/linalg/__init__.py | |
parent | 10046ea76cf8f0d08fe7ef548e4dbec69d9c73b8 (diff) | |
[SPARK-11084] [ML] [PYTHON] Check if index can contain non-zero value before binary search
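Currently, `SparseVector.__getitem__` calls `np.searchsorted` first and only afterwards checks whether the result is in the expected range. Whether the index can hold a non-zero value can be checked before calling `np.searchsorted`: if the vector stores no indices, or the requested index is greater than the largest stored index, the value is 0 and the binary search can be skipped.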
Author: zero323 <matthew.szymkiewicz@gmail.com>
Closes #9098 from zero323/sparse_vector_getitem_improved.
Diffstat (limited to 'python/pyspark/mllib/linalg/__init__.py')
-rw-r--r-- | python/pyspark/mllib/linalg/__init__.py | 4 |
1 files changed, 2 insertions, 2 deletions
    diff --git a/python/pyspark/mllib/linalg/__init__.py b/python/pyspark/mllib/linalg/__init__.py
    index 5276eb41cf..ae9ce58450 100644
    --- a/python/pyspark/mllib/linalg/__init__.py
    +++ b/python/pyspark/mllib/linalg/__init__.py
    @@ -770,10 +770,10 @@ class SparseVector(Vector):
             if index < 0:
                 index += self.size
     
    -        insert_index = np.searchsorted(inds, index)
    -        if insert_index >= inds.size:
    +        if (inds.size == 0) or (index > inds.item(-1)):
                 return 0.
     
    +        insert_index = np.searchsorted(inds, index)
             row_ind = inds[insert_index]
             if row_ind == index:
                 return vals[insert_index]
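
For readers who want to try the patched lookup outside Spark, the logic in the hunk above can be sketched as a standalone function. This is a minimal illustration, not the actual `SparseVector` method: the name `sparse_getitem` and its explicit `size`, `inds`, `vals` parameters are assumptions introduced here, and the bounds check is a simplified stand-in for whatever the surrounding method does before the lines shown in the diff. It assumes `inds` is a sorted NumPy integer array, as `np.searchsorted` requires.

```python
import numpy as np


def sparse_getitem(size, inds, vals, index):
    """Return element `index` of a sparse vector (hypothetical standalone helper).

    size -- logical length of the vector
    inds -- sorted NumPy array of the positions that hold stored values
    vals -- NumPy array of the stored values, aligned with `inds`
    """
    # Simplified bounds check (an assumption; not part of the diff hunk).
    if index >= size or index < -size:
        raise IndexError("index out of bounds")
    if index < 0:
        index += size

    # Early exit from the patch: with no stored entries, or with an index
    # past the largest stored position, the element must be zero, so the
    # binary search is skipped entirely.
    if inds.size == 0 or index > inds.item(-1):
        return 0.

    # Here index <= inds[-1], so searchsorted returns a valid position
    # in `inds` and the lookup below cannot go out of range.
    insert_index = np.searchsorted(inds, index)
    row_ind = inds[insert_index]
    if row_ind == index:
        return vals[insert_index]
    return 0.


# Example data: a length-5 vector with stored values at positions 1 and 3.
inds = np.array([1, 3], dtype=np.int32)
vals = np.array([2.0, 4.0])
```

Calling `sparse_getitem(5, inds, vals, 4)` takes the early-exit branch because 4 is greater than the last stored index, while `sparse_getitem(5, inds, vals, 2)` still runs the binary search and returns zero after the comparison fails.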