field | value | date |
---|---|---|
author | Xiangrui Meng <meng@databricks.com> | 2015-08-12 23:04:59 -0700 |
committer | Xiangrui Meng <meng@databricks.com> | 2015-08-12 23:04:59 -0700 |
commit | 68f99571492f67596b3656e9f076deeb96616f4a | |
tree | be2248de6c66c50ce13561c67c18af7e3ae86f2e /python/pyspark/profiler.py | |
parent | d0b18919d16e6a2f19159516bd2767b60b595279 | |
[SPARK-9918] [MLLIB] remove runs from k-means and rename epsilon to tol
This requires some discussion. I'm not sure whether `runs` is a useful parameter, and it certainly complicates the implementation. We might want to optimize the k-means implementation with block matrix operations, in which case keeping `runs` may not be worth the trade-off. `runs` also increases the communication cost within a single job, which might cause other issues.
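For context, a caller who still wants the effect of `runs` could approximate it outside the algorithm by fitting several times with different seeds and keeping the lowest-cost result. The following is a minimal sketch in plain Python, not Spark code: `best_of_runs` and the `fit` callable (assumed to take a seed and return a `(model, cost)` pair) are hypothetical names introduced only for illustration.

```python
import random
from typing import Callable, Tuple, TypeVar

Model = TypeVar("Model")


def best_of_runs(
    fit: Callable[[int], Tuple[Model, float]],
    runs: int,
    seed: int = 0,
) -> Tuple[Model, float]:
    """Call fit(seed) `runs` times and keep the pair with the lowest cost.

    `fit` is a hypothetical callable: it receives an integer seed and
    returns (model, cost), where a lower cost is better.
    """
    if runs < 1:
        raise ValueError("runs must be at least 1")
    rng = random.Random(seed)  # derives one seed per independent fit
    best = None
    for _ in range(runs):
        model, cost = fit(rng.randrange(2**31))
        if best is None or cost < best[1]:
            best = (model, cost)
    return best
```

Each fit here is a separate job, so the single-job communication cost mentioned above does not grow with the number of repetitions; the price is that the data is scanned once per fit.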
This PR also renames `epsilon` to `tol` to have consistent naming among algorithms. The Python constructor is updated to include all parameters.
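To illustrate what a `tol` parameter typically controls, here is a minimal single-run Lloyd's iteration in plain Python. It is a sketch under stated assumptions, not the MLlib implementation: the function name `kmeans` and the parameters `max_iter` and `seed` are illustrative, initial centers are simply sampled from the data, and convergence is assumed to mean that no center moved more than `tol` (Euclidean distance) in an iteration.

```python
import math
import random
from typing import List, Sequence, Tuple

Point = Sequence[float]


def _dist(a: Point, b: Point) -> float:
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(
    points: Sequence[Point],
    k: int,
    max_iter: int = 100,
    tol: float = 1e-4,
    seed: int = 0,
) -> Tuple[List[List[float]], int]:
    """Single-run k-means; returns (centers, iterations performed)."""
    if k < 1 or max_iter < 1:
        raise ValueError("k and max_iter must be at least 1")
    rng = random.Random(seed)
    # Assumption: initial centers are k distinct data points chosen at random.
    centers = [list(p) for p in rng.sample(list(points), k)]
    for iteration in range(1, max_iter + 1):
        # Assignment step: attach each point to its nearest center.
        clusters: List[List[Point]] = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: _dist(p, centers[c]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its members.
        new_centers = []
        for j, members in enumerate(clusters):
            if members:
                dim = len(members[0])
                new_centers.append(
                    [sum(m[d] for m in members) / len(members) for d in range(dim)]
                )
            else:
                new_centers.append(centers[j])  # empty cluster keeps its center
        # Convergence test: stop once the largest center movement is within tol.
        shift = max(_dist(old, new) for old, new in zip(centers, new_centers))
        centers = new_centers
        if shift <= tol:
            break
    return centers, iteration
```

A larger `tol` stops earlier at the cost of less settled centers; there is no `runs` loop, so one call performs exactly one optimization from one initialization.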
jkbradley yu-iskw
Author: Xiangrui Meng <meng@databricks.com>
Closes #8148 from mengxr/SPARK-9918 and squashes the following commits:
149b9e5 [Xiangrui Meng] fix constructor in Python and rename epsilon to tol
3cc15b3 [Xiangrui Meng] fix test and change initStep to initSteps in python
a0a0274 [Xiangrui Meng] remove runs from k-means in the pipeline API
Diffstat (limited to 'python/pyspark/profiler.py')
0 files changed, 0 insertions, 0 deletions