[SPARK-11559][MLLIB] Make `runs` no effect in mllib.KMeans

## What changes were proposed in this pull request?

We deprecated `runs` of mllib.KMeans in Spark 1.6 (SPARK-11358). In 2.0, it will have no effect (with warning messages). We did not remove `setRuns/getRuns`, for better binary compatibility.

This PR changes the occurrences of `runs` that appear in the public API. Usage inside `KMeans.runAlgorithm()` will be resolved in #10806.

## How was this patch tested?

Existing unit tests.

cc jkbradley

Author: Yanbo Liang <ybliang8@gmail.com>

Closes #12608 from yanboliang/spark-11559.
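
As a rough illustration of the "accepted but ignored, with a warning" behaviour described above, here is a minimal sketch that does not depend on Spark. The names `train_sketch` and `call_with_runs` are hypothetical and exist only for this example. Only the `runs != 1` check and the warning text are taken from the patched `python/pyspark/mllib/clustering.py`.

```python
import warnings


def train_sketch(points, k, runs=1):
    """Hypothetical stand-in for KMeans.train; not the real PySpark API."""
    if runs != 1:
        # Same check and message as in the patched mllib/clustering.py.
        warnings.warn("The param `runs` has no effect since Spark 2.0.0.")
    # `runs` is intentionally unused past this point: the argument is still
    # accepted so existing callers do not break, but it changes nothing.
    return {"k": k, "n_points": len(points)}


def call_with_runs(runs):
    """Call train_sketch and return (result, list of warning messages)."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        result = train_sketch([(0.0, 0.0), (1.0, 1.0)], k=2, runs=runs)
    return result, [str(w.message) for w in caught]
```

Under these assumptions, a call with `runs=30` returns the same result as a call with the default and additionally emits the warning, which mirrors why the doctest in the diff below could simply drop `runs=30`.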
author: Yanbo Liang <ybliang8@gmail.com> 2016-04-26 11:55:21 -0700
committer: Joseph K. Bradley <joseph@databricks.com> 2016-04-26 11:55:21 -0700
commit: 302a18686998b8b96546526bfccec9cf5b667386 (patch)
tree: aeea4d75fd873d030892aec5407137bc40e5a871 /python
parent: 2a3d39f48b1a7bb462e17e80e243bbc0a94d802e (diff)
download: spark-302a18686998b8b96546526bfccec9cf5b667386.tar.gz
spark-302a18686998b8b96546526bfccec9cf5b667386.tar.bz2
spark-302a18686998b8b96546526bfccec9cf5b667386.zip
2 files changed, 5 insertions, 9 deletions
diff --git a/python/pyspark/ml/clustering.py b/python/pyspark/ml/clustering.py
index 4ce8012754..9740ec45af 100644
--- a/python/pyspark/ml/clustering.py
+++ b/python/pyspark/ml/clustering.py
@@ -194,9 +194,8 @@ class KMeansModel(JavaModel, JavaMLWritable, JavaMLReadable):
 class KMeans(JavaEstimator, HasFeaturesCol, HasPredictionCol, HasMaxIter, HasTol, HasSeed,
              JavaMLWritable, JavaMLReadable):
     """
-    K-means clustering with support for multiple parallel runs and a k-means++ like initialization
-    mode (the k-means|| algorithm by Bahmani et al). When multiple concurrent runs are requested,
-    they are executed together with joint passes over the data for efficiency.
+    K-means clustering with a k-means++ like initialization mode
+    (the k-means|| algorithm by Bahmani et al).
 
     >>> from pyspark.mllib.linalg import Vectors
     >>> data = [(Vectors.dense([0.0, 0.0]),), (Vectors.dense([1.0, 1.0]),),
diff --git a/python/pyspark/mllib/clustering.py b/python/pyspark/mllib/clustering.py
index 23d118bd40..95f7278dc6 100644
--- a/python/pyspark/mllib/clustering.py
+++ b/python/pyspark/mllib/clustering.py
@@ -179,7 +179,7 @@ class KMeansModel(Saveable, Loader):
 
     >>> data = array([0.0,0.0, 1.0,1.0, 9.0,8.0, 8.0,9.0]).reshape(4, 2)
     >>> model = KMeans.train(
-    ...     sc.parallelize(data), 2, maxIterations=10, runs=30, initializationMode="random",
+    ...     sc.parallelize(data), 2, maxIterations=10, initializationMode="random",
     ...                    seed=50, initializationSteps=5, epsilon=1e-4)
     >>> model.predict(array([0.0, 0.0])) == model.predict(array([1.0, 1.0]))
     True
@@ -323,9 +323,7 @@ class KMeans(object):
           Maximum number of iterations allowed.
           (default: 100)
         :param runs:
-          Number of runs to execute in parallel. The best model according
-          to the cost function will be returned (deprecated in 1.6.0).
-          (default: 1)
+          This param has no effect since Spark 2.0.0.
         :param initializationMode:
           The initialization algorithm. This can be either "random" or
           "k-means||".
@@ -350,8 +348,7 @@ class KMeans(object):
           (default: None)
         """
         if runs != 1:
-            warnings.warn(
-                "Support for runs is deprecated in 1.6.0. This param will have no effect in 2.0.0.")
+            warnings.warn("The param `runs` has no effect since Spark 2.0.0.")
         clusterInitialModel = []
         if initialModel is not None:
             if not isinstance(initialModel, KMeansModel):