[SPARK-14812][ML][MLLIB][PYTHON] Experimental, DeveloperApi annotation audit for ML

## What changes were proposed in this pull request?

General decisions to follow, except where noted:
* spark.mllib, pyspark.mllib: Remove all Experimental annotations. Leave DeveloperApi annotations alone.
* spark.ml, pyspark.ml
** Annotate Estimator-Model pairs of classes and companion objects the same way.
** For all algorithms marked Experimental with Since tag <= 1.6, remove Experimental annotation.
** For all algorithms marked Experimental with Since tag = 2.0, leave Experimental annotation.
* DeveloperApi annotations are left alone, except where noted.
* No changes to which types are sealed.

Exceptions where I am leaving items Experimental in spark.ml, pyspark.ml, mainly because the items are new:
* Model Summary classes
* MLWriter, MLReader, MLWritable, MLReadable
* Evaluator and subclasses: There is discussion of changes around evaluating multiple metrics at once for efficiency.
* RFormula: Its behavior may need to change slightly to match R in edge cases.
* AFTSurvivalRegression
* MultilayerPerceptronClassifier

DeveloperApi changes:
* ml.tree.Node, ml.tree.Split, and subclasses should no longer be DeveloperApi

## How was this patch tested?

N/A

Note to reviewers:
* spark.ml.clustering.LDA underwent significant changes (additional methods), so let me know if you want me to leave it Experimental.
* Be careful to check for cases where a class should no longer be Experimental but has an Experimental method, val, or other feature. I did not find such cases, but please verify.

Author: Joseph K. Bradley <joseph@databricks.com>

Closes #14147 from jkbradley/experimental-audit.
author: Joseph K. Bradley <joseph@databricks.com> 2016-07-13 12:33:39 -0700
committer: Joseph K. Bradley <joseph@databricks.com> 2016-07-13 12:33:39 -0700
commit: 01f09b161217193b797c8c85969d17054c958615 (patch)
tree: 40d7d4f5932157f8e0f0c13228dd18063728d4d3 /python/pyspark/mllib/clustering.py
parent: d8220c1e5e94abbdb9643672b918f0d748206db9 (diff)
download: spark-01f09b161217193b797c8c85969d17054c958615.tar.gz
spark-01f09b161217193b797c8c85969d17054c958615.tar.bz2
spark-01f09b161217193b797c8c85969d17054c958615.zip
1 files changed, 0 insertions, 16 deletions
diff --git a/python/pyspark/mllib/clustering.py b/python/pyspark/mllib/clustering.py
index c38c543972..c8c3c42774 100644
--- a/python/pyspark/mllib/clustering.py
+++ b/python/pyspark/mllib/clustering.py
@@ -47,8 +47,6 @@ __all__ = ['BisectingKMeansModel', 'BisectingKMeans', 'KMeansModel', 'KMeans',
 @inherit_doc
 class BisectingKMeansModel(JavaModelWrapper):
     """
-    .. note:: Experimental
-
     A clustering model derived from the bisecting k-means method.
 
     >>> data = array([0.0,0.0, 1.0,1.0, 9.0,8.0, 8.0,9.0]).reshape(4, 2)
@@ -120,8 +118,6 @@ class BisectingKMeansModel(JavaModelWrapper):
 
 class BisectingKMeans(object):
     """
-    .. note:: Experimental
-
     A bisecting k-means algorithm based on the paper "A comparison of
     document clustering techniques" by Steinbach, Karypis, and Kumar,
     with modification to fit Spark.
@@ -366,8 +362,6 @@ class KMeans(object):
 class GaussianMixtureModel(JavaModelWrapper, JavaSaveable, JavaLoader):
 
     """
-    .. note:: Experimental
-
     A clustering model derived from the Gaussian Mixture Model method.
 
     >>> from pyspark.mllib.linalg import Vectors, DenseMatrix
@@ -513,8 +507,6 @@ class GaussianMixtureModel(JavaModelWrapper, JavaSaveable, JavaLoader):
 
 class GaussianMixture(object):
     """
-    .. note:: Experimental
-
     Learning algorithm for Gaussian Mixtures using the expectation-maximization algorithm.
 
     .. versionadded:: 1.3.0
@@ -565,8 +557,6 @@ class GaussianMixture(object):
 class PowerIterationClusteringModel(JavaModelWrapper, JavaSaveable, JavaLoader):
 
     """
-    .. note:: Experimental
-
     Model produced by [[PowerIterationClustering]].
 
     >>> import math
@@ -645,8 +635,6 @@ class PowerIterationClusteringModel(JavaModelWrapper, JavaSaveable, JavaLoader):
 
 class PowerIterationClustering(object):
     """
-    .. note:: Experimental
-
     Power Iteration Clustering (PIC), a scalable graph clustering algorithm
     developed by [[http://www.icml2010.org/papers/387.pdf Lin and Cohen]].
     From the abstract: PIC finds a very low-dimensional embedding of a
@@ -693,8 +681,6 @@ class PowerIterationClustering(object):
 
 class StreamingKMeansModel(KMeansModel):
     """
-    .. note:: Experimental
-
     Clustering model which can perform an online update of the centroids.
 
     The update formula for each centroid is given by
@@ -794,8 +780,6 @@ class StreamingKMeansModel(KMeansModel):
 
 class StreamingKMeans(object):
     """
-    .. note:: Experimental
-
     Provides methods to set k, decayFactor, timeUnit to configure the
     KMeans algorithm for fitting and predicting on incoming dstreams.
     More details on how the centroids are updated are provided under the