author | Joseph K. Bradley <joseph@databricks.com> | 2016-07-13 12:33:39 -0700 |
---|---|---|
committer | Joseph K. Bradley <joseph@databricks.com> | 2016-07-13 12:33:39 -0700 |
commit | 01f09b161217193b797c8c85969d17054c958615 (patch) | |
tree | 40d7d4f5932157f8e0f0c13228dd18063728d4d3 /python/pyspark/mllib/stat | |
parent | d8220c1e5e94abbdb9643672b918f0d748206db9 (diff) | |
[SPARK-14812][ML][MLLIB][PYTHON] Experimental, DeveloperApi annotation audit for ML
## What changes were proposed in this pull request?
General decisions to follow, except where noted:
* spark.mllib, pyspark.mllib: Remove all Experimental annotations. Leave DeveloperApi annotations alone.
* spark.ml, pyspark.ml
** Annotate Estimator-Model pairs of classes and companion objects the same way.
** For all algorithms marked Experimental with Since tag <= 1.6, remove Experimental annotation.
** For all algorithms marked Experimental with Since tag = 2.0, leave Experimental annotation.
* DeveloperApi annotations are left alone, except where noted.
* No changes to which types are sealed.
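The general decisions above can be read as a small decision function. The following is a minimal sketch and not part of this patch: `parse_version` and `keep_experimental` are hypothetical names, Since tags are assumed to be dotted version strings, and `is_exception` stands for any item this message explicitly lists as staying Experimental.

```python
def parse_version(tag):
    """Turn a Since tag such as "1.6" or "2.0.0" into a (major, minor) tuple."""
    return tuple(int(part) for part in tag.split("."))[:2]


def keep_experimental(package, since, is_exception=False):
    """Return True if an Experimental annotation should stay.

    Hypothetical helper illustrating the audit rule; it is not Spark API.
    """
    # spark.mllib / pyspark.mllib: all Experimental annotations are removed.
    if package in ("spark.mllib", "pyspark.mllib"):
        return False
    # Items explicitly called out as exceptions stay Experimental.
    if is_exception:
        return True
    # spark.ml / pyspark.ml: Since <= 1.6 is removed, Since 2.0 is kept.
    return parse_version(since) >= (2, 0)
```

For example, `keep_experimental("pyspark.mllib", "1.6")` is `False`, while `keep_experimental("spark.ml", "2.0")` is `True`.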
Exceptions where I am leaving items Experimental in spark.ml, pyspark.ml, mainly because the items are new:
* Model Summary classes
* MLWriter, MLReader, MLWritable, MLReadable
* Evaluator and subclasses: There is discussion of changes around evaluating multiple metrics at once for efficiency.
* RFormula: Its behavior may need to change slightly to match R in edge cases.
* AFTSurvivalRegression
* MultilayerPerceptronClassifier
DeveloperApi changes:
* ml.tree.Node, ml.tree.Split, and subclasses should no longer be DeveloperApi
## How was this patch tested?
N/A
Note to reviewers:
* spark.ml.clustering.LDA underwent significant changes (additional methods), so let me know if you want me to leave it Experimental.
* Be careful to check for cases where a class should no longer be Experimental but has an Experimental method, val, or other feature. I did not find such cases, but please verify.
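For the Python side, one way a reviewer might double-check for leftover markers is to scan source text for the `.. note:: Experimental` docstring line that this patch removes. This is only a sketch under that assumption: `find_experimental_notes` is a hypothetical helper, not something in the Spark repository, and it does not cover Scala annotations.

```python
import re

# Matches a docstring line consisting solely of the Experimental note marker.
EXPERIMENTAL_NOTE = re.compile(r"^\s*\.\. note:: Experimental\s*$")


def find_experimental_notes(source):
    """Return the 1-based line numbers of Experimental note lines in source."""
    return [
        number
        for number, line in enumerate(source.splitlines(), start=1)
        if EXPERIMENTAL_NOTE.match(line)
    ]


sample = (
    "class KernelDensity(object):\n"
    '    """\n'
    "    .. note:: Experimental\n"
    "\n"
    "    Estimate probability density.\n"
    '    """\n'
)
remaining = find_experimental_notes(sample)
```

Running it over each file's contents (read with `open(path).read()`) would list the lines still carrying the marker, which can then be compared against the intended exceptions.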
Author: Joseph K. Bradley <joseph@databricks.com>
Closes #14147 from jkbradley/experimental-audit.
Diffstat (limited to 'python/pyspark/mllib/stat')
-rw-r--r-- | python/pyspark/mllib/stat/KernelDensity.py | 2 |
-rw-r--r-- | python/pyspark/mllib/stat/_statistics.py | 4 |
2 files changed, 0 insertions, 6 deletions
diff --git a/python/pyspark/mllib/stat/KernelDensity.py b/python/pyspark/mllib/stat/KernelDensity.py
index 7da921976d..3b1c5519bd 100644
--- a/python/pyspark/mllib/stat/KernelDensity.py
+++ b/python/pyspark/mllib/stat/KernelDensity.py
@@ -28,8 +28,6 @@ from pyspark.rdd import RDD
 
 class KernelDensity(object):
     """
-    .. note:: Experimental
-
     Estimate probability density at required points given a RDD of samples
     from the population.
 
diff --git a/python/pyspark/mllib/stat/_statistics.py b/python/pyspark/mllib/stat/_statistics.py
index b0a85240b2..67d5f0e44f 100644
--- a/python/pyspark/mllib/stat/_statistics.py
+++ b/python/pyspark/mllib/stat/_statistics.py
@@ -160,8 +160,6 @@ class Statistics(object):
     @ignore_unicode_prefix
     def chiSqTest(observed, expected=None):
         """
-        .. note:: Experimental
-
         If `observed` is Vector, conduct Pearson's chi-squared goodness
         of fit test of the observed data against the expected distribution,
         or againt the uniform distribution (by default), with each category
@@ -246,8 +244,6 @@ class Statistics(object):
     @ignore_unicode_prefix
     def kolmogorovSmirnovTest(data, distName="norm", *params):
         """
-        .. note:: Experimental
-
         Performs the Kolmogorov-Smirnov (KS) test for data sampled from
         a continuous distribution. It tests the null hypothesis that
         the data is generated from a particular distribution.