author | Joseph K. Bradley <joseph@databricks.com> | 2016-03-22 12:11:23 -0700 |
---|---|---|
committer | Xiangrui Meng <meng@databricks.com> | 2016-03-22 12:11:37 -0700 |
commit | 7e3423b9c03c9812d404134c3d204c4cfea87721 (patch) | |
tree | b922610e318774c1db7da6549ee0932b21fe3090 /python/pyspark/ml/clustering.py | |
parent | 297c20226d3330309c9165d789749458f8f4ab8e (diff) | |
[SPARK-13951][ML][PYTHON] Nested Pipeline persistence
Adds support for saving and loading nested ML Pipelines from Python. Pipeline and PipelineModel do not extend JavaWrapper, but they can still use the JavaMLWriter and JavaMLReader implementations.
Also:
* Separates out interfaces from Java wrapper implementations for MLWritable, MLReadable, MLWriter, MLReader.
* Moves methods _stages_java2py, _stages_py2java into Pipeline, PipelineModel as _transfer_stage_from_java, _transfer_stage_to_java
Added new unit test for nested Pipelines. Abstracted validity check into a helper method for the 2 unit tests.
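The idea of separating a persistence interface from its wrapper-backed implementation, so that a non-wrapper class such as a pipeline can save nested stages recursively, can be sketched in plain Python. This is only an illustration under stated assumptions: `Writable`, `BackendWritable`, `Leaf`, `ToyPipeline`, `load`, `describe`, and `round_trip` are hypothetical names, the `metadata.json` layout is invented for the example, and none of it is PySpark's actual code or on-disk format.

```python
import json
import os
import tempfile
from abc import ABC, abstractmethod


class Writable(ABC):
    """Interface only (hypothetical): anything that can persist itself to a directory."""

    @abstractmethod
    def save(self, path):
        ...


class BackendWritable(Writable):
    """Stand-in for a wrapper-backed implementation; NOT PySpark's JavaMLWritable."""

    def save(self, path):
        os.makedirs(path, exist_ok=True)
        with open(os.path.join(path, "metadata.json"), "w") as f:
            json.dump({"class": "Leaf", "params": self.params}, f)


class Leaf(BackendWritable):
    """A simple stage that inherits its persistence from the backend mixin."""

    def __init__(self, **params):
        self.params = params


class ToyPipeline(Writable):
    """Not backend-wrapped, yet it satisfies the same interface.

    Stages may themselves be ToyPipeline instances, so save() recurses.
    """

    def __init__(self, stages):
        self.stages = stages

    def save(self, path):
        stage_dir = os.path.join(path, "stages")
        os.makedirs(stage_dir, exist_ok=True)
        with open(os.path.join(path, "metadata.json"), "w") as f:
            json.dump({"class": "ToyPipeline", "numStages": len(self.stages)}, f)
        for i, stage in enumerate(self.stages):
            # Prefix each stage directory with its index to preserve order.
            stage.save(os.path.join(stage_dir, "%d_%s" % (i, type(stage).__name__)))


def load(path):
    """Rebuild a Leaf or a (possibly nested) ToyPipeline from disk."""
    with open(os.path.join(path, "metadata.json")) as f:
        meta = json.load(f)
    if meta["class"] == "ToyPipeline":
        stage_dir = os.path.join(path, "stages")
        names = sorted(os.listdir(stage_dir), key=lambda n: int(n.split("_")[0]))
        return ToyPipeline([load(os.path.join(stage_dir, n)) for n in names])
    return Leaf(**meta["params"])


def describe(obj):
    """Nested-list view of a pipeline, convenient for comparing two trees."""
    if isinstance(obj, ToyPipeline):
        return [describe(s) for s in obj.stages]
    return obj.params


def round_trip(pipeline):
    """Save to a temporary directory and load the result back."""
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "pipeline")
        pipeline.save(target)
        return load(target)


nested = ToyPipeline([Leaf(k=2), ToyPipeline([Leaf(k=3), Leaf(maxIter=10)])])
restored = round_trip(nested)
```

Because `ToyPipeline` depends only on the `Writable` interface, each stage decides how it persists itself, and nesting needs no special case beyond the recursive call.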
Author: Joseph K. Bradley <joseph@databricks.com>
Closes #11866 from jkbradley/nested-pipeline-io.
Closes #11835
Diffstat (limited to 'python/pyspark/ml/clustering.py')
-rw-r--r-- | python/pyspark/ml/clustering.py | 4 |
1 files changed, 2 insertions, 2 deletions
diff --git a/python/pyspark/ml/clustering.py b/python/pyspark/ml/clustering.py
index 1cea477acb..2db5b82c44 100644
--- a/python/pyspark/ml/clustering.py
+++ b/python/pyspark/ml/clustering.py
@@ -25,7 +25,7 @@ __all__ = ['BisectingKMeans', 'BisectingKMeansModel', 'KMeans', 'KMeansModel']
-class KMeansModel(JavaModel, MLWritable, MLReadable):
+class KMeansModel(JavaModel, JavaMLWritable, JavaMLReadable):
     """
     Model fitted by KMeans.
@@ -48,7 +48,7 @@ class KMeansModel(JavaModel, MLWritable, MLReadable):
 @inherit_doc
 class KMeans(JavaEstimator, HasFeaturesCol, HasPredictionCol, HasMaxIter, HasTol, HasSeed,
-             MLWritable, MLReadable):
+             JavaMLWritable, JavaMLReadable):
     """
     K-means clustering with support for multiple parallel runs and a k-means++ like initialization
     mode (the k-means|| algorithm by Bahmani et al). When multiple concurrent runs are requested,