author Yanbo Liang <ybliang8@gmail.com> 2015-05-08 15:48:39 -0700
committer Joseph K. Bradley <joseph@databricks.com> 2015-05-08 15:48:39 -0700
commit 35c9599b94de759204ed33cdd46d8ee108bccd86
tree 4e2acabba806470d73370105a52682cd35ec0628 /mllib
parent 6dad76e5eba3c2925bfc9d142f31f7c2dc649886
[SPARK-5913] [MLLIB] Python API for ChiSqSelector
Add a Python API for mllib.feature.ChiSqSelector
https://issues.apache.org/jira/browse/SPARK-5913

Author: Yanbo Liang <ybliang8@gmail.com>

Closes #5939 from yanboliang/spark-5913 and squashes the following commits:

cdaac99 [Yanbo Liang] Python API for ChiSqSelector
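
For readers unfamiliar with what the selector does, here is a minimal, Spark-free sketch of chi-squared feature selection in plain Python. It is an illustration only and not the code from this commit: the helper names `chi_sq_statistic` and `select_top_features` are hypothetical, features are assumed to be categorical, and ranking by the raw statistic is an assumption that may differ from what MLlib does internally.

```python
from collections import Counter


def chi_sq_statistic(feature_values, labels):
    """Pearson chi-squared statistic of one categorical feature against the labels."""
    n = len(labels)
    observed = Counter(zip(feature_values, labels))
    feature_totals = Counter(feature_values)
    label_totals = Counter(labels)
    stat = 0.0
    for f, f_count in feature_totals.items():
        for l, l_count in label_totals.items():
            # Expected cell count if feature and label were independent.
            expected = f_count * l_count / n
            diff = observed.get((f, l), 0) - expected
            stat += diff * diff / expected
    return stat


def select_top_features(rows, labels, num_top_features):
    """Return the sorted indices of the features with the largest statistics."""
    num_features = len(rows[0])
    stats = [
        chi_sq_statistic([row[j] for row in rows], labels)
        for j in range(num_features)
    ]
    ranked = sorted(range(num_features), key=lambda j: stats[j], reverse=True)
    return sorted(ranked[:num_top_features])


# Hypothetical toy data: feature 0 mirrors the label, features 1 and 2 do not.
rows = [[0, 1, 0], [0, 0, 1], [1, 1, 0], [1, 0, 1]]
labels = [0, 0, 1, 1]
selected = select_top_features(rows, labels, 1)
```

In this toy data only feature 0 carries information about the label, so it is the one retained when `num_top_features` is 1.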
Diffstat (limited to 'mllib')
-rw-r--r-- mllib/src/main/scala/org/apache/spark/mllib/api/python/PythonMLLibAPI.scala 10
1 files changed, 10 insertions, 0 deletions
diff --git a/mllib/src/main/scala/org/apache/spark/mllib/api/python/PythonMLLibAPI.scala b/mllib/src/main/scala/org/apache/spark/mllib/api/python/PythonMLLibAPI.scala
index 426306d78c..8c30ad4b39 100644
--- a/mllib/src/main/scala/org/apache/spark/mllib/api/python/PythonMLLibAPI.scala
+++ b/mllib/src/main/scala/org/apache/spark/mllib/api/python/PythonMLLibAPI.scala
@@ -495,6 +495,16 @@ private[python] class PythonMLLibAPI extends Serializable {
   }
   /**
+   * Java stub for ChiSqSelector.fit(). This stub returns a
+   * handle to the Java object instead of the content of the Java object.
+   * Extra care needs to be taken in the Python code to ensure it gets freed on
+   * exit; see the Py4J documentation.
+   */
+  def fitChiSqSelector(numTopFeatures: Int, data: JavaRDD[LabeledPoint]): ChiSqSelectorModel = {
+    new ChiSqSelector(numTopFeatures).fit(data.rdd)
+  }
+
+  /**
    * Java stub for IDF.fit(). This stub returns a
    * handle to the Java object instead of the content of the Java object.
    * Extra care needs to be taken in the Python code to ensure it gets freed on