| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
|
|
|
|
|
| |
New user guide section ml-decision-tree.md, including code examples.
I have run all examples, including the Java ones.
CC: manishamde yanboliang mengxr
Author: Joseph K. Bradley <joseph@databricks.com>
Closes #8244 from jkbradley/ml-dt-docs.
|
|
|
|
|
|
|
|
|
| |
By using `StringIndexer`, we can obtain indexed label on new column. So a following estimator should use this new column through pipeline if it wants to use string indexed label.
I think it is better to make it explicit on documentation.
Author: lewuathe <lewuathe@me.com>
Closes #8205 from Lewuathe/SPARK-9977.
|
|
|
|
|
|
|
|
|
|
|
|
| |
`Lists.newArrayList` -> `Arrays.asList`
CC jkbradley feynmanliang
Anybody into replacing usages of `Lists.newArrayList` in the examples / source code too? this method isn't useful in Java 7 and beyond.
Author: Sean Owen <sowen@cloudera.com>
Closes #8272 from srowen/SPARK-10070.
|
|
|
|
|
|
|
|
| |
mengxr jkbradley
Author: Feynman Liang <fliang@databricks.com>
Closes #8184 from feynmanliang/SPARK-9889-DCT-docs.
|
|
|
|
|
|
|
|
|
|
| |
ml.feature.ElementwiseProduct
Add Python API, user guide and example for ml.feature.ElementwiseProduct.
Author: Yanbo Liang <ybliang8@gmail.com>
Closes #8061 from yanboliang/SPARK-9768.
|
|
|
|
|
|
|
|
|
|
| |
jira: https://issues.apache.org/jira/browse/SPARK-7583
User guide update for RegexTokenizer
Author: Yuhao Yang <hhbyyh@gmail.com>
Closes #7828 from hhbyyh/regexTokenizerDoc.
|
|
|
|
|
|
|
|
|
|
|
| |
Add ml.PCA user guide document and code examples for Scala/Java/Python.
Author: Yanbo Liang <ybliang8@gmail.com>
Closes #7522 from yanboliang/ml-pca-md and squashes the following commits:
60dec05 [Yanbo Liang] address comments
f992abe [Yanbo Liang] Add ml.PCA doc and examples
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add documentation for NGram feature transformer.
Author: Feynman Liang <fliang@databricks.com>
Closes #7244 from feynmanliang/SPARK-8457 and squashes the following commits:
5aface9 [Feynman Liang] Pretty print Scala output and add API doc to each codetab
60d5ac0 [Feynman Liang] Inline API doc and fix indentation
736ccbc [Feynman Liang] NGram feature transformer documentation
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This PR adds a Java unit test and user guide for `StringIndexer`. I put it before `OneHotEncoder` because they are closely related. jkbradley
Author: Xiangrui Meng <meng@databricks.com>
Closes #6561 from mengxr/SPARK-7582 and squashes the following commits:
4bba4f1 [Xiangrui Meng] fix example
ba1cd1b [Xiangrui Meng] fix style
7fa18d1 [Xiangrui Meng] add user guide for StringIndexer
136cb93 [Xiangrui Meng] add a Java unit test for StringIndexer
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This PR adds a section in the user guide for `VectorAssembler` with code examples in Python/Java/Scala. It also adds a unit test in Java.
jkbradley
Author: Xiangrui Meng <meng@databricks.com>
Closes #6556 from mengxr/SPARK-7584 and squashes the following commits:
11313f6 [Xiangrui Meng] simplify Java example
0cd47f3 [Xiangrui Meng] update user guide
fd36292 [Xiangrui Meng] update Java unit test
ce61ca0 [Xiangrui Meng] add Java unit test for VectorAssembler
e399942 [Xiangrui Meng] scala/python example code
|
|
|
|
|
|
|
|
|
| |
Author: Octavian Geagla <ogeagla@gmail.com>
Closes #6501 from ogeagla/ml-guide-elemwiseprod and squashes the following commits:
4ad93d5 [Octavian Geagla] [SPARK-7576] [MLLIB] Incorporate code review feedback.
f7be7ad [Octavian Geagla] [SPARK-7576] [MLLIB] Add spark.ml user guide doc/example for ElementwiseProduct.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
CC jkbradley
Author: Xusen Yin <yinxusen@gmail.com>
Closes #6451 from yinxusen/SPARK-7577 and squashes the following commits:
e2dc32e [Xusen Yin] rename colums
e350e49 [Xusen Yin] add all demos
006ddf1 [Xusen Yin] add java test
3238481 [Xusen Yin] add bucketizer
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Added user guide sections with code examples.
Also added small Java unit tests to test Java example in guide.
CC: mengxr
Author: Joseph K. Bradley <joseph@databricks.com>
Closes #6127 from jkbradley/feature-guide-2 and squashes the following commits:
cd47f4b [Joseph K. Bradley] Updated based on code review
f16bcec [Joseph K. Bradley] Fixed merge issues and update Python examples print calls for Python 3
0a862f9 [Joseph K. Bradley] Added Normalizer, StandardScaler to ml-features doc, plus small Java unit tests
a21c2d6 [Joseph K. Bradley] Updated ml-features.md with IDF
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Added VectorIndexer section to ML user guide. Also added javaCategoryMaps() method and Java unit test for it.
CC: mengxr
Author: Joseph K. Bradley <joseph@databricks.com>
Closes #6255 from jkbradley/vector-indexer-guide and squashes the following commits:
dbb8c4c [Joseph K. Bradley] simplified VectorIndexerModel.javaCategoryMaps
f692084 [Joseph K. Bradley] Added VectorIndexer section to ML user guide. Also added javaCategoryMaps() method and Java unit test for it.
|
|
|
|
|
|
|
|
| |
Author: Sandy Ryza <sandy@cloudera.com>
Closes #6126 from sryza/sandy-spark-7579 and squashes the following commits:
5af803d [Sandy Ryza] SPARK-7579 [MLLIB] User guide update for OneHotEncoder
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
CC jkbradley.
JIRA [issue](https://issues.apache.org/jira/browse/SPARK-7586).
Author: Xusen Yin <yinxusen@gmail.com>
Closes #6181 from yinxusen/SPARK-7586 and squashes the following commits:
77014c5 [Xusen Yin] comment fix
57a4c07 [Xusen Yin] small fix for docs
1178c8f [Xusen Yin] remove the correctness check in java suite
1c3f389 [Xusen Yin] delete sbt commit
1af152b [Xusen Yin] check python example code
1b5369e [Xusen Yin] add docs of word2vec
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
JIRA [here](https://issues.apache.org/jira/browse/SPARK-7581).
CC jkbradley
Author: Xusen Yin <yinxusen@gmail.com>
Closes #6113 from yinxusen/SPARK-7581 and squashes the following commits:
1a7d80d [Xusen Yin] merge with master
892a8e9 [Xusen Yin] fix python 3 compatibility
ec935bf [Xusen Yin] small fix
3e9fa1d [Xusen Yin] delete note
69fcf85 [Xusen Yin] simplify and add python example
81d21dc [Xusen Yin] add programming guide for Polynomial Expansion
40babfb [Xusen Yin] add java test suite for PolynomialExpansion
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Scala, Java and Python examples
JIRA: https://issues.apache.org/jira/browse/SPARK-7556
Author: Liang-Chi Hsieh <viirya@gmail.com>
Closes #6116 from viirya/binarizer_doc and squashes the following commits:
40cb677 [Liang-Chi Hsieh] Better print out.
5b7ef1d [Liang-Chi Hsieh] Make examples more clear.
1bf9c09 [Liang-Chi Hsieh] For comments.
6cf8cba [Liang-Chi Hsieh] Add user guide for Binarizer.
|
|
Added feature transformer subsection to spark.ml guide, with HashingTF and Tokenizer. Added JavaHashingTFSuite to test Java examples in new guide.
I've run Scala, Python examples in the Spark/PySpark shells. I ran the Java examples via the test suite (with small modifications for printing).
CC: mengxr
Author: Joseph K. Bradley <joseph@databricks.com>
Closes #6093 from jkbradley/hashingtf-guide and squashes the following commits:
d5d213f [Joseph K. Bradley] small fix
dd6e91a [Joseph K. Bradley] fixes from code review of user guide
33c3ff9 [Joseph K. Bradley] small fix
bc6058c [Joseph K. Bradley] fix link
361a174 [Joseph K. Bradley] Added subsection for feature transformers to spark.ml guide, with HashingTF and Tokenizer. Added JavaHashingTFSuite to test Java examples in new guide
|