author | Bryan Cutler <cutlerb@gmail.com> | 2016-07-14 09:12:46 +0100 |
---|---|---|
committer | Sean Owen <sowen@cloudera.com> | 2016-07-14 09:12:46 +0100 |
commit | e3f8a033679261aaee15bda0f970a1890411e743 (patch) | |
tree | fecc6121b1d5357c2214f710018de2a9ddea2786 /examples/src/main/python/ml/vector_assembler_example.py | |
parent | 252d4f27f23b547777892bcea25a2cea62d8cbab (diff) | |
[SPARK-16403][EXAMPLES] Cleanup to remove unused imports, consistent style, minor fixes
## What changes were proposed in this pull request?
Cleanup of examples, mostly from PySpark-ML, to fix minor issues: unused imports, style consistency, pipeline_example is a duplicate, use future print function, and a spelling error.
* The "Pipeline Example" is duplicated by "Simple Text Classification Pipeline" in Scala, Python, and Java.
* "Estimator Transformer Param Example" is duplicated by "Simple Params Example" in Scala, Python and Java
* Synced random_forest_classifier_example.py with Scala by adding IndexToString label converter
* Synced train_validation_split.py (in Scala ModelSelectionViaTrainValidationExample) by adjusting data split, adding grid for intercept.
* RegexTokenizer was doing nothing in tokenizer_example.py and JavaTokenizerExample.java; synced both with the Scala version
## How was this patch tested?
Ran local tests and the modified examples.
Author: Bryan Cutler <cutlerb@gmail.com>
Closes #14081 from BryanCutler/examples-cleanup-SPARK-16403.
Diffstat (limited to 'examples/src/main/python/ml/vector_assembler_example.py')
-rw-r--r-- | examples/src/main/python/ml/vector_assembler_example.py | 2 |
1 files changed, 2 insertions, 0 deletions
diff --git a/examples/src/main/python/ml/vector_assembler_example.py b/examples/src/main/python/ml/vector_assembler_example.py
index bbfc316ff2..eac33711ad 100644
--- a/examples/src/main/python/ml/vector_assembler_example.py
+++ b/examples/src/main/python/ml/vector_assembler_example.py
@@ -33,9 +33,11 @@ if __name__ == "__main__":
     dataset = spark.createDataFrame(
         [(0, 18, 1.0, Vectors.dense([0.0, 10.0, 0.5]), 1.0)],
         ["id", "hour", "mobile", "userFeatures", "clicked"])
+
     assembler = VectorAssembler(
         inputCols=["hour", "mobile", "userFeatures"],
         outputCol="features")
+
     output = assembler.transform(dataset)
     print(output.select("features", "clicked").first())
     # $example off$