author | chie8842 <hayashidac@nttdata.co.jp> | 2016-11-08 13:45:37 +0000 |
---|---|---|
committer | Sean Owen <sowen@cloudera.com> | 2016-11-08 13:45:37 +0000 |
commit | ee2e741ac16b01d9cae0eadd35af774547bbd415 (patch) | |
tree | 792d6d1460e93d7ab1e991d5df355df7c43c6819 /docs/ml-features.md | |
parent | c291bd2745a8a2e4ba91d8697879eb8da10287e2 (diff) | |
[SPARK-13770][DOCUMENTATION][ML] Document the ML feature Interaction
I created Scala and Java examples and added documentation.
Author: chie8842 <hayashidac@nttdata.co.jp>
Closes #15658 from hayashidac/SPARK-13770.
Diffstat (limited to 'docs/ml-features.md')
-rw-r--r-- | docs/ml-features.md | 52 |
1 files changed, 52 insertions, 0 deletions
diff --git a/docs/ml-features.md b/docs/ml-features.md
index 352887d3ba..903177210d 100644
--- a/docs/ml-features.md
+++ b/docs/ml-features.md
@@ -729,6 +729,58 @@ for more details on the API.
 </div>
 </div>
 
+## Interaction
+
+`Interaction` is a `Transformer` which takes vector or double-valued columns, and generates a single vector column that contains the product of all combinations of one value from each input column.
+
+For example, if you have 2 vector type columns each of which has 3 dimensions as input columns, then then you'll get a 9-dimensional vector as the output column.
+
+**Examples**
+
+Assume that we have the following DataFrame with the columns "id1", "vec1", and "vec2":
+
+~~~~
+  id1|vec1          |vec2
+  ---|--------------|--------------
+  1  |[1.0,2.0,3.0] |[8.0,4.0,5.0]
+  2  |[4.0,3.0,8.0] |[7.0,9.0,8.0]
+  3  |[6.0,1.0,9.0] |[2.0,3.0,6.0]
+  4  |[10.0,8.0,6.0]|[9.0,4.0,5.0]
+  5  |[9.0,2.0,7.0] |[10.0,7.0,3.0]
+  6  |[1.0,1.0,4.0] |[2.0,8.0,4.0]
+~~~~
+
+Applying `Interaction` with those input columns,
+then `interactedCol` as the output column contains:
+
+~~~~
+  id1|vec1          |vec2          |interactedCol
+  ---|--------------|--------------|------------------------------------------------------
+  1  |[1.0,2.0,3.0] |[8.0,4.0,5.0] |[8.0,4.0,5.0,16.0,8.0,10.0,24.0,12.0,15.0]
+  2  |[4.0,3.0,8.0] |[7.0,9.0,8.0] |[56.0,72.0,64.0,42.0,54.0,48.0,112.0,144.0,128.0]
+  3  |[6.0,1.0,9.0] |[2.0,3.0,6.0] |[36.0,54.0,108.0,6.0,9.0,18.0,54.0,81.0,162.0]
+  4  |[10.0,8.0,6.0]|[9.0,4.0,5.0] |[360.0,160.0,200.0,288.0,128.0,160.0,216.0,96.0,120.0]
+  5  |[9.0,2.0,7.0] |[10.0,7.0,3.0]|[450.0,315.0,135.0,100.0,70.0,30.0,350.0,245.0,105.0]
+  6  |[1.0,1.0,4.0] |[2.0,8.0,4.0] |[12.0,48.0,24.0,12.0,48.0,24.0,48.0,192.0,96.0]
+~~~~
+
+<div class="codetabs">
+<div data-lang="scala" markdown="1">
+
+Refer to the [Interaction Scala docs](api/scala/index.html#org.apache.spark.ml.feature.Interaction)
+for more details on the API.
+
+{% include_example scala/org/apache/spark/examples/ml/InteractionExample.scala %}
+</div>
+
+<div data-lang="java" markdown="1">
+
+Refer to the [Interaction Java docs](api/java/org/apache/spark/ml/feature/Interaction.html)
+for more details on the API.
+
+{% include_example java/org/apache/spark/examples/ml/JavaInteractionExample.java %}
+</div>
+</div>
 
 ## Normalizer
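
Note on the example table in this patch: the `interactedCol` values are consistent with all three columns ("id1", "vec1", "vec2") being used as inputs, so each row's 9 products are also scaled by `id1`. The following plain-Java sketch reproduces that arithmetic for a single row. It does not use Spark and is not the `Interaction` implementation; the class `InteractionSketch` and the method `interact` are hypothetical names introduced only for illustration, and the output ordering (earlier columns vary slowest) is an assumption chosen because it matches the rows shown in the table.

```java
import java.util.Arrays;

// Plain-Java sketch of the products shown in the documentation table.
// Not Spark code: InteractionSketch and interact are hypothetical names.
class InteractionSketch {

    // Each element of `columns` holds one input column's value for a single row.
    // A double-valued column is passed as a one-element array.
    static double[] interact(double[][] columns) {
        double[] result = {1.0};
        for (double[] column : columns) {
            double[] next = new double[result.length * column.length];
            int k = 0;
            // Assumed ordering: earlier columns vary slowest, later ones fastest.
            for (double left : result) {
                for (double right : column) {
                    next[k++] = left * right;
                }
            }
            result = next;
        }
        return result;
    }

    public static void main(String[] args) {
        // Row with id1 = 2 from the example table: id1, vec1, vec2.
        double[][] row = {{2.0}, {4.0, 3.0, 8.0}, {7.0, 9.0, 8.0}};
        System.out.println(Arrays.toString(interact(row)));
        // prints [56.0, 72.0, 64.0, 42.0, 54.0, 48.0, 112.0, 144.0, 128.0]
    }
}
```

Passing only the two vectors, without the leading `{2.0}`, yields the unscaled 9-dimensional product described in the prose (for this row, starting with 28.0), which is why the table values differ from a plain `vec1` by `vec2` product for every row except `id1 = 1`.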