path: root/docs/ml-features.md
author: chie8842 <hayashidac@nttdata.co.jp> 2016-11-08 13:45:37 +0000
committer: Sean Owen <sowen@cloudera.com> 2016-11-08 13:45:37 +0000
commit: ee2e741ac16b01d9cae0eadd35af774547bbd415
tree: 792d6d1460e93d7ab1e991d5df355df7c43c6819 /docs/ml-features.md
parent: c291bd2745a8a2e4ba91d8697879eb8da10287e2
[SPARK-13770][DOCUMENTATION][ML] Document the ML feature Interaction
I created Scala and Java examples and added documentation.

Author: chie8842 <hayashidac@nttdata.co.jp>

Closes #15658 from hayashidac/SPARK-13770.
Diffstat (limited to 'docs/ml-features.md')
-rw-r--r-- docs/ml-features.md | 52
1 files changed, 52 insertions, 0 deletions
diff --git a/docs/ml-features.md b/docs/ml-features.md
index 352887d3ba..903177210d 100644
--- a/docs/ml-features.md
+++ b/docs/ml-features.md
@@ -729,6 +729,58 @@ for more details on the API.
</div>
</div>
+## Interaction
+
+`Interaction` is a `Transformer` which takes vector or double-valued columns, and generates a single vector column that contains the product of all combinations of one value from each input column.
+
+For example, if the input columns are two vector columns with 3 dimensions each, the output column is a 9-dimensional vector.
+
+**Examples**
+
+Assume that we have the following DataFrame with the columns "id1", "vec1", and "vec2":
+
+~~~~
+ id1|vec1 |vec2
+ ---|--------------|--------------
+ 1 |[1.0,2.0,3.0] |[8.0,4.0,5.0]
+ 2 |[4.0,3.0,8.0] |[7.0,9.0,8.0]
+ 3 |[6.0,1.0,9.0] |[2.0,3.0,6.0]
+ 4 |[10.0,8.0,6.0]|[9.0,4.0,5.0]
+ 5 |[9.0,2.0,7.0] |[10.0,7.0,3.0]
+ 6 |[1.0,1.0,4.0] |[2.0,8.0,4.0]
+~~~~
+
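+Applying `Interaction` with `id1`, `vec1`, and `vec2` as the input columns produces the output column `interactedCol`,
+where each entry is the product of `id1`, one element of `vec1`, and one element of `vec2`: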
+
+~~~~
+ id1|vec1 |vec2 |interactedCol
+ ---|--------------|--------------|------------------------------------------------------
+ 1 |[1.0,2.0,3.0] |[8.0,4.0,5.0] |[8.0,4.0,5.0,16.0,8.0,10.0,24.0,12.0,15.0]
+ 2 |[4.0,3.0,8.0] |[7.0,9.0,8.0] |[56.0,72.0,64.0,42.0,54.0,48.0,112.0,144.0,128.0]
+ 3 |[6.0,1.0,9.0] |[2.0,3.0,6.0] |[36.0,54.0,108.0,6.0,9.0,18.0,54.0,81.0,162.0]
+ 4 |[10.0,8.0,6.0]|[9.0,4.0,5.0] |[360.0,160.0,200.0,288.0,128.0,160.0,216.0,96.0,120.0]
+ 5 |[9.0,2.0,7.0] |[10.0,7.0,3.0]|[450.0,315.0,135.0,100.0,70.0,30.0,350.0,245.0,105.0]
+ 6 |[1.0,1.0,4.0] |[2.0,8.0,4.0] |[12.0,48.0,24.0,12.0,48.0,24.0,48.0,192.0,96.0]
+~~~~
+
+<div class="codetabs">
+<div data-lang="scala" markdown="1">
+
+Refer to the [Interaction Scala docs](api/scala/index.html#org.apache.spark.ml.feature.Interaction)
+for more details on the API.
+
+{% include_example scala/org/apache/spark/examples/ml/InteractionExample.scala %}
+</div>
+
+<div data-lang="java" markdown="1">
+
+Refer to the [Interaction Java docs](api/java/org/apache/spark/ml/feature/Interaction.html)
+for more details on the API.
+
+{% include_example java/org/apache/spark/examples/ml/JavaInteractionExample.java %}
+</div>
+</div>
## Normalizer
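
Note on the `Interaction` example table in this patch: the `interactedCol` values only match if `id1` is one of the inputs, acting as a scalar multiplier on every product of a `vec1` element and a `vec2` element. The sketch below reproduces that arithmetic in plain Python so the table can be checked by hand. It is not Spark's implementation. The helper name `interact` is hypothetical, and the sketch assumes plain numeric inputs, with a scalar treated as a 1-element vector and the last input varying fastest.

```python
from itertools import product
from math import prod  # requires Python 3.8+


def interact(*columns):
    """Return the product of every combination of one value per input.

    Hypothetical helper for illustration only: scalars are wrapped as
    1-element vectors, and the last input varies fastest in the output.
    """
    as_vectors = [c if isinstance(c, (list, tuple)) else [c] for c in columns]
    return [float(prod(combo)) for combo in product(*as_vectors)]


# Rows 1 and 2 of the example table: (id1, vec1, vec2)
row1 = interact(1, [1.0, 2.0, 3.0], [8.0, 4.0, 5.0])
row2 = interact(2, [4.0, 3.0, 8.0], [7.0, 9.0, 8.0])
print(row1)
print(row2)
```

Both rows come out equal to the corresponding `interactedCol` entries in the table, which suggests the documented output is `id1` interacted with `vec1` and `vec2`, not `vec1` and `vec2` alone.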