author Xiangrui Meng <meng@databricks.com> 2015-05-21 18:04:45 -0700
committer Joseph K. Bradley <joseph@databricks.com> 2015-05-21 18:04:45 -0700
commit 85b96372cf0fd055f89fc639f45c1f2cb02a378f
tree efdc362523217e9c8e3da9e4c2ba1743ad44d094 /yarn
parent f5db4b416c922db7a8f1b0c098b4f08647106231
[SPARK-7219] [MLLIB] Output feature attributes in HashingTF
This PR updates `HashingTF` to output ML attributes that tell the number of features in the output column. We need to expand `UnaryTransformer` to support output metadata. A `def outputMetadata: Metadata` is not sufficient because the metadata may also depend on the input data. Though this is not true for `HashingTF`, I think it is reasonable to update `UnaryTransformer` in a separate PR.

`checkParams` is added to verify common requirements for params. I will send a separate PR to use it in other test suites.

jkbradley

Author: Xiangrui Meng <meng@databricks.com>

Closes #6308 from mengxr/SPARK-7219 and squashes the following commits:

9bd2922 [Xiangrui Meng] address comments
e82a68a [Xiangrui Meng] remove sqlContext from test suite
995535b [Xiangrui Meng] Merge remote-tracking branch 'apache/master' into SPARK-7219
2194703 [Xiangrui Meng] add test for attributes
178ae23 [Xiangrui Meng] update HashingTF with tests
91a6106 [Xiangrui Meng] WIP
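
For readers unfamiliar with the idea, the following is a minimal, library-free sketch of what "outputting the number of features as column metadata" can look like. It is not the Spark implementation: the function names `hashing_tf` and `output_metadata`, the metadata layout, and the use of `zlib.crc32` as the hash are all assumptions made for illustration only.

```python
import zlib
from collections import Counter


def hashing_tf(terms, num_features=1 << 18):
    """Map a list of terms to a sparse term-frequency vector (index -> count).

    Hypothetical helper: crc32 stands in for a stable hash function and is
    not the hash that Spark's HashingTF uses.
    """
    counts = Counter()
    for term in terms:
        # Hashing trick: the bucket index is the hash modulo the vector size.
        index = zlib.crc32(term.encode("utf-8")) % num_features
        counts[index] += 1
    return dict(counts)


def output_metadata(num_features):
    """Describe the output column so downstream stages know its size.

    Hypothetical layout: the metadata depends only on the parameter,
    not on the input data, which is the case the commit message notes
    for HashingTF.
    """
    return {"num_features": num_features}


vector = hashing_tf(["spark", "ml", "spark"], num_features=16)
metadata = output_metadata(16)
```

In this sketch a downstream consumer can read `metadata["num_features"]` instead of scanning the data to discover the vector size, which is the benefit the commit message describes.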
Diffstat (limited to 'yarn')
0 files changed, 0 insertions, 0 deletions