[SPARK-4202][SQL] Simple DSL support for Scala UDF - spark

author	Cheng Lian <lian@databricks.com>	2014-11-03 13:20:33 -0800
committer	Michael Armbrust <michael@databricks.com>	2014-11-03 13:20:33 -0800
commit	c238fb423d1011bd1b1e6201d769b72e52664fc6 (patch)
tree	a1d4de68b51efcd5f0d0c29c7732545f45edee96 /docs
parent	24544fbce05665ab4999a1fe5aac434d29cd912c (diff)

[SPARK-4202][SQL] Simple DSL support for Scala UDF

This feature is based on an offline discussion with mengxr and will hopefully be useful for the new MLlib pipeline API.

For the following test snippet

```scala
case class KeyValue(key: Int, value: String)
val testData = sc.parallelize(1 to 10).map(i => KeyValue(i, i.toString)).toSchemaRDD
val foo = (a: Int, b: String) => a.toString + b
```

the newly introduced DSL enables the following syntax

```scala
import org.apache.spark.sql.catalyst.dsl._
testData.select(Star(None), foo.call('key, 'value) as 'result)
```

which is equivalent to

```scala
testData.registerTempTable("testData")
sqlContext.registerFunction("foo", foo)
sql("SELECT *, foo(key, value) AS result FROM testData")
```

Author: Cheng Lian <lian@databricks.com>

Closes #3067 from liancheng/udf-dsl and squashes the following commits:

f132818 [Cheng Lian] Adds DSL support for Scala UDF
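
For readers wondering how a plain Scala function can gain a `call` method, one common technique is an implicit class that wraps the function and builds an expression node. The sketch below is a toy model of that idea only: `Expr`, `Attr`, `UdfCall`, `Function2Udf` and `eval` are hypothetical names invented for illustration and are not Spark's Catalyst classes, and the actual implementation in this commit may differ.

```scala
// Toy model: none of these types are Spark's real classes.
object UdfDslSketch {
  sealed trait Expr
  // A reference to a named column of the current row.
  case class Attr(name: String) extends Expr
  // A deferred call of a Scala function on the values of other expressions.
  case class UdfCall(f: Seq[Any] => Any, args: Seq[Expr]) extends Expr

  // Adds a `call` method to any two-argument Scala function value.
  implicit class Function2Udf[A, B, R](f: (A, B) => R) {
    def call(a: Expr, b: Expr): UdfCall =
      UdfCall(xs => f(xs(0).asInstanceOf[A], xs(1).asInstanceOf[B]), Seq(a, b))
  }

  // Evaluates an expression against one row, represented here as a Map.
  def eval(e: Expr, row: Map[String, Any]): Any = e match {
    case Attr(n)          => row(n)
    case UdfCall(f, args) => f(args.map(eval(_, row)))
  }
}

object UdfDslDemo extends App {
  import UdfDslSketch._

  val foo = (a: Int, b: String) => a.toString + b
  val expr = foo.call(Attr("key"), Attr("value"))
  println(eval(expr, Map("key" -> 1, "value" -> "1"))) // prints 11
}
```

Note that `foo` has to be a function value (`val foo = ...`) for the implicit conversion to apply; a method defined with `def` would need to be converted to a function first.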

Diffstat (limited to 'docs')

0 files changed, 0 insertions, 0 deletions

