author | gatorsmile <gatorsmile@gmail.com> | 2016-03-16 13:11:11 -0700 |
---|---|---|
committer | Yin Huai <yhuai@databricks.com> | 2016-03-16 13:11:11 -0700 |
commit | c4bd57602c0b14188d364bb475631bf473d25082 | |
tree | d5c081e53719b8305f1fcb0061b2454462fb3d25 /python/pylintrc | |
parent | 1d1de28a3c3c3a4bc37dc7565b9178a712df493a | |
[SPARK-12721][SQL] SQL Generation for Script Transformation
#### What changes were proposed in this pull request?
This PR adds SQL generation for analyzed logical plans that contain the `ScriptTransformation` operator.
For example, the following SQL uses `TRANSFORM`:
```
SELECT TRANSFORM (a, b, c, d) USING 'cat' FROM parquet_t2
```
Its analyzed logical plan looks like this:
```
ScriptTransformation [a#210L,b#211L,c#212L,d#213L], cat, [key#208,value#209], HiveScriptIOSchema(List(),List(),Some(org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe),Some(org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe),List((field.delim, )),List((field.delim, )),Some(org.apache.hadoop.hive.ql.exec.TextRecordReader),Some(org.apache.hadoop.hive.ql.exec.TextRecordWriter),true)
+- SubqueryAlias parquet_t2
   +- Relation[a#210L,b#211L,c#212L,d#213L] ParquetRelation
```
The generated SQL looks like this:
```
SELECT TRANSFORM (`parquet_t2`.`a`, `parquet_t2`.`b`, `parquet_t2`.`c`, `parquet_t2`.`d`) USING 'cat' AS (`key` string, `value` string) FROM `default`.`parquet_t2`
```
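To illustrate the shape of the generated statement, here is a minimal, standalone Python sketch that assembles the same string from its parts. It is not the code in this patch (the patch works on Catalyst logical plans in Scala). The helper names `quote` and `transform_to_sql` are hypothetical, and the escaping rules shown are assumptions made for illustration.

```python
def quote(identifier: str) -> str:
    """Backtick-quote an identifier (assumption: embedded backticks are doubled)."""
    return "`" + identifier.replace("`", "``") + "`"


def transform_to_sql(database, table, inputs, script, outputs):
    """Assemble a SELECT TRANSFORM statement (hypothetical helper).

    inputs  -- input column names; each is qualified with the table name
    outputs -- (name, type) pairs describing the script's output schema
    """
    input_list = ", ".join(f"{quote(table)}.{quote(col)}" for col in inputs)
    output_list = ", ".join(f"{quote(name)} {dtype}" for name, dtype in outputs)
    # Assumption: single quotes inside the script are backslash-escaped.
    escaped_script = script.replace("'", "\\'")
    return (
        f"SELECT TRANSFORM ({input_list}) USING '{escaped_script}' "
        f"AS ({output_list}) FROM {quote(database)}.{quote(table)}"
    )


sql = transform_to_sql(
    database="default",
    table="parquet_t2",
    inputs=["a", "b", "c", "d"],
    script="cat",
    outputs=[("key", "string"), ("value", "string")],
)
print(sql)
```

With the inputs above, the sketch prints the same statement as the generated SQL in this example: input columns are qualified and quoted, the output schema is spelled out in the `AS` clause, and the table is qualified with its database.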
#### How was this patch tested?
Seven test cases are added to `LogicalPlanToSQLSuite`.
Author: gatorsmile <gatorsmile@gmail.com>
Author: xiaoli <lixiao1983@gmail.com>
Author: Xiao Li <xiaoli@Xiaos-MacBook-Pro.local>
Closes #11503 from gatorsmile/transformToSQL.