[SPARK-15056][SQL] Parse Unsupported Sampling Syntax and Issue Better Exceptions - spark

author	gatorsmile <gatorsmile@gmail.com>	2016-05-03 23:20:18 +0200
committer	Herman van Hovell <hvanhovell@questtec.nl>	2016-05-03 23:20:18 +0200
commit	71296c041e59159bd7c5836cf652c02843974077 (patch)
tree	b57b74bac7083bdf1cb352840d9da609051a2e46 /docs/css
parent	2e2a6211c4391d67edb2a252f26647fb059bc18b (diff)
download	spark-71296c041e59159bd7c5836cf652c02843974077.tar.gz spark-71296c041e59159bd7c5836cf652c02843974077.tar.bz2 spark-71296c041e59159bd7c5836cf652c02843974077.zip

[SPARK-15056][SQL] Parse Unsupported Sampling Syntax and Issue Better Exceptions

#### What changes were proposed in this pull request?

Compared with the current Spark parser, Hive supports two extra sampling syntaxes:

- In `ON` clauses, `rand()` indicates sampling on the entire row instead of an individual column. For example,
```SQL
SELECT * FROM source TABLESAMPLE(BUCKET 3 OUT OF 32 ON rand()) s;
```
- Users can specify the total length to be read. For example,
```SQL
SELECT * FROM source TABLESAMPLE(100M) s;
```

Reference: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Sampling

This PR parses and captures these two extra syntaxes and issues a better error message for each.

#### How was this patch tested?

Added test cases to verify the thrown exceptions.

Author: gatorsmile <gatorsmile@gmail.com>

Closes #12838 from gatorsmile/bucketOnRand.
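
To illustrate the idea of recognizing the two Hive-only sampling forms and rejecting them with a targeted message, here is a minimal standalone sketch. It is not Spark's parser code: the regular expressions, the `UnsupportedSamplingError` class, the `check_sampling` function, the error wording, and the set of byte-length suffixes are all assumptions made for illustration.

```python
import re

# Hypothetical patterns for illustration only; NOT Spark's grammar rules.
# Form 1: TABLESAMPLE(BUCKET x OUT OF y ON rand())
_BUCKET_ON_RAND = re.compile(
    r"TABLESAMPLE\s*\(\s*BUCKET\s+\d+\s+OUT\s+OF\s+\d+\s+ON\s+rand\s*\(\s*\)\s*\)",
    re.IGNORECASE,
)
# Form 2: TABLESAMPLE(<number><unit>), e.g. TABLESAMPLE(100M).
# The suffix set B/K/M/G is an assumption.
_BYTE_LENGTH = re.compile(
    r"TABLESAMPLE\s*\(\s*\d+\s*[BKMG]\s*\)",
    re.IGNORECASE,
)


class UnsupportedSamplingError(ValueError):
    """Raised for a sampling form that is recognized but not supported."""


def check_sampling(sql: str) -> None:
    """Raise a specific error if sql uses one of the two unsupported forms."""
    if _BUCKET_ON_RAND.search(sql):
        raise UnsupportedSamplingError(
            "TABLESAMPLE(BUCKET x OUT OF y ON rand()) is not supported"
        )
    if _BYTE_LENGTH.search(sql):
        raise UnsupportedSamplingError(
            "TABLESAMPLE(<byte length>) is not supported"
        )
```

Calling `check_sampling` on either example query from the commit message raises `UnsupportedSamplingError`, while a query such as `SELECT * FROM source TABLESAMPLE(10 PERCENT) s` passes through without error. A real parser would capture these forms in its grammar instead of using regular expressions.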

Diffstat (limited to 'docs/css')

0 files changed, 0 insertions, 0 deletions

