author | Herman van Hovell <hvanhovell@databricks.com> | 2016-09-07 00:44:07 +0200 |
---|---|---|
committer | Herman van Hovell <hvanhovell@databricks.com> | 2016-09-07 00:44:07 +0200 |
commit | 4f769b903bc9822c262f0a15f5933cc05c67923f (patch) | |
tree | 89e4e98fc53f256e1f8064e05041e4e7e7c402ec /dev/.gitignore | |
parent | 29cfab3f1524c5690be675d24dda0a9a1806d6ff (diff) | |
download | spark-4f769b903bc9822c262f0a15f5933cc05c67923f.tar.gz spark-4f769b903bc9822c262f0a15f5933cc05c67923f.tar.bz2 spark-4f769b903bc9822c262f0a15f5933cc05c67923f.zip |
[SPARK-17296][SQL] Simplify parser join processing.
## What changes were proposed in this pull request?
Join processing in the parser relies on the assumption that the grammar produces right-nested trees. For instance, the parse tree for `select * from a join b join c` is expected to look similar to `JOIN(a, JOIN(b, c))`. However, there are cases in which this invariant is violated, such as:
```sql
SELECT COUNT(1)
FROM test T1
CROSS JOIN test T2
JOIN test T3
ON T3.col = T1.col
JOIN test T4
ON T4.col = T1.col
```
In this case the parser returns a tree in which Joins are located on both the left and the right sides of the parent join node.
This PR introduces a different grammar rule that does not rely on this assumption. The new rule takes a relation and then matches zero or more joined relations. As a bonus, processing the result is much simpler.
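To illustrate why a "relation followed by zero or more joined relations" shape is easy to process, here is a minimal Python sketch. It is a toy model, not the Spark parser code: the `Table`, `Join`, `build_joins`, and `render` names are hypothetical, and combining the joined relations with a left fold is an assumption made for this illustration.

```python
from dataclasses import dataclass
from functools import reduce
from typing import List, Optional, Tuple, Union


@dataclass(frozen=True)
class Table:
    """A leaf relation (hypothetical stand-in for a table reference)."""
    name: str


@dataclass(frozen=True)
class Join:
    """A binary join node (hypothetical stand-in for a logical join)."""
    left: "Plan"
    right: "Plan"
    join_type: str
    condition: Optional[str] = None


Plan = Union[Table, Join]

# One joined relation: (join type, right-hand relation, optional ON condition).
JoinedRelation = Tuple[str, Table, Optional[str]]


def build_joins(first: Plan, joined: List[JoinedRelation]) -> Plan:
    """Fold the joined relations onto the first relation, left to right.

    With an empty list the first relation is returned unchanged, which
    covers the "zero joined relations" case.
    """
    def step(left: Plan, item: JoinedRelation) -> Plan:
        join_type, right, condition = item
        return Join(left, right, join_type, condition)

    return reduce(step, joined, first)


def render(plan: Plan) -> str:
    """Print the tree shape only, ignoring join types and conditions."""
    if isinstance(plan, Table):
        return plan.name
    return f"JOIN({render(plan.left)}, {render(plan.right)})"


# The query from the description, expressed as a relation plus joined relations.
plan = build_joins(
    Table("T1"),
    [
        ("cross", Table("T2"), None),
        ("inner", Table("T3"), "T3.col = T1.col"),
        ("inner", Table("T4"), "T4.col = T1.col"),
    ],
)
print(render(plan))  # JOIN(JOIN(JOIN(T1, T2), T3), T4)
```

In this sketch the builder never inspects the shape of an existing subtree, so it makes no assumption about how the joins were nested by the grammar.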
## How was this patch tested?
Existing tests, plus a new regression test added to the plan parser suite.
Author: Herman van Hovell <hvanhovell@databricks.com>
Closes #14867 from hvanhovell/SPARK-17296.
Diffstat (limited to 'dev/.gitignore')
0 files changed, 0 insertions, 0 deletions