[SPARK-11436] [SQL] rebind right encoder when join 2 datasets - spark


author	Wenchen Fan <wenchen@databricks.com>	2015-11-03 12:47:39 +0100
committer	Michael Armbrust <michael@databricks.com>	2015-11-03 12:47:39 +0100
commit	425ff03f5ac4f3ddda1ba06656e620d5426f4209 (patch)
tree	b4502d3db6c249c2d2b3de0b49f3c3afda66964a /yarn
parent	67e23b39ac3cdee06668fa9131951278b9731e29 (diff)

[SPARK-11436] [SQL] rebind right encoder when join 2 datasets

When we join 2 datasets, we combine the 2 encoders into a tupled one and use it as the encoder for the joined dataset. Assume both encoders are flat: their `constructExpression`s both reference the first element of the input row. However, once the 2 encoders are combined, the schema of the input row changes, and the right encoder should now reference the second element of the input row. So we should rebind the right encoder, to let it know the new schema of the input row, before combining it.

Author: Wenchen Fan <wenchen@databricks.com>

Closes #9391 from cloud-fan/join and squashes the following commits:

846d3ab [Wenchen Fan] rebind right encoder when join 2 datasets
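
The ordinal problem described in the commit message can be illustrated with a small toy model. This is only a sketch in plain Python, not Spark's encoder API: `FlatEncoder`, `rebind`, and `tuple_encoders` are hypothetical names chosen for illustration, and a row is modeled as a simple tuple.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FlatEncoder:
    """Hypothetical stand-in for a flat encoder.

    It builds its object by reading a single value at a fixed ordinal
    of the input row. A fresh flat encoder always reads ordinal 0.
    """
    ordinal: int = 0

    def construct(self, row):
        # Analogous in spirit to evaluating a construct expression on a row.
        return row[self.ordinal]

    def rebind(self, new_ordinal):
        # Return a copy that reads from a different position in the row.
        return FlatEncoder(new_ordinal)


def tuple_encoders(left, right):
    """Combine two encoders into one that builds a (left, right) pair."""
    def construct(row):
        return (left.construct(row), right.construct(row))
    return construct


left = FlatEncoder()
right = FlatEncoder()

# After a join, the input row holds the left value and then the right value.
joined_row = ("a", 1)

# Without rebinding, both encoders read ordinal 0 of the joined row.
broken = tuple_encoders(left, right)

# Rebinding the right encoder points it at ordinal 1 before combining.
fixed = tuple_encoders(left, right.rebind(1))

print(broken(joined_row))  # ('a', 'a')
print(fixed(joined_row))   # ('a', 1)
```

In this model the unrebound combination silently yields the left value twice, which is the kind of wrong result the rebinding step is meant to prevent.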

Diffstat (limited to 'yarn')

0 files changed, 0 insertions, 0 deletions

