author    Cheng Lian <lian@databricks.com>  2015-08-16 10:17:58 -0700
committer Yin Huai <yhuai@databricks.com>   2015-08-16 10:17:58 -0700
commit    ae2370e72f93db8a28b262e8252c55fe1fc9873c (patch)
tree      b3bf8b6699430bfd4f0b2ecef0103d40bf1d3f76 /R/DOCUMENTATION.md
parent    cf016075a006034c24c5b758edb279f3e151d25d (diff)
[SPARK-10005] [SQL] Fixes schema merging for nested structs
When merging schemas, we only handled first-level fields while converting Parquet groups to `InternalRow`s; nested struct fields were not handled properly.
For example, the schema of a Parquet file to be read can be:
```
message individual {
  required group f1 {
    optional binary f11 (utf8);
  }
}
```
while the global schema is:
```
message global {
  required group f1 {
    optional binary f11 (utf8);
    optional int32 f12;
  }
}
```
This PR fixes this issue by padding missing fields when creating actual converters.
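The padding idea can be illustrated with a minimal sketch. This is not Spark's actual converter code (which operates on Catalyst `StructType`s and Parquet record converters); it is a hypothetical Python model where struct types are dicts mapping field names to types, and a row laid out per the global schema is produced by recursing into nested structs and filling fields absent from the file schema with nulls:

```python
# Hypothetical sketch of the fix: fields present in the global (merged)
# schema but missing from the physical Parquet file schema are padded
# with None, recursively for nested structs.

def pad_missing_fields(file_schema, global_schema, row):
    """Lay out `row` according to `global_schema`, padding fields that
    the file schema lacks with None (recursing into nested structs)."""
    padded = []
    for name, dtype in global_schema.items():
        if name not in file_schema:
            padded.append(None)  # field missing from this file: pad
        elif isinstance(dtype, dict):  # nested struct: recurse
            padded.append(
                pad_missing_fields(file_schema[name], dtype, row[name]))
        else:
            padded.append(row[name])
    return padded

# Mirrors the schemas above: the file only has f1.f11, while the
# global schema also expects f1.f12.
file_schema = {"f1": {"f11": "string"}}
global_schema = {"f1": {"f11": "string", "f12": "int"}}
row = {"f1": {"f11": "hello"}}
print(pad_missing_fields(file_schema, global_schema, row))
# → [['hello', None]]
```

Before the fix, only the top level got this treatment, so the nested `f12` slot was never padded; the sketch shows why the recursion into `f1` is the essential part.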
Author: Cheng Lian <lian@databricks.com>
Closes #8228 from liancheng/spark-10005/nested-schema-merging.
Diffstat (limited to 'R/DOCUMENTATION.md')
0 files changed, 0 insertions, 0 deletions