[SPARK-3036][SPARK-3037][SQL] Add MapType/ArrayType containing null value support to Parquet. - spark


author	Takuya UESHIN <ueshin@happy-camper.st>	2014-08-26 18:28:41 -0700
committer	Michael Armbrust <michael@databricks.com>	2014-08-26 18:28:41 -0700
commit	727cb25bcc29481d6b744abef1ca091e64b5f91f (patch)
tree	4edc54a23a8f4581e931ced97ddda2a7a5da4085 /docs/running-on-mesos.md
parent	73b3089b8d2901dab11bb1ef6f46c29625b677fe (diff)

[SPARK-3036][SPARK-3037][SQL] Add MapType/ArrayType containing null value support to Parquet.

JIRA:
- https://issues.apache.org/jira/browse/SPARK-3036
- https://issues.apache.org/jira/browse/SPARK-3037

Currently this uses the following Parquet schema for `MapType` when `valueContainsNull` is `true`:

```
message root {
  optional group a (MAP) {
    repeated group map (MAP_KEY_VALUE) {
      required int32 key;
      optional int32 value;
    }
  }
}
```

and for `ArrayType` when `containsNull` is `true`:

```
message root {
  optional group a (LIST) {
    repeated group bag {
      optional int32 array;
    }
  }
}
```

We have to think about compatibility with older versions of Spark, Hive, and the other systems I mentioned in the JIRA issues.

Notice: This PR is based on #1963 and #1889. Please check them first.

/cc marmbrus, yhuai

Author: Takuya UESHIN <ueshin@happy-camper.st>

Closes #2032 from ueshin/issues/SPARK-3036_3037 and squashes the following commits:

4e8e9e7 [Takuya UESHIN] Add ArrayType containing null value support to Parquet.
013c2ca [Takuya UESHIN] Add MapType containing null value support to Parquet.
62989de [Takuya UESHIN] Merge branch 'issues/SPARK-2969' into issues/SPARK-3036_3037
8e38b53 [Takuya UESHIN] Merge branch 'issues/SPARK-3063' into issues/SPARK-3036_3037
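
As a reading aid, the two schemas quoted in the commit message can be rendered by a small Python sketch. This is not Spark's conversion code: the function names and parameters are hypothetical, and only the null-containing cases from the message are covered, because the message does not say what schema is used when `valueContainsNull` or `containsNull` is `false`.

```python
# Illustrative sketch only: hypothetical helpers that print the Parquet
# schemas quoted in the commit message. They do not call Spark or Parquet.


def map_schema_with_nullable_values(field, key_type="int32", value_type="int32"):
    """Schema for a map field whose values may be null (valueContainsNull = true)."""
    return "\n".join([
        "message root {",
        f"  optional group {field} (MAP) {{",
        "    repeated group map (MAP_KEY_VALUE) {",
        f"      required {key_type} key;",      # keys are never null
        f"      optional {value_type} value;",  # 'optional' lets a value be null
        "    }",
        "  }",
        "}",
    ])


def array_schema_with_nullable_elements(field, element_type="int32"):
    """Schema for an array field whose elements may be null (containsNull = true)."""
    return "\n".join([
        "message root {",
        f"  optional group {field} (LIST) {{",
        "    repeated group bag {",
        f"      optional {element_type} array;",  # 'optional' lets an element be null
        "    }",
        "  }",
        "}",
    ])


if __name__ == "__main__":
    print(map_schema_with_nullable_values("a"))
    print(array_schema_with_nullable_elements("a"))
```

In both layouts the nullable part is the innermost `optional` field: the `value` field inside the repeated `map` group, and the `array` field inside the repeated `bag` group.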

Diffstat (limited to 'docs/running-on-mesos.md')

0 files changed, 0 insertions, 0 deletions

