author | Yin Huai <huai@cse.ohio-state.edu> | 2014-08-14 10:46:33 -0700 |
---|---|---|
committer | Michael Armbrust <michael@databricks.com> | 2014-08-14 10:46:33 -0700 |
commit | add75d4831fdc35712bf8b737574ea0bc677c37c (patch) | |
tree | 467098512fb3d5c7c8a5c5fc83d7821bc884846b /docs/img/spark-logo-77x50px-hd.png | |
parent | 078f3fbda860e2f5de34153c55dfc3fecb4256e9 (diff) | |
[SPARK-2927][SQL] Add a conf to configure if we always read Binary columns stored in Parquet as String columns
This PR adds a new conf flag, `spark.sql.parquet.binaryAsString`. When it is `true` and no Parquet metadata file is available to provide the schema of the data, binary fields stored in Parquet are always treated as string fields. The flag provides a way to read string fields that were written without the UTF8 annotation.
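The rule described above can be illustrated with a minimal, self-contained sketch. It is not the code from this PR: `ParquetField`, `resolve_type`, and the `binary_as_string` parameter are hypothetical names chosen for illustration, the `False` default is an assumption, and the "no Parquet metadata file available" precondition is not modelled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ParquetField:
    """Hypothetical stand-in for one column of a Parquet schema."""
    name: str
    primitive_type: str               # e.g. "BINARY", "INT32"
    annotation: Optional[str] = None  # e.g. "UTF8", or None if not annotated


def resolve_type(field: ParquetField, binary_as_string: bool = False) -> str:
    """Pick the SQL-side type name for a Parquet field.

    `binary_as_string` plays the role of spark.sql.parquet.binaryAsString.
    """
    if field.primitive_type != "BINARY":
        # Non-binary fields are unaffected by the flag; this sketch simply
        # passes the primitive type name through in lowercase.
        return field.primitive_type.lower()
    if field.annotation == "UTF8" or binary_as_string:
        # Either the writer marked the column as UTF8, or the reader asked
        # for every binary column to be read as a string.
        return "string"
    return "binary"


# A column written by a system that omits the UTF8 annotation.
unannotated = ParquetField("title", "BINARY")
print(resolve_type(unannotated))                         # binary
print(resolve_type(unannotated, binary_as_string=True))  # string
```

In this sketch the flag only widens the set of binary columns read as strings; columns already annotated as UTF8 resolve to strings whether or not the flag is set.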
JIRA: https://issues.apache.org/jira/browse/SPARK-2927
Author: Yin Huai <huai@cse.ohio-state.edu>
Closes #1855 from yhuai/parquetBinaryAsString and squashes the following commits:
689ffa9 [Yin Huai] Add missing "=".
80827de [Yin Huai] Unit test.
1765ca4 [Yin Huai] Use .toBoolean.
9d3f199 [Yin Huai] Merge remote-tracking branch 'upstream/master' into parquetBinaryAsString
5d436a1 [Yin Huai] The initial support of adding a conf to treat binary columns stored in Parquet as string columns.
Diffstat (limited to 'docs/img/spark-logo-77x50px-hd.png')
0 files changed, 0 insertions, 0 deletions