field | value | date |
---|---|---|
author | Sadhan Sood <sadhan@tellapart.com> | 2015-02-04 19:18:06 -0800 |
committer | Cheng Lian <lian@databricks.com> | 2015-02-04 19:18:06 -0800 |
commit | dba98bf6987ec39380f1a5b0ca2772b694452231 | |
tree | 2dfe9c5ed122e7d09c26e144be4c8d1269ef3f7e /python/pyspark/sql.py | |
parent | 1fbd124b1bd6159086d8e88b139ce0817af02322 | |
[SPARK-4520] [SQL] This pr fixes the ArrayIndexOutOfBoundsException as raised in SPARK-4520.
The exception is thrown only for Thrift-generated Parquet files. The array element schema name is assumed to be "array", as in ParquetAvro, but in Thrift-generated Parquet files it is array_name + "_tuple". As a result, the child of the array group type is not found, and the exception is raised when the Parquet rows are materialized.
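The naming mismatch can be illustrated with a small Python sketch. This is not the code from the patch: the function `find_array_element`, its parameters, and the constants are hypothetical names chosen for illustration, and the lookup order is an assumption. It only shows the idea of accepting both the "array" element name and the array_name + "_tuple" form.

```python
# Hypothetical sketch, not Spark's actual converter code.
AVRO_ELEMENT_NAME = "array"        # element name assumed as per ParquetAvro
THRIFT_ELEMENT_SUFFIX = "_tuple"   # thrift-generated files use array_name + "_tuple"


def find_array_element(array_name, child_names):
    """Return the name of the element child of an array group.

    array_name  -- name of the array field (hypothetical parameter)
    child_names -- field names found under the array group type
    """
    # Try the Avro-style name first, then the Thrift-style name.
    candidates = (AVRO_ELEMENT_NAME, array_name + THRIFT_ELEMENT_SUFFIX)
    for candidate in candidates:
        if candidate in child_names:
            return candidate
    # Fail with a descriptive error instead of an out-of-bounds access later.
    raise ValueError(
        "no element child for array %r among %r" % (array_name, child_names)
    )
```

With only the "array" name checked, a group whose single child is named `tags_tuple` would have no matching child, which corresponds to the missing child described above.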
Author: Sadhan Sood <sadhan@tellapart.com>
Closes #4148 from sadhan/SPARK-4520 and squashes the following commits:
c5ccde8 [Sadhan Sood] [SPARK-4520] [SQL] This pr fixes the ArrayIndexOutOfBoundsException as raised in SPARK-4520.
Diffstat (limited to 'python/pyspark/sql.py')
0 files changed, 0 insertions, 0 deletions