author | hyukjinkwon <gurwls223@gmail.com> | 2016-09-09 14:23:05 -0700 |
---|---|---|
committer | Davies Liu <davies.liu@gmail.com> | 2016-09-09 14:23:05 -0700 |
commit | f7d2143705c8c1baeed0bc62940f9dba636e705b (patch) | |
tree | 8067836599fbfb1a71595fb01551e0f775c6b644 /mllib/pom.xml | |
parent | a3981c28c956a82ccf5b1c61d45b6bd252d4abed (diff) | |
[SPARK-17354] [SQL] Partitioning by dates/timestamps should work with Parquet vectorized reader
## What changes were proposed in this pull request?
This PR fixes `ColumnVectorUtils.populate` so that the Parquet vectorized reader can read partitioned tables whose partition columns are dates or timestamps. The non-vectorized Parquet reader already handles this case correctly.
`ColumnVectorUtils.populate` is called only from [VectorizedParquetRecordReader.java#L185](https://github.com/apache/spark/blob/master/sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/VectorizedParquetRecordReader.java#L185).
When the partition column type is explicitly specified as `DateType` or `TimestampType` (rather than inferred), the read fails with the exception below:
```
16/09/01 10:30:07 ERROR Executor: Exception in task 0.0 in stage 5.0 (TID 6)
java.lang.ClassCastException: java.lang.Integer cannot be cast to java.sql.Date
at org.apache.spark.sql.execution.vectorized.ColumnVectorUtils.populate(ColumnVectorUtils.java:89)
at org.apache.spark.sql.execution.datasources.parquet.VectorizedParquetRecordReader.initBatch(VectorizedParquetRecordReader.java:185)
at org.apache.spark.sql.execution.datasources.parquet.VectorizedParquetRecordReader.initBatch(VectorizedParquetRecordReader.java:204)
at org.apache.spark.sql.execution.datasources.parquet.ParquetFileFormat$$anonfun$buildReader$1.apply(ParquetFileFormat.scala:362)
...
```
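The failure mode in the trace can be reproduced without Spark. The sketch below is a standalone illustration, not the code changed by this PR: `PartitionValueDemo` and its methods are hypothetical names, and it assumes the partition value reaches the reader in an internal form, with a date held as an `Integer` counting days since 1970-01-01. Casting such a value to `java.sql.Date` throws the same `ClassCastException`, while reading it as an `int` succeeds.

```java
import java.time.LocalDate;

// Standalone illustration; no Spark classes are used here.
// All names below are hypothetical and exist only for this sketch.
class PartitionValueDemo {

    // Stand-in for a partition row slot: the value is held as Object, and a
    // date is assumed to be stored as an Integer (days since 1970-01-01).
    static Object internalDateValue(LocalDate date) {
        return (int) date.toEpochDay();
    }

    // Mirrors the failing pattern: the stored value is cast to java.sql.Date.
    static int readByCastingToSqlDate(Object value) {
        java.sql.Date d = (java.sql.Date) value; // ClassCastException for an Integer
        return (int) d.toLocalDate().toEpochDay();
    }

    // Reads the value in its assumed internal form instead of casting.
    static int readAsInternalInt(Object value) {
        return (Integer) value;
    }

    public static void main(String[] args) {
        Object stored = internalDateValue(LocalDate.of(2016, 9, 1));
        try {
            readByCastingToSqlDate(stored);
        } catch (ClassCastException e) {
            System.out.println("cast failed: " + e.getMessage());
        }
        System.out.println("days since epoch: " + readAsInternalInt(stored));
    }
}
```

The same reasoning would apply to timestamps if they are held as a `Long` rather than a `java.sql.Timestamp`; that case is left out to keep the sketch short.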
## How was this patch tested?
Unit tests in `SQLQuerySuite`.
Author: hyukjinkwon <gurwls223@gmail.com>
Closes #14919 from HyukjinKwon/SPARK-17354.