author Liang-Chi Hsieh <viirya@gmail.com> 2015-05-17 15:42:21 +0800
committer Cheng Lian <lian@databricks.com> 2015-05-17 15:42:21 +0800
commit 339905578790fa37fcad9684b859b443313a5aa2
tree 4c17f064797533b45b7f5f86924691b7319d4b8f /pom.xml
parent edf09ea1bd4bf7692e0085ad9c70cb1bfc8d06d8
[SPARK-7447] [SQL] Don't re-merge Parquet schema when the relation is deserialized
JIRA: https://issues.apache.org/jira/browse/SPARK-7447

`MetadataCache` in `ParquetRelation2` is annotated as `transient`. When `ParquetRelation2` is deserialized, we ask `MetadataCache` to refresh and perform schema merging again. This is time-consuming, especially when there are very many Parquet files.

With the new `FSBasedParquetRelation`, although `MetadataCache` is no longer `transient`, `MetadataCache.refresh()` still performs schema merging again when the relation is deserialized.

Author: Liang-Chi Hsieh <viirya@gmail.com>

Closes #6012 from viirya/without_remerge_schema and squashes the following commits:

2663957 [Liang-Chi Hsieh] Merge remote-tracking branch 'upstream/master' into without_remerge_schema
6ac7d93 [Liang-Chi Hsieh] Merge remote-tracking branch 'upstream/master' into without_remerge_schema
b0fc09b [Liang-Chi Hsieh] Don't generate and merge parquetSchema multiple times.
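
The general idea in the message (keep the merged schema in a serialized field and skip the merge when it is already present after deserialization) can be illustrated with a minimal, self-contained Java sketch. This is not the Spark patch: `SchemaCacheDemo`, `Relation`, `merge`, `roundTrip`, and `MERGE_CALLS` are hypothetical names introduced only for illustration, and schemas are modelled as comma-separated column lists rather than real Parquet footers.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.atomic.AtomicInteger;

final class SchemaCacheDemo {
    // Counts how often the (expensive) merge runs; static, so it is not serialized.
    static final AtomicInteger MERGE_CALLS = new AtomicInteger();

    // Hypothetical stand-in for a relation that caches a merged schema.
    static final class Relation implements Serializable {
        private static final long serialVersionUID = 1L;

        private final ArrayList<String> fileSchemas;
        // Deliberately NOT transient: the merged result travels with the object.
        private String mergedSchema;

        Relation(List<String> fileSchemas) {
            this.fileSchemas = new ArrayList<>(fileSchemas);
        }

        // Merge only when no merged schema is available yet.
        void refresh() {
            if (mergedSchema == null) {
                mergedSchema = merge(fileSchemas);
            }
        }

        String schema() {
            refresh();
            return mergedSchema;
        }
    }

    // Toy merge: union of column names, in first-seen order.
    static String merge(List<String> fileSchemas) {
        MERGE_CALLS.incrementAndGet();
        Set<String> columns = new LinkedHashSet<>();
        for (String schema : fileSchemas) {
            columns.addAll(Arrays.asList(schema.split(",")));
        }
        return String.join(",", columns);
    }

    // Serialize and deserialize a value in memory.
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T roundTrip(T value)
            throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (T) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Relation original = new Relation(Arrays.asList("id,name", "name,age"));
        original.refresh();                  // the only merge happens here
        Relation copy = roundTrip(original);
        copy.refresh();                      // cached schema present, no re-merge
        System.out.println(copy.schema() + " merges=" + MERGE_CALLS.get());
    }
}
```

If `mergedSchema` were declared `transient`, it would be `null` after the round trip and `refresh()` would merge a second time, which is the cost the commit message describes.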
Diffstat (limited to 'pom.xml')
0 files changed, 0 insertions, 0 deletions