author | Josh Rosen <joshrosen@databricks.com> | 2015-05-26 20:24:35 -0700 |
---|---|---|
committer | Yin Huai <yhuai@databricks.com> | 2015-05-26 20:24:35 -0700 |
commit | 0c33c7b4a66e47f6246f1b7f2b96f2c33126ec63 (patch) | |
tree | a6803c3bb9700b69a9f6d4e0e9932a649e54090c /conf/metrics.properties.template | |
parent | 03668348e29eb52c1a7d57a1e0ed7fca6c323890 (diff) | |
[SPARK-7858] [SQL] Use output schema, not relation schema, for data source input conversion
In `DataSourceStrategy.createPhysicalRDD`, we use the relation schema as the target schema for converting incoming rows into Catalyst rows. However, we should be using the output schema instead, since our scan might return a subset of the relation's columns.
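The mismatch can be illustrated with a small language-neutral sketch. This is not Spark's code: the `CONVERTERS` table, the `convert_row` helper, and both schemas are hypothetical stand-ins, written in Python only to show why positional conversion must follow the schema of the rows the scan actually returns.

```python
from datetime import date

# Hypothetical converters from "external" values to an "internal" form.
# They only stand in for the idea of per-type conversion; they are not Spark APIs.
CONVERTERS = {
    "int": lambda v: int(v),
    "string": lambda v: v.encode("utf-8"),   # str -> bytes
    "date": lambda v: v.toordinal(),         # date -> day count
}

def convert_row(row, schema):
    """Convert each value with the converter for the type at the same position in `schema`."""
    return tuple(CONVERTERS[dtype](value) for value, (_, dtype) in zip(row, schema))

# Full schema of the (hypothetical) relation.
relation_schema = [("id", "int"), ("name", "string"), ("born", "date")]

# The scan was pruned to a subset of the columns, so rows follow this schema instead.
output_schema = [("born", "date"), ("name", "string")]
scanned_row = (date(2015, 5, 26), "spark")

# Converting against the output schema lines each value up with its own type.
converted = convert_row(scanned_row, output_schema)

# Converting against the relation schema pairs the date with the "int" converter.
try:
    convert_row(scanned_row, relation_schema)
    mismatch = None
except TypeError as exc:
    mismatch = exc
```

In this sketch the wrong schema fails loudly with a `TypeError`. A positional mismatch between compatible-looking types could instead produce silently wrong values, which is why the choice of target schema matters.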
This patch incorporates #6414 by liancheng, which fixes an issue in `SimpleTextRelation` that prevented this bug from being caught by our old tests:
> In `SimpleTextRelation`, we set `needsConversion` to `true`, indicating that values produced by this testing relation should be of Scala types and need to be converted to Catalyst types when necessary. However, we also used `Cast` to convert strings to the expected data types, and `Cast` always produces values of Catalyst types, so no conversion was done at all. This PR makes `SimpleTextRelation` produce Scala values so that the data conversion code paths can be properly tested.
Closes #5986.
Author: Josh Rosen <joshrosen@databricks.com>
Author: Cheng Lian <lian@databricks.com>
Author: Cheng Lian <liancheng@users.noreply.github.com>
Closes #6400 from JoshRosen/SPARK-7858 and squashes the following commits:
e71c866 [Josh Rosen] Re-fix bug so that the tests pass again
56b13e5 [Josh Rosen] Add regression test to hadoopFsRelationSuites
2169a0f [Josh Rosen] Remove use of SpecificMutableRow and BufferedIterator
6cd7366 [Josh Rosen] Fix SPARK-7858 by using output types for conversion.
5a00e66 [Josh Rosen] Add assertions in order to reproduce SPARK-7858
8ba195c [Cheng Lian] Merge 9968fba9979287aaa1f141ba18bfb9d4c116a3b3 into 61664732b25b35f94be35a42cde651cbfd0e02b7
9968fba [Cheng Lian] Tests the data type conversion code paths