author    Herman van Hovell <hvanhovell@databricks.com>  2017-02-10 11:06:57 -0800
committer Wenchen Fan <wenchen@databricks.com>  2017-02-10 11:06:57 -0800
commit    de8a03e68202647555e30fffba551f65bc77608d
tree      f529ed7b5fe76475226cef8a99061c0bec235198
parent    dadff5f0789cce7cf3728a8adaab42118e5dc019
[SPARK-19459][SQL] Add Hive datatype (char/varchar) to StructField metadata
## What changes were proposed in this pull request?
Reading from an existing ORC table that contains `char` or `varchar` columns can fail with a `ClassCastException` if the table metadata was created by Spark. This happens because Spark internally replaces `char` and `varchar` columns with a `string` column.
This PR fixes the issue by recording the original Hive type in the `StructField`'s metadata under the `HIVE_TYPE_STRING` key. This key is picked up by the `HiveClient` and the ORC reader; see https://github.com/apache/spark/pull/16060 for more details on how the metadata is used.
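The mechanism can be illustrated with a small, self-contained sketch (not Spark's actual API): a `char`/`varchar` column is stored as `string`, while the original Hive type string is preserved under the metadata key. The helper `to_catalyst_field` and its dict-based field representation are hypothetical, used only to show the idea.

```python
import re

# Assumed metadata key name, mirroring Spark's HIVE_TYPE_STRING constant.
HIVE_TYPE_STRING = "HIVE_TYPE_STRING"

def to_catalyst_field(name, hive_type):
    """Map a Hive column type to a simplified field description.

    char/varchar become "string", but the original Hive type is kept
    in metadata so downstream readers (e.g. the ORC reader) can
    recover it instead of mis-casting the column.
    """
    if re.fullmatch(r"(char|varchar)\(\d+\)", hive_type):
        return {"name": name, "type": "string",
                "metadata": {HIVE_TYPE_STRING: hive_type}}
    # Other types pass through unchanged, with empty metadata.
    return {"name": name, "type": hive_type, "metadata": {}}

field = to_catalyst_field("city", "varchar(50)")
# field["type"] is "string"; field["metadata"] retains "varchar(50)"
```

The key point is that the replacement is lossless: the schema Spark works with uses `string`, but enough information survives in the field metadata for Hive-aware readers to reconstruct the declared type.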
## How was this patch tested?
Added a regression test to `OrcSourceSuite`.
Author: Herman van Hovell <hvanhovell@databricks.com>
Closes #16804 from hvanhovell/SPARK-19459.
Diffstat (limited to 'mllib-local')
0 files changed, 0 insertions, 0 deletions