[SPARK-10947] [SQL] With schema inference from JSON into a Dataframe, add option to infer all primitive object types as strings - spark

author	Stephen De Gennaro <stepheng@realitymine.com>	2015-10-26 19:55:10 -0700
committer	Yin Huai <yhuai@databricks.com>	2015-10-26 19:55:10 -0700
commit	82464fb2e02ca4e4d425017815090497b79dc93f (patch)
tree	38092b8f33e55405a41fcc00512e57b08f5fc0d8 /R/pkg/NAMESPACE
parent	d4c397a64af4cec899fdaa3e617ed20333cc567d (diff)
download	spark-82464fb2e02ca4e4d425017815090497b79dc93f.tar.gz spark-82464fb2e02ca4e4d425017815090497b79dc93f.tar.bz2 spark-82464fb2e02ca4e4d425017815090497b79dc93f.zip

Currently, when a schema is inferred from a JSON file using sqlContext.read.json, the primitive object types are inferred as string, long, boolean, etc.

However, if the inferred type is too specific (JSON obviously does not enforce types itself), this can cause issues with merging dataframe schemas.

This pull request adds the option "primitivesAsString" to the JSON DataFrameReader which when true (defaults to false if not set) will infer all primitives as strings.

Below is an example usage of this new functionality.

```
val jsonDf = sqlContext.read.option("primitivesAsString", "true").json(sampleJsonFile)

scala> jsonDf.printSchema()
root
 |-- bigInteger: string (nullable = true)
 |-- boolean: string (nullable = true)
 |-- double: string (nullable = true)
 |-- integer: string (nullable = true)
 |-- long: string (nullable = true)
 |-- null: string (nullable = true)
 |-- string: string (nullable = true)
```

Author: Stephen De Gennaro <stepheng@realitymine.com>

Closes #9249 from stephend-realitymine/stephend-primitives.
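To make the schema-merging motivation concrete, here is a minimal sketch (not part of the patch) that reads two small JSON sources whose fields disagree on type, first with default inference and then with "primitivesAsString" enabled. The sample records, the object name, and the local-mode setup are assumptions made for illustration; it assumes a Spark 1.x build that includes this change.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

// Hypothetical driver object, for illustration only.
object PrimitivesAsStringSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("primitivesAsString-sketch")
      .setMaster("local[*]")
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)

    // Made-up records: the same fields hold a number/boolean in one
    // source and plain strings in the other.
    val numericIds = sc.parallelize(Seq("""{"id": 1, "active": true}"""))
    val stringIds = sc.parallelize(Seq("""{"id": "a-17", "active": "yes"}"""))

    // Default inference: each source gets its own, more specific types,
    // so the two schemas are not expected to match.
    val inferredA = sqlContext.read.json(numericIds)
    val inferredB = sqlContext.read.json(stringIds)
    inferredA.printSchema()
    inferredB.printSchema()
    println(s"default schemas equal: ${inferredA.schema == inferredB.schema}")

    // With the new option, every primitive is read as a string, so both
    // sources should end up with string-typed columns.
    val asStringA = sqlContext.read.option("primitivesAsString", "true").json(numericIds)
    val asStringB = sqlContext.read.option("primitivesAsString", "true").json(stringIds)
    asStringA.printSchema()
    asStringB.printSchema()
    println(s"string schemas equal: ${asStringA.schema == asStringB.schema}")

    sc.stop()
  }
}
```

The trade-off is that numeric and boolean columns then have to be cast explicitly wherever typed values are needed downstream.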

Diffstat (limited to 'R/pkg/NAMESPACE')

0 files changed, 0 insertions, 0 deletions

