author | hyukjinkwon <gurwls223@gmail.com> | 2016-05-12 22:31:14 -0700 |
---|---|---|
committer | Davies Liu <davies.liu@gmail.com> | 2016-05-12 22:31:14 -0700 |
commit | 51841d77d99a858f8fa1256e923b0364b9b28fa0 (patch) | |
tree | 8c190a69054ed9ed4da636db7029a0eef7a29188 /sql/core/src/test/resources | |
parent | eda2800d44843b6478e22d2c99bca4af7e9c9613 (diff) | |
[SPARK-13866] [SQL] Handle decimal type in CSV inference at CSV data source.
## What changes were proposed in this pull request?
https://issues.apache.org/jira/browse/SPARK-13866
This PR adds support for inferring `DecimalType`.
The rules for choosing among `IntegerType`, `LongType` and `DecimalType` are as follows.
#### Inferring Types
1. `IntegerType` and then `LongType` are tried first.
```scala
Int.MaxValue => IntegerType
Long.MaxValue => LongType
```
2. If it fails, try `DecimalType`.
```scala
(Long.MaxValue + 1) => DecimalType(19, 0) // 9223372036854775808 has 19 digits
```
A value is not inferred as `DecimalType` when its scale is greater than 0.
3. If it fails, try `DoubleType`.
```scala
0.1 => DoubleType // Not inferred as `DecimalType` because it has a scale of 1.
```
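The three steps above can be sketched in plain Scala without a Spark dependency. This is only an illustration under stated assumptions, not the code in this patch: the type names below are hypothetical stand-ins for Spark's data types, `inferField` is an invented helper, the `StringType` fallback is an assumption, and precision limits are ignored.

```scala
import scala.util.Try

// Hypothetical stand-ins for Spark's data types, so the sketch runs without Spark.
sealed trait InferredType
case object IntegerType extends InferredType
case object LongType extends InferredType
final case class DecimalType(precision: Int, scale: Int) extends InferredType
case object DoubleType extends InferredType
case object StringType extends InferredType

object InferenceSketch {
  // Try each type in the order described above and return the first that fits.
  def inferField(field: String): InferredType =
    if (Try(field.toInt).isSuccess) IntegerType
    else if (Try(field.toLong).isSuccess) LongType
    else tryDecimal(field)

  // Accept a decimal only when it has no fractional digits (scale <= 0).
  private def tryDecimal(field: String): InferredType =
    Try(new java.math.BigDecimal(field)).toOption match {
      case Some(d) if d.scale <= 0 => DecimalType(d.precision, d.scale)
      case _                       => tryDouble(field)
    }

  // Assumption: anything that is not numeric falls back to a string.
  private def tryDouble(field: String): InferredType =
    if (Try(field.toDouble).isSuccess) DoubleType else StringType
}
```

With this sketch, `InferenceSketch.inferField("92233720368547758070")` yields `DecimalType(20, 0)`, while `InferenceSketch.inferField("0.1")` yields `DoubleType` because the parsed value has a scale of 1.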
#### Compatible Types (Merging Types)
Merging types works the same way as in the JSON data source. If `DecimalType` cannot represent the merged values, the result becomes `DoubleType`.
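As a rough illustration of the fallback to `DoubleType`, the sketch below merges two decimal types by widening both the integer range and the scale. The names `SketchDecimal`, `SketchDouble`, `MergeSketch` and `mergeDecimals` are hypothetical, and the maximum precision of 38 and the exact widening rule are assumptions about how the JSON data source behaves, not something stated in this commit message.

```scala
// Hypothetical stand-ins for the two types involved in the merge.
sealed trait SketchType
final case class SketchDecimal(precision: Int, scale: Int) extends SketchType
case object SketchDouble extends SketchType

object MergeSketch {
  // Assumption: decimals are limited to 38 digits of precision.
  val MaxPrecision = 38

  // Widen to a decimal that can hold both inputs, or give up and use a double.
  def mergeDecimals(a: SketchDecimal, b: SketchDecimal): SketchType = {
    val scale = math.max(a.scale, b.scale)
    // Digits needed to the left of the decimal point.
    val range = math.max(a.precision - a.scale, b.precision - b.scale)
    if (range + scale > MaxPrecision) SketchDouble
    else SketchDecimal(range + scale, scale)
  }
}
```

Under these assumptions, merging `SketchDecimal(20, 0)` with `SketchDecimal(10, 0)` keeps `SketchDecimal(20, 0)`, whereas merging `SketchDecimal(38, 0)` with `SketchDecimal(38, 38)` would need 76 digits and therefore falls back to `SketchDouble`.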
## How was this patch tested?
Tested with unit tests, and with `./dev/run_tests` for the code style check.
Author: hyukjinkwon <gurwls223@gmail.com>
Author: Hyukjin Kwon <gurwls223@gmail.com>
Closes #11724 from HyukjinKwon/SPARK-13866.
Diffstat (limited to 'sql/core/src/test/resources')
-rw-r--r-- | sql/core/src/test/resources/decimal.csv | 7 |
1 files changed, 7 insertions, 0 deletions
diff --git a/sql/core/src/test/resources/decimal.csv b/sql/core/src/test/resources/decimal.csv
new file mode 100644
index 0000000000..870f6aaf1b
--- /dev/null
+++ b/sql/core/src/test/resources/decimal.csv
@@ -0,0 +1,7 @@
+~ decimal field has integer, integer and decimal values. The last value cannot fit to a long
+~ long field has integer, long and integer values.
+~ double field has double, double and decimal values.
+decimal,long,double
+1,1,0.1
+1,9223372036854775807,1.0
+92233720368547758070,1,92233720368547758070