author | hyukjinkwon <gurwls223@gmail.com> | 2016-04-08 00:28:59 -0700 |
---|---|---|
committer | Reynold Xin <rxin@databricks.com> | 2016-04-08 00:28:59 -0700 |
commit | 725b860e2b7b675d95b10c46f2b329c30cd21faf (patch) | |
tree | fe3191cbdf6b58ea4c993c7d02691758f574423f /dev/deps/spark-deps-hadoop-2.4 | |
parent | 04fb7dba704afa4e20eb8c72d6568f7f55694157 (diff) | |
[SPARK-14103][SQL] Parse unescaped quotes in CSV data source.
## What changes were proposed in this pull request?
This PR fixes a problem with parsing unescaped quotes in input data. For example, the data below:
```
"a"b,ccc,ddd
e,f,g
```
is parsed as follows:
- **Before**
```bash
["a"b,ccc,ddd[\n]e,f,g] <- as a value.
```
- **After**
```bash
["a"b], [ccc], [ddd]
[e], [f], [g]
```
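The difference can be illustrated with a minimal Python sketch. This is not Univocity's implementation: `parse_line_lenient` is a hypothetical helper written only for this illustration, and it handles a single line at a time (no multi-line quoted values). It assumes a simple rule: a leading quote opens a quoted value only if a matching closing quote is immediately followed by a delimiter or the end of the line; otherwise the quotes are kept as literal text.

```python
def parse_line_lenient(line, delimiter=",", quote='"'):
    """Split one CSV line, keeping unescaped quotes as literal text.

    Hypothetical helper for illustration only; not the Univocity algorithm.
    """
    fields = []
    i, n = 0, len(line)
    while i <= n:
        if i < n and line[i] == quote:
            # Try to read a well-formed quoted value starting at i.
            j, buf, closed = i + 1, [], False
            while j < n:
                if line[j] == quote:
                    if j + 1 < n and line[j + 1] == quote:
                        # Doubled quote inside a quoted value is an escaped quote.
                        buf.append(quote)
                        j += 2
                        continue
                    # A closing quote must be followed by a delimiter or end of line.
                    closed = j + 1 == n or line[j + 1] == delimiter
                    break
                buf.append(line[j])
                j += 1
            if closed:
                fields.append("".join(buf))
                i = j + 2  # skip the closing quote and the delimiter
                continue
            # Otherwise fall through and treat the value as unquoted.
        end = line.find(delimiter, i)
        if end == -1:
            end = n
        fields.append(line[i:end])
        i = end + 1
    return fields


for row in ['"a"b,ccc,ddd', "e,f,g"]:
    print(parse_line_lenient(row))
# prints ['"a"b', 'ccc', 'ddd']
# prints ['e', 'f', 'g']
```

Under this assumed rule, the unescaped quote in `"a"b` no longer swallows the rest of the input, which matches the **After** output above.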
This PR bumps the Univocity parser version from `1.5.6` to `2.0.2`, the release in which this issue was fixed (see https://github.com/uniVocity/univocity-parsers/issues/60).
## How was this patch tested?
Unit tests in `CSVSuite` and `sbt/sbt scalastyle`.
Author: hyukjinkwon <gurwls223@gmail.com>
Closes #12226 from HyukjinKwon/SPARK-14103-quote.
Diffstat (limited to 'dev/deps/spark-deps-hadoop-2.4')
-rw-r--r-- | dev/deps/spark-deps-hadoop-2.4 | 2 |
1 files changed, 1 insertions, 1 deletions
diff --git a/dev/deps/spark-deps-hadoop-2.4 b/dev/deps/spark-deps-hadoop-2.4
index d8d1840da5..23ff5cfa2e 100644
--- a/dev/deps/spark-deps-hadoop-2.4
+++ b/dev/deps/spark-deps-hadoop-2.4
@@ -167,7 +167,7 @@ stax-api-1.0.1.jar
 stream-2.7.0.jar
 stringtemplate-3.2.1.jar
 super-csv-2.2.0.jar
-univocity-parsers-1.5.6.jar
+univocity-parsers-2.0.2.jar
 xbean-asm5-shaded-4.4.jar
 xmlenc-0.52.jar
 xz-1.0.jar