aboutsummaryrefslogtreecommitdiff
path: root/dev/deps
diff options
context:
space:
mode:
authorhyukjinkwon <gurwls223@gmail.com>2016-09-21 10:35:29 +0100
committerSean Owen <sowen@cloudera.com>2016-09-21 10:35:29 +0100
commit25a020be99b6a540e4001e59e40d5d1c8aa53812 (patch)
treeaf8108bb755277af338209736be43fe81ff58e7b /dev/deps
parent57dc326bd00cf0a49da971e9c573c48ae28acaa2 (diff)
downloadspark-25a020be99b6a540e4001e59e40d5d1c8aa53812.tar.gz
spark-25a020be99b6a540e4001e59e40d5d1c8aa53812.tar.bz2
spark-25a020be99b6a540e4001e59e40d5d1c8aa53812.zip
[SPARK-17583][SQL] Remove uesless rowSeparator variable and set auto-expanding buffer as default for maxCharsPerColumn option in CSV
## What changes were proposed in this pull request? This PR includes the changes below: 1. Upgrade Univocity library from 2.1.1 to 2.2.1 This includes some performance improvement and also enabling auto-extending buffer in `maxCharsPerColumn` option in CSV. Please refer the [release notes](https://github.com/uniVocity/univocity-parsers/releases). 2. Remove useless `rowSeparator` variable existing in `CSVOptions` We have this unused variable in [CSVOptions.scala#L127](https://github.com/apache/spark/blob/29952ed096fd2a0a19079933ff691671d6f00835/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVOptions.scala#L127) but it seems possibly causing confusion that it actually does not care of `\r\n`. For example, we have an issue open about this, [SPARK-17227](https://issues.apache.org/jira/browse/SPARK-17227), describing this variable. This variable is virtually not being used because we rely on `LineRecordReader` in Hadoop which deals with only both `\n` and `\r\n`. 3. Set the default value of `maxCharsPerColumn` to auto-expending. We are setting 1000000 for the length of each column. It'd be more sensible we allow auto-expending rather than fixed length by default. To make sure, using `-1` is being described in the release note, [2.2.0](https://github.com/uniVocity/univocity-parsers/releases/tag/v2.2.0). ## How was this patch tested? N/A Author: hyukjinkwon <gurwls223@gmail.com> Closes #15138 from HyukjinKwon/SPARK-17583.
Diffstat (limited to 'dev/deps')
-rw-r--r--dev/deps/spark-deps-hadoop-2.22
-rw-r--r--dev/deps/spark-deps-hadoop-2.32
-rw-r--r--dev/deps/spark-deps-hadoop-2.42
-rw-r--r--dev/deps/spark-deps-hadoop-2.62
-rw-r--r--dev/deps/spark-deps-hadoop-2.72
5 files changed, 5 insertions, 5 deletions
diff --git a/dev/deps/spark-deps-hadoop-2.2 b/dev/deps/spark-deps-hadoop-2.2
index a7259e25bf..f4f92c6d20 100644
--- a/dev/deps/spark-deps-hadoop-2.2
+++ b/dev/deps/spark-deps-hadoop-2.2
@@ -159,7 +159,7 @@ stax-api-1.0.1.jar
stream-2.7.0.jar
stringtemplate-3.2.1.jar
super-csv-2.2.0.jar
-univocity-parsers-2.1.1.jar
+univocity-parsers-2.2.1.jar
validation-api-1.1.0.Final.jar
xbean-asm5-shaded-4.4.jar
xmlenc-0.52.jar
diff --git a/dev/deps/spark-deps-hadoop-2.3 b/dev/deps/spark-deps-hadoop-2.3
index 6986ab572b..3db013f1a7 100644
--- a/dev/deps/spark-deps-hadoop-2.3
+++ b/dev/deps/spark-deps-hadoop-2.3
@@ -167,7 +167,7 @@ stax-api-1.0.1.jar
stream-2.7.0.jar
stringtemplate-3.2.1.jar
super-csv-2.2.0.jar
-univocity-parsers-2.1.1.jar
+univocity-parsers-2.2.1.jar
validation-api-1.1.0.Final.jar
xbean-asm5-shaded-4.4.jar
xmlenc-0.52.jar
diff --git a/dev/deps/spark-deps-hadoop-2.4 b/dev/deps/spark-deps-hadoop-2.4
index 75cccb352b..71710109a1 100644
--- a/dev/deps/spark-deps-hadoop-2.4
+++ b/dev/deps/spark-deps-hadoop-2.4
@@ -167,7 +167,7 @@ stax-api-1.0.1.jar
stream-2.7.0.jar
stringtemplate-3.2.1.jar
super-csv-2.2.0.jar
-univocity-parsers-2.1.1.jar
+univocity-parsers-2.2.1.jar
validation-api-1.1.0.Final.jar
xbean-asm5-shaded-4.4.jar
xmlenc-0.52.jar
diff --git a/dev/deps/spark-deps-hadoop-2.6 b/dev/deps/spark-deps-hadoop-2.6
index ef7b8a7d8d..cb30fda253 100644
--- a/dev/deps/spark-deps-hadoop-2.6
+++ b/dev/deps/spark-deps-hadoop-2.6
@@ -175,7 +175,7 @@ stax-api-1.0.1.jar
stream-2.7.0.jar
stringtemplate-3.2.1.jar
super-csv-2.2.0.jar
-univocity-parsers-2.1.1.jar
+univocity-parsers-2.2.1.jar
validation-api-1.1.0.Final.jar
xbean-asm5-shaded-4.4.jar
xercesImpl-2.9.1.jar
diff --git a/dev/deps/spark-deps-hadoop-2.7 b/dev/deps/spark-deps-hadoop-2.7
index 6356612537..9008aa80bc 100644
--- a/dev/deps/spark-deps-hadoop-2.7
+++ b/dev/deps/spark-deps-hadoop-2.7
@@ -176,7 +176,7 @@ stax-api-1.0.1.jar
stream-2.7.0.jar
stringtemplate-3.2.1.jar
super-csv-2.2.0.jar
-univocity-parsers-2.1.1.jar
+univocity-parsers-2.2.1.jar
validation-api-1.1.0.Final.jar
xbean-asm5-shaded-4.4.jar
xercesImpl-2.9.1.jar