path: root/python/pyspark/sql
author: hyukjinkwon <gurwls223@gmail.com> 2017-03-22 09:52:37 -0700
committer: Xiao Li <gatorsmile@gmail.com> 2017-03-22 09:52:37 -0700
commit: 465818389aab1217c9de5c685cfaee3ffaec91bb (patch)
tree: 54691a40b9b00854f5c6fc343c0186c7bc214f22 /python/pyspark/sql
parent: 0caade634076034182e22318eb09a6df1c560576 (diff)
download: spark-465818389aab1217c9de5c685cfaee3ffaec91bb.tar.gz
          spark-465818389aab1217c9de5c685cfaee3ffaec91bb.tar.bz2
          spark-465818389aab1217c9de5c685cfaee3ffaec91bb.zip
[SPARK-19949][SQL][FOLLOW-UP] Clean up parse modes and update related comments
## What changes were proposed in this pull request?

This PR proposes to make the `mode` options in both CSV and JSON use `case object`s, and fixes some comments related to the previous fix. This PR also modifies some tests related to parse modes.

## How was this patch tested?

Modified unit tests in both `CSVSuite.scala` and `JsonSuite.scala`.

Author: hyukjinkwon <gurwls223@gmail.com>

Closes #17377 from HyukjinKwon/SPARK-19949.
Diffstat (limited to 'python/pyspark/sql')
-rw-r--r--  python/pyspark/sql/readwriter.py  6
-rw-r--r--  python/pyspark/sql/streaming.py   2
2 files changed, 4 insertions, 4 deletions
diff --git a/python/pyspark/sql/readwriter.py b/python/pyspark/sql/readwriter.py
index 122e17f202..759c27507c 100644
--- a/python/pyspark/sql/readwriter.py
+++ b/python/pyspark/sql/readwriter.py
@@ -369,10 +369,8 @@ class DataFrameReader(OptionUtils):
:param maxCharsPerColumn: defines the maximum number of characters allowed for any given
value being read. If None is set, it uses the default value,
``-1`` meaning unlimited length.
- :param maxMalformedLogPerPartition: sets the maximum number of malformed rows Spark will
- log for each partition. Malformed records beyond this
- number will be ignored. If None is set, it
- uses the default value, ``10``.
+ :param maxMalformedLogPerPartition: this parameter is no longer used since Spark 2.2.0.
+ If specified, it is ignored.
:param mode: allows a mode for dealing with corrupt records during parsing. If None is
set, it uses the default value, ``PERMISSIVE``.
diff --git a/python/pyspark/sql/streaming.py b/python/pyspark/sql/streaming.py
index 288cc1e4f6..e227f9ceb5 100644
--- a/python/pyspark/sql/streaming.py
+++ b/python/pyspark/sql/streaming.py
@@ -625,6 +625,8 @@ class DataStreamReader(OptionUtils):
:param maxCharsPerColumn: defines the maximum number of characters allowed for any given
value being read. If None is set, it uses the default value,
``-1`` meaning unlimited length.
+ :param maxMalformedLogPerPartition: this parameter is no longer used since Spark 2.2.0.
+ If specified, it is ignored.
:param mode: allows a mode for dealing with corrupt records during parsing. If None is
set, it uses the default value, ``PERMISSIVE``.
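The docstrings above mention the three parse modes (``PERMISSIVE``, ``DROPMALFORMED``, ``FAILFAST``) that this commit cleans up. As a rough illustration of their semantics, here is a minimal plain-Python sketch (not Spark code; the function name and row format are hypothetical) of how each mode treats a record whose field count does not match the schema:

```python
def parse_rows(rows, num_fields, mode="PERMISSIVE"):
    """Parse comma-separated rows, handling malformed records per `mode`.

    PERMISSIVE    -> keep the record, padding missing fields with None
    DROPMALFORMED -> silently drop records with the wrong field count
    FAILFAST      -> raise immediately on the first malformed record
    """
    parsed = []
    for row in rows:
        fields = row.split(",")
        if len(fields) == num_fields:
            parsed.append(fields)
        elif mode == "PERMISSIVE":
            # Pad (or truncate) the record so it still fits the schema.
            padded = fields[:num_fields] + [None] * (num_fields - len(fields))
            parsed.append(padded)
        elif mode == "DROPMALFORMED":
            continue  # skip the malformed record entirely
        elif mode == "FAILFAST":
            raise ValueError(f"Malformed record: {row!r}")
        else:
            raise ValueError(f"Unknown mode: {mode}")
    return parsed

rows = ["1,a", "2", "3,c"]
print(parse_rows(rows, 2, "PERMISSIVE"))     # [['1', 'a'], ['2', None], ['3', 'c']]
print(parse_rows(rows, 2, "DROPMALFORMED"))  # [['1', 'a'], ['3', 'c']]
```

In actual PySpark usage the mode is passed as an option, e.g. ``spark.read.csv(path, mode="DROPMALFORMED")``, with ``PERMISSIVE`` as the default per the docstrings above.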