author: Davies Liu <davies@databricks.com> 2016-06-10 14:32:43 -0700
committer: Davies Liu <davies.liu@gmail.com> 2016-06-10 14:32:43 -0700
commit: aec502d9114ad8e18bfbbd63f38780e076d326d1
tree: 5aa6b1479a6f677b4690816a96000ac064aa0338 /mllib
parent: e05a2feebe928df691d5a8f42f22e088c6263dcf
[SPARK-15654] [SQL] fix non-splitable files for text based file formats
## What changes were proposed in this pull request?

Currently, we always split a file when it is bigger than maxSplitBytes, but Hadoop's LineRecordReader does not handle splits of compressed files correctly. FileFormat should have an API to check whether a file can be split.

This PR is based on #13442 and closes #13442.

## How was this patch tested?

Added regression tests.

Author: Davies Liu <davies@databricks.com>

Closes #13531 from davies/fix_split.
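The idea described above can be sketched in a few lines of Scala. This is a simplified, self-contained model and not Spark's actual code: the names `SimpleFileFormat`, `SimpleTextBasedFileFormat`, `SimpleLibSVMFormat`, and `SplitPlanner` are hypothetical, and deciding splittability from a fixed list of file extensions is an assumption made only to keep the example free of Hadoop dependencies.

```scala
// Hypothetical, simplified model of the idea in this commit; not Spark's real classes.
trait SimpleFileFormat {
  // Conservative default: a file is not split unless the format allows it.
  def isSplitable(path: String): Boolean = false
}

trait SimpleTextBasedFileFormat extends SimpleFileFormat {
  // Assumption for illustration: these extensions mark compressed files
  // that cannot be read starting from an arbitrary byte offset.
  private val nonSplittableExtensions = Set(".gz", ".snappy", ".lz4", ".deflate")

  override def isSplitable(path: String): Boolean =
    !nonSplittableExtensions.exists(ext => path.toLowerCase.endsWith(ext))
}

// A text-based format picks up the check simply by extending the trait,
// which mirrors the one-line change in the diff.
object SimpleLibSVMFormat extends SimpleTextBasedFileFormat

object SplitPlanner {
  // Returns (offset, length) pairs covering the file.
  def planSplits(
      format: SimpleFileFormat,
      path: String,
      fileSize: Long,
      maxSplitBytes: Long): Seq[(Long, Long)] = {
    require(fileSize >= 0 && maxSplitBytes > 0)
    if (!format.isSplitable(path)) {
      // Non-splittable file: one partition reads the whole file.
      Seq((0L, fileSize))
    } else {
      // Splittable file: cut into chunks of at most maxSplitBytes.
      (0L until fileSize by maxSplitBytes).map { offset =>
        (offset, math.min(maxSplitBytes, fileSize - offset))
      }
    }
  }
}
```

With this sketch, `SplitPlanner.planSplits(SimpleLibSVMFormat, "data.libsvm.gz", 250L, 100L)` yields a single range covering the whole file, while the same call on `"data.libsvm"` yields three ranges.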
Diffstat (limited to 'mllib')
-rw-r--r-- mllib/src/main/scala/org/apache/spark/ml/source/libsvm/LibSVMRelation.scala | 2
1 files changed, 1 insertions, 1 deletions
diff --git a/mllib/src/main/scala/org/apache/spark/ml/source/libsvm/LibSVMRelation.scala b/mllib/src/main/scala/org/apache/spark/ml/source/libsvm/LibSVMRelation.scala
index 7629369ab1..b5b2a681e9 100644
--- a/mllib/src/main/scala/org/apache/spark/ml/source/libsvm/LibSVMRelation.scala
+++ b/mllib/src/main/scala/org/apache/spark/ml/source/libsvm/LibSVMRelation.scala
@@ -112,7 +112,7 @@ private[libsvm] class LibSVMOutputWriter(
*/
// If this is moved or renamed, please update DataSource's backwardCompatibilityMap.
@Since("1.6.0")
-class LibSVMFileFormat extends FileFormat with DataSourceRegister {
+class LibSVMFileFormat extends TextBasedFileFormat with DataSourceRegister {
@Since("1.6.0")
override def shortName(): String = "libsvm"