[SPARK-15654] [SQL] fix non-splitable files for text based file formats - spark


author	Davies Liu <davies@databricks.com>	2016-06-10 14:32:43 -0700
committer	Davies Liu <davies.liu@gmail.com>	2016-06-10 14:32:43 -0700
commit	aec502d9114ad8e18bfbbd63f38780e076d326d1 (patch)
tree	5aa6b1479a6f677b4690816a96000ac064aa0338 /examples/src
parent	e05a2feebe928df691d5a8f42f22e088c6263dcf (diff)

[SPARK-15654] [SQL] fix non-splitable files for text based file formats

## What changes were proposed in this pull request?

Currently, we always split a file when it is bigger than maxSplitBytes, but Hadoop's LineRecordReader does not respect the splits for compressed files correctly. We should have an API on FileFormat to check whether a file can be split or not.

This PR is based on #13442, closes #13442

## How was this patch tested?

Added regression tests.

Author: Davies Liu <davies@databricks.com>

Closes #13531 from davies/fix_split.
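
The idea described above can be illustrated with a minimal sketch. It is not the code from this patch: the names `is_splittable` and `plan_splits`, and the extension-based codec lookup, are hypothetical stand-ins for a per-format splittability check. The only assumption carried over from Hadoop is that gzip streams cannot be split while bzip2 streams can.

```python
import os

# Assumption: file extensions stand in for a real codec lookup.
# gzip is not splittable; bzip2 is. Both sets are illustrative only.
COMPRESSED_EXTENSIONS = {".gz", ".bz2"}
SPLITTABLE_COMPRESSED_EXTENSIONS = {".bz2"}


def is_splittable(path):
    """Return True if a reader may start in the middle of the file (hypothetical helper)."""
    ext = os.path.splitext(path)[1].lower()
    if ext not in COMPRESSED_EXTENSIONS:
        # Uncompressed text can be split at any byte offset.
        return True
    return ext in SPLITTABLE_COMPRESSED_EXTENSIONS


def plan_splits(path, size, max_split_bytes):
    """Return (offset, length) byte ranges to read for one file (hypothetical helper)."""
    if not is_splittable(path):
        # A non-splittable file becomes one split, however large it is.
        return [(0, size)]
    return [
        (offset, min(max_split_bytes, size - offset))
        for offset in range(0, size, max_split_bytes)
    ]


if __name__ == "__main__":
    print(plan_splits("logs.txt", 250, 100))
    print(plan_splits("logs.txt.gz", 250, 100))
```

With a check like this, the planner consults the format before applying maxSplitBytes, so a large gzip file is read by a single task instead of being cut into ranges that the line reader cannot honor.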

Diffstat (limited to 'examples/src')

0 files changed, 0 insertions, 0 deletions

