author | Cheng Hao <hao.cheng@intel.com> | 2014-12-17 13:39:36 -0800 |
---|---|---|
committer | Michael Armbrust <michael@databricks.com> | 2014-12-17 13:39:36 -0800 |
commit | 636d9fc450faaa0d8e82e0d34bb7b791e3812cb7 (patch) | |
tree | ab0de7c89131b6bda143dc51228df6410f3eea8a /python | |
parent | 902e4d54acbc3c88163a5c6447aff68ed57475c1 (diff) | |
[SPARK-3739] [SQL] Update the split num base on block size for table scanning
In local mode, Hadoop/Hive ignores "mapred.map.tasks", so a small table file always produces a single input split. SparkSQL does not honor that behavior in table scanning, so we get different results when running the Hive compatibility tests. This PR fixes that.
Author: Cheng Hao <hao.cheng@intel.com>
Closes #2589 from chenghao-intel/source_split and squashes the following commits:
dff38e7 [Cheng Hao] Remove the extra blank line
160a2b6 [Cheng Hao] fix the compiling bug
04d67f7 [Cheng Hao] Keep 1 split for small file in table scanning
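
The behavior described in the commit message can be illustrated with a small Python sketch. It is not the code from this patch: the function names (`min_splits_hint`, `count_splits`) and parameters are hypothetical, and the split arithmetic is modeled on the block-size-based computation in Hadoop's classic `FileInputFormat`, assuming a single non-empty file. The point is only to show why a split hint of 0 in local mode leaves a small file as one split, while honoring "mapred.map.tasks" would divide it.

```python
# Illustrative sketch only; names and defaults are assumptions, not Spark's API.

SPLIT_SLOP = 1.1  # tolerance before the tail of a file becomes its own split


def min_splits_hint(is_local, mapred_map_tasks, default_min_partitions):
    """Return the split-count hint passed to the input format.

    Assumption: in local mode the hint is dropped (0), so splits are
    driven by block size alone, matching what Hadoop/Hive does.
    """
    if is_local:
        return 0
    return max(mapred_map_tasks, default_min_partitions)


def count_splits(file_size, block_size, num_splits_hint, min_size=1):
    """Count input splits for one file of file_size > 0 bytes."""
    # A hint of 0 is treated as 1, so the goal size is the whole file.
    goal_size = file_size // max(num_splits_hint, 1)
    split_size = max(min_size, min(goal_size, block_size))

    splits = 0
    remaining = file_size
    # Carve off full-size splits while clearly more than one split remains.
    while remaining / split_size > SPLIT_SLOP:
        splits += 1
        remaining -= split_size
    # Whatever is left forms the final split.
    if remaining > 0:
        splits += 1
    return splits


if __name__ == "__main__":
    block = 128 * 1024 * 1024  # assumed 128 MB block size
    small_file = 1000          # bytes
    print(count_splits(small_file, block, min_splits_hint(True, 2, 2)))
    print(count_splits(small_file, block, min_splits_hint(False, 2, 2)))
```

With these assumed inputs, the local-mode call keeps the small file in a single split, whereas a hint of 2 cuts the same file in half, which is the kind of mismatch that changes results in the Hive compatibility tests.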
Diffstat (limited to 'python')
0 files changed, 0 insertions, 0 deletions