author	Davies Liu <davies@databricks.com>	2015-11-17 12:50:01 -0800
committer	Davies Liu <davies.liu@gmail.com>	2015-11-17 12:50:01 -0800
commit	5aca6ad00c9d7fa43c725b8da4a10114a3a77421 (patch)
tree	40c175bd3c9c424b8efb25c51fa2da55291ebc72 /project/MimaExcludes.scala
parent	d98d1cb000c8c4e391d73ae86efd09f15e5d165c (diff)
[SPARK-11767] [SQL] Limit the size of a cached batch
Currently the size of a cached batch is only controlled by `batchSize` (default value 10000), which does not work well with the size of the serialized columns (for example, complex types). The memory used to build the batch is not accounted for, so it is easy to OOM (especially after unified memory management). This PR introduces a hard limit of 4 MB for the total columns (enough for up to 50 uncompressed primitive columns). It also changes the way the buffer grows: double it each time, then trim it once finished.

cc liancheng

Author: Davies Liu <davies@databricks.com>

Closes #9760 from davies/cache_limit.
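Below is a minimal sketch of the grow-by-doubling-then-trim strategy the commit message describes. The class and method names (`ColumnBuffer`, `append`, `build`) are illustrative only and are not Spark's actual column builder API; the sketch only shows the reallocation pattern.

```scala
import java.nio.ByteBuffer

// Illustrative sketch: grow the backing buffer by doubling while appending,
// then trim the over-allocation once the batch is finished.
class ColumnBuffer(initialSize: Int = 1024) {
  private var buffer = ByteBuffer.allocate(initialSize)

  // Double the capacity until the requested bytes fit, instead of growing to
  // the exact size each time; this keeps reallocations logarithmic in the
  // final buffer size.
  private def ensureFreeSpace(size: Int): Unit = {
    if (buffer.remaining() < size) {
      var newCapacity = buffer.capacity()
      while (newCapacity - buffer.position() < size) {
        newCapacity *= 2
      }
      val grown = ByteBuffer.allocate(newCapacity)
      buffer.flip()
      grown.put(buffer)
      buffer = grown
    }
  }

  def append(bytes: Array[Byte]): Unit = {
    ensureFreeSpace(bytes.length)
    buffer.put(bytes)
  }

  // Trim: copy only the bytes actually written into a right-sized buffer.
  def build(): ByteBuffer = {
    buffer.flip()
    val trimmed = ByteBuffer.allocate(buffer.limit())
    trimmed.put(buffer)
    trimmed.flip()
    trimmed
  }
}
```

Doubling bounds the number of copies while building, and the final trim releases the slack so the cached batch only retains the memory it actually uses.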
Diffstat (limited to 'project/MimaExcludes.scala')
0 files changed, 0 insertions, 0 deletions