[SPARK-2650][SQL] Try to partially fix SPARK-2650 by adjusting initial buffer size and reducing memory allocation - spark


author	Cheng Lian <lian.cs.zju@gmail.com>	2014-08-05 18:50:37 -0700
committer	Michael Armbrust <michael@databricks.com>	2014-08-05 18:50:37 -0700
commit	d0ae3f3912104a8227cd964c42e229a297a48ffa (patch)
tree	6b16955b71b1de0cf1ea9005b10f1c011a954421 /python
parent	d94f5990e5685642a188db958b0341e5477b8efc (diff)

[SPARK-2650][SQL] Try to partially fix SPARK-2650 by adjusting initial buffer size and reducing memory allocation

JIRA issue: [SPARK-2650](https://issues.apache.org/jira/browse/SPARK-2650)

Please refer to [comments](https://issues.apache.org/jira/browse/SPARK-2650?focusedCommentId=14084397&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14084397) of SPARK-2650 for some other details.

This PR adjusts the initial in-memory columnar buffer size to 1MB, same as the default value of Shark's `shark.column.partitionSize.mb` property when running in local mode. Will add Shark style partition size estimation in another PR.

Also, before this PR, `NullableColumnBuilder` copies the whole buffer to add the null positions section, and then `CompressibleColumnBuilder` copies and compresses the buffer again, even if compression is disabled (`PassThrough` compression scheme is used to disable compression). In this PR the first buffer copy is eliminated to reduce memory consumption.

Author: Cheng Lian <lian.cs.zju@gmail.com>

Closes #1769 from liancheng/spark-2650 and squashes the following commits:

88a042e [Cheng Lian] Fixed method visibility and removed dead code
001f2e5 [Cheng Lian] Try fixing SPARK-2650 by adjusting initial buffer size and reducing memory allocation
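The buffer-copy change described in the commit message can be illustrated with a small, language-neutral sketch. This is not Spark's Scala code: the function names (`pack_header`, `build_two_pass`, `build_single_pass`, `pass_through`) and the byte layout (a 4-byte type id, a 4-byte null count, 4-byte null positions, then the values) are assumptions made only for illustration. The sketch contrasts a build that first materializes an intermediate buffer and then copies it again with a build that writes the final buffer once.

```python
import struct


def pack_header(type_id, null_positions):
    """Pack the assumed header: type id, null count, then each null position."""
    count = len(null_positions)
    header = struct.pack("<ii", type_id, count)
    if null_positions:
        header += struct.pack(f"<{count}i", *null_positions)
    return header


def pass_through(raw):
    """Stand-in for a scheme that disables compression: returns the bytes unchanged."""
    return raw


def build_two_pass(type_id, null_positions, values, encode):
    """Hypothetical 'before' shape: an intermediate buffer is built, then copied again."""
    header = pack_header(type_id, null_positions)
    # First copy: the nullable step materializes header + raw values.
    intermediate = header + bytes(values)
    # Second copy: the compressible step re-reads the values and rebuilds the buffer.
    body = encode(intermediate[len(header):])
    return intermediate[:len(header)] + body


def build_single_pass(type_id, null_positions, values, encode):
    """Hypothetical 'after' shape: encode the values once, write the final buffer once."""
    header = pack_header(type_id, null_positions)
    body = encode(values)
    out = bytearray(len(header) + len(body))  # single allocation of the final size
    out[:len(header)] = header
    out[len(header):] = body
    return out


if __name__ == "__main__":
    values = struct.pack("<3i", 10, 20, 30)  # three non-null 32-bit values
    before = build_two_pass(0, [1, 3], values, pass_through)
    after = build_single_pass(0, [1, 3], values, pass_through)
    print(bytes(before) == bytes(after))
```

Both functions produce the same bytes; the difference is only that the second avoids the intermediate header-plus-values buffer, which is the kind of saving the commit message attributes to removing the first copy.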

Diffstat (limited to 'python')

0 files changed, 0 insertions, 0 deletions

