author | Cheng Lian <lian.cs.zju@gmail.com> | 2014-08-05 18:50:37 -0700 |
---|---|---|
committer | Michael Armbrust <michael@databricks.com> | 2014-08-05 18:50:37 -0700 |
commit | d0ae3f3912104a8227cd964c42e229a297a48ffa (patch) | |
tree | 6b16955b71b1de0cf1ea9005b10f1c011a954421 /python | |
parent | d94f5990e5685642a188db958b0341e5477b8efc (diff) | |
[SPARK-2650][SQL] Try to partially fix SPARK-2650 by adjusting initial buffer size and reducing memory allocation
JIRA issue: [SPARK-2650](https://issues.apache.org/jira/browse/SPARK-2650)
See the [comments](https://issues.apache.org/jira/browse/SPARK-2650?focusedCommentId=14084397&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14084397) on SPARK-2650 for further details.
This PR sets the initial in-memory columnar buffer size to 1MB, the same as the default value of Shark's `shark.column.partitionSize.mb` property when running in local mode. Shark-style partition size estimation will be added in a separate PR.
Also, before this PR, `NullableColumnBuilder` copied the whole buffer to add the null positions section, and `CompressibleColumnBuilder` then copied and compressed the buffer again, even when compression was disabled (the `PassThrough` compression scheme is used to disable compression). This PR eliminates the first buffer copy to reduce memory consumption.
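The effect of dropping the first copy can be illustrated with a small, language-neutral sketch. This is not the Spark code (which is Scala built on `java.nio.ByteBuffer`). The buffer layout used here (type ID, null count, null positions, scheme ID, values), the `PASS_THROUGH` constant, and both function names are assumptions made only for illustration. The sketch compares how many bytes each flow allocates to produce the same final buffer.

```python
import struct

INT = struct.Struct(">i")  # 4-byte big-endian int
PASS_THROUGH = 0           # hypothetical scheme ID meaning "no compression"


def pack_nulls(null_positions):
    """Serialize null positions as consecutive 4-byte ints."""
    return b"".join(INT.pack(p) for p in null_positions)


def build_with_intermediate_copy(type_id, null_positions, values):
    """Two-copy flow: a nullable stage materializes a full buffer,
    then a pass-through 'compression' stage copies it again."""
    nulls = pack_nulls(null_positions)
    header_len = 8 + len(nulls)

    # Copy 1: header + null positions + values in an intermediate buffer.
    intermediate = bytearray(header_len + len(values))
    INT.pack_into(intermediate, 0, type_id)
    INT.pack_into(intermediate, 4, len(null_positions))
    intermediate[8:header_len] = nulls
    intermediate[header_len:] = values

    # Copy 2: re-copy everything into the final buffer, adding the scheme ID.
    final = bytearray(len(intermediate) + 4)
    final[:header_len] = intermediate[:header_len]
    INT.pack_into(final, header_len, PASS_THROUGH)
    final[header_len + 4:] = intermediate[header_len:]
    return final, len(intermediate) + len(final)


def build_single_copy(type_id, null_positions, values):
    """One-copy flow: write header, null positions, scheme ID, and values
    straight into the final buffer."""
    nulls = pack_nulls(null_positions)
    header_len = 8 + len(nulls)

    final = bytearray(header_len + 4 + len(values))
    INT.pack_into(final, 0, type_id)
    INT.pack_into(final, 4, len(null_positions))
    final[8:header_len] = nulls
    INT.pack_into(final, header_len, PASS_THROUGH)
    final[header_len + 4:] = values
    return final, len(final)


values = bytes(range(16))
old_buf, old_alloc = build_with_intermediate_copy(1, [2, 5], values)
new_buf, new_alloc = build_single_copy(1, [2, 5], values)
print(old_buf == new_buf, old_alloc, new_alloc)
```

Both functions produce identical bytes. The single-copy version allocates only the final buffer, so the saving grows with the size of the column data.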
Author: Cheng Lian <lian.cs.zju@gmail.com>
Closes #1769 from liancheng/spark-2650 and squashes the following commits:
88a042e [Cheng Lian] Fixed method visibility and removed dead code
001f2e5 [Cheng Lian] Try fixing SPARK-2650 by adjusting initial buffer size and reducing memory allocation
Diffstat (limited to 'python')
0 files changed, 0 insertions, 0 deletions