author Josh Rosen <joshrosen@databricks.com> 2016-03-17 20:00:56 -0700
committer Josh Rosen <joshrosen@databricks.com> 2016-03-17 20:00:56 -0700
commit 6c2d894a2f8f7a29ec6fc8163e41c24bb70c3109
tree 715e352a7e88818d315c92e8ae5c23e9a26b90ab /common
parent 6037ed0a1d7ecbb77140ddf4d0192a1dc60316bb
[SPARK-13921] Store serialized blocks as multiple chunks in MemoryStore
This patch modifies the BlockManager, MemoryStore, and several other storage components so that serialized cached blocks are stored as multiple small chunks rather than as a single contiguous ByteBuffer.

This change will help to improve the efficiency of memory allocation and the accuracy of memory accounting when serializing blocks. Our current serialization code uses a ByteBufferOutputStream, which doubles and re-allocates its backing byte array; this increases the peak memory requirements during serialization (since we need to hold extra memory while expanding the array). In addition, we currently don't account for the extra wasted space at the end of the ByteBuffer's backing array, so a 129-megabyte serialized block may actually consume 256 megabytes of memory. After switching to storing blocks in multiple chunks, we'll be able to efficiently trim the backing buffers so that no space is wasted.

This change is also a prerequisite to being able to cache blocks which are larger than 2GB (although full support for that depends on several other changes which have not been implemented yet).

Author: Josh Rosen <joshrosen@databricks.com>

Closes #11748 from JoshRosen/chunked-block-serialization.
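The 129 MB versus 256 MB figure in the message follows from plain doubling arithmetic. The sketch below is illustrative only and is not code from this patch: the class name `ChunkedAllocationSketch`, the 32-byte initial capacity, and the 4 MB chunk size are all assumptions chosen for the example. It compares the capacity a doubling backing array ends up with against the total held by fixed-size chunks whose last chunk is trimmed.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; none of these names come from Spark.
public class ChunkedAllocationSketch {

  /** Final capacity of a backing array that starts at initialCapacity and doubles until it can hold size bytes. */
  public static long doublingCapacity(long size, long initialCapacity) {
    long capacity = initialCapacity;
    while (capacity < size) {
      capacity *= 2; // each growth step re-allocates an array twice as large
    }
    return capacity;
  }

  /** Sizes of the chunks needed to hold size bytes when the final chunk is trimmed to fit. */
  public static List<Long> chunkSizes(long size, long chunkSize) {
    List<Long> chunks = new ArrayList<>();
    long remaining = size;
    while (remaining > 0) {
      long n = Math.min(remaining, chunkSize);
      chunks.add(n);
      remaining -= n;
    }
    return chunks;
  }

  public static void main(String[] args) {
    final long MB = 1024L * 1024L;
    long blockSize = 129 * MB;

    // Assumed starting capacity of 32 bytes, doubling on every overflow.
    long doubled = doublingCapacity(blockSize, 32);

    // Assumed chunk size of 4 MB (hypothetical, not taken from the patch).
    long chunkedTotal = 0;
    for (long c : chunkSizes(blockSize, 4 * MB)) {
      chunkedTotal += c;
    }

    System.out.println("doubling array capacity (MB): " + doubled / MB);
    System.out.println("chunked total (MB): " + chunkedTotal / MB);
  }
}
```

Under these assumptions the doubling array overshoots to the next power of two, while the trimmed chunks add up to exactly the block size.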
Diffstat (limited to 'common')
-rw-r--r-- common/network-common/src/main/java/org/apache/spark/network/buffer/NettyManagedBuffer.java 2
1 files changed, 1 insertions, 1 deletions
diff --git a/common/network-common/src/main/java/org/apache/spark/network/buffer/NettyManagedBuffer.java b/common/network-common/src/main/java/org/apache/spark/network/buffer/NettyManagedBuffer.java
index 4c8802af7a..acc49d968c 100644
--- a/common/network-common/src/main/java/org/apache/spark/network/buffer/NettyManagedBuffer.java
+++ b/common/network-common/src/main/java/org/apache/spark/network/buffer/NettyManagedBuffer.java
@@ -28,7 +28,7 @@ import io.netty.buffer.ByteBufInputStream;
 /**
  * A {@link ManagedBuffer} backed by a Netty {@link ByteBuf}.
  */
-public final class NettyManagedBuffer extends ManagedBuffer {
+public class NettyManagedBuffer extends ManagedBuffer {
   private final ByteBuf buf;
 
   public NettyManagedBuffer(ByteBuf buf) {