author:    Zhang, Liye <liye.zhang@intel.com>  2016-04-06 16:11:59 -0700
committer: Marcelo Vanzin <vanzin@cloudera.com>  2016-04-06 16:11:59 -0700
commit:    c4bb02abf2c5b1724f2f848c79da5ebbf2584e45 (patch)
tree:      75879aa78b5e6d1a5a53f82595cb7c79cb1175e8 /common
parent:    d717ae1fd74d125a9df21350a70e7c2b2d2b4786 (diff)
[SPARK-14290][CORE][NETWORK] avoid significant memory copy in netty's transferTo
## What changes were proposed in this pull request?

When Netty transfers data that is not a `FileRegion`, the data arrives as a `ByteBuf`. If the data is large, a significant performance problem appears, because `sun.nio.ch.IOUtil.write` makes a memory copy of the whole buffer under the hood: CPU usage sits at 100% while network throughput stays very low. In this PR, when the data is large we split it into small chunks for each `WritableByteChannel.write()` call, which bounds the wasted memory copy. Large data cannot be written in a single write anyway, so `transferTo` is called multiple times until the buffer is drained.

## How was this patch tested?

Spark unit tests and a manual test.

Manual test:

`sc.parallelize(Array(1,2,3),3).mapPartitions(a=>Array(new Array[Double](1024 * 1024 * 50)).iterator).reduce((a,b)=> a).length`

For more details, please refer to [SPARK-14290](https://issues.apache.org/jira/browse/SPARK-14290).

Author: Zhang, Liye <liye.zhang@intel.com>

Closes #12083 from liyezhang556520/spark-14290.
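For readers skimming the diff below, here is a minimal, self-contained sketch of the chunked-write technique this patch applies. The names `ChunkedWriteSketch`, `writeChunk`, and `CHUNK_LIMIT` are illustrative, not from the patch; the patch itself uses `NIO_BUFFER_LIMIT` inside `MessageWithHeader.writeNioBuffer`.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.WritableByteChannel;

final class ChunkedWriteSketch {

    // Illustrative limit; the patch uses NIO_BUFFER_LIMIT = 256 * 1024.
    private static final int CHUNK_LIMIT = 256 * 1024;

    /**
     * Writes at most CHUNK_LIMIT bytes of buf to ch by temporarily shrinking
     * the buffer's limit, then restoring it. The buffer's position advances by
     * the number of bytes actually written, so the caller can simply invoke
     * this repeatedly (as Netty does via transferTo) until the buffer drains.
     */
    static int writeChunk(WritableByteChannel ch, ByteBuffer buf) throws IOException {
        int originalLimit = buf.limit();
        try {
            int ioSize = Math.min(buf.remaining(), CHUNK_LIMIT);
            buf.limit(buf.position() + ioSize); // expose only one chunk to write()
            return ch.write(buf);               // underlying copy is now bounded
        } finally {
            buf.limit(originalLimit);           // restore the full view
        }
    }
}
```

The key design choice is to shrink the buffer's `limit` rather than slice or copy it: `write()` then sees at most one chunk's worth of bytes, so the internal copy performed by `sun.nio.ch.IOUtil.write` is bounded regardless of the total buffer size.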
Diffstat (limited to 'common')
-rw-r--r--  common/network-common/src/main/java/org/apache/spark/network/protocol/MessageWithHeader.java | 30
1 file changed, 29 insertions(+), 1 deletion(-)
diff --git a/common/network-common/src/main/java/org/apache/spark/network/protocol/MessageWithHeader.java b/common/network-common/src/main/java/org/apache/spark/network/protocol/MessageWithHeader.java
index 66227f96a1..4f8781b42a 100644
--- a/common/network-common/src/main/java/org/apache/spark/network/protocol/MessageWithHeader.java
+++ b/common/network-common/src/main/java/org/apache/spark/network/protocol/MessageWithHeader.java
@@ -18,6 +18,7 @@
package org.apache.spark.network.protocol;
import java.io.IOException;
+import java.nio.ByteBuffer;
import java.nio.channels.WritableByteChannel;
import javax.annotation.Nullable;
@@ -44,6 +45,14 @@ class MessageWithHeader extends AbstractReferenceCounted implements FileRegion {
private long totalBytesTransferred;
/**
+ * When the write buffer size is larger than this limit, I/O will be done in chunks of this size.
+ * The size should not be too large: if the buffer available to the network layer is smaller
+ * than this limit, the data cannot be sent in a single write operation, yet the underlying
+ * memory copy is still made at this full chunk size.
+ */
+ private static final int NIO_BUFFER_LIMIT = 256 * 1024;
+
+ /**
* Construct a new MessageWithHeader.
*
* @param managedBuffer the {@link ManagedBuffer} that the message body came from. This needs to
@@ -128,8 +137,27 @@ class MessageWithHeader extends AbstractReferenceCounted implements FileRegion {
}
private int copyByteBuf(ByteBuf buf, WritableByteChannel target) throws IOException {
- int written = target.write(buf.nioBuffer());
+ ByteBuffer buffer = buf.nioBuffer();
+ int written = (buffer.remaining() <= NIO_BUFFER_LIMIT) ?
+ target.write(buffer) : writeNioBuffer(target, buffer);
buf.skipBytes(written);
return written;
}
+
+ private int writeNioBuffer(
+ WritableByteChannel writeCh,
+ ByteBuffer buf) throws IOException {
+ int originalLimit = buf.limit();
+ int ret = 0;
+
+ try {
+ int ioSize = Math.min(buf.remaining(), NIO_BUFFER_LIMIT);
+ buf.limit(buf.position() + ioSize);
+ ret = writeCh.write(buf);
+ } finally {
+ buf.limit(originalLimit);
+ }
+
+ return ret;
+ }
}
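As a hedged usage sketch (not part of the commit), the chunked write can be exercised against a plain file channel; in Spark, Netty drives this loop itself by invoking `transferTo` repeatedly until the `ByteBuf` is drained. The temp-file path and `main` harness are illustrative only, and the loop reuses the hypothetical `writeChunk` helper from the sketch above.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.WritableByteChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class ChunkedWriteDemo {
    public static void main(String[] args) throws IOException {
        // A buffer larger than one chunk, so several bounded writes are needed.
        ByteBuffer data = ByteBuffer.allocate(3 * 256 * 1024 + 123);
        Path tmp = Files.createTempFile("chunked-write", ".bin");
        try (WritableByteChannel ch = Files.newByteChannel(
                tmp, StandardOpenOption.WRITE, StandardOpenOption.TRUNCATE_EXISTING)) {
            // Mirrors Netty calling transferTo until all bytes are transferred.
            while (data.hasRemaining()) {
                ChunkedWriteSketch.writeChunk(ch, data);
            }
        }
        System.out.println("wrote " + Files.size(tmp) + " bytes");
        Files.delete(tmp);
    }
}
```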