author Patrick Wendell <pwendell@gmail.com> 2014-04-27 17:40:56 -0700
committer Patrick Wendell <pwendell@gmail.com> 2014-04-27 17:40:56 -0700
commit 6b3c6e5dd8e74435f71ecdb224db532550ef407b
tree 9ff06e8ce1d8dd01b2ccb5e4820adbdd02310298 /docs/configuration.md
parent 3d9fb09681308abd2066d0d02f2438f5a17c9dd9
SPARK-1145: Memory mapping with many small blocks can cause JVM allocation failures
This includes some minor code clean-up as well. The main change is that small files are not memory mapped. There is a nicer way to write that code block using Scala's `Try` but to make it easy to back port and as simple as possible, I opted for the more explicit but less pretty format.

Author: Patrick Wendell <pwendell@gmail.com>

Closes #43 from pwendell/block-iter-logging and squashes the following commits:

1cff512 [Patrick Wendell] Small issue from merge.
49f6c269 [Patrick Wendell] Merge remote-tracking branch 'apache/master' into block-iter-logging
4943351 [Patrick Wendell] Added a test and feedback on mateis review
a637a18 [Patrick Wendell] Review feedback and adding rewind() when reading byte buffers.
b76b95f [Patrick Wendell] Review feedback
4e1514e [Patrick Wendell] Don't memory map for small files
d238b88 [Patrick Wendell] Some logging and clean-up
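
The idea of not memory mapping small files can be illustrated with a minimal sketch in plain Java NIO. This is not the code from this commit: the class `BlockReader`, the method `readBlock`, and the choice to map blocks whose size equals the threshold are all assumptions made for illustration. Only the 8192-byte default comes from the documentation change in this commit.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper, not part of Spark.
final class BlockReader {
    // Mirrors the documented default of spark.storage.memoryMapThreshold.
    static final long DEFAULT_THRESHOLD = 8192L;

    // Reads a whole file, memory mapping it only when it is large enough.
    static ByteBuffer readBlock(Path file, long threshold) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            long length = channel.size();
            if (length < threshold) {
                // Small block: copy into a heap buffer instead of mapping.
                ByteBuffer buf = ByteBuffer.allocate((int) length);
                while (buf.hasRemaining()) {
                    if (channel.read(buf) == -1) {
                        throw new IOException("Unexpected end of file: " + file);
                    }
                }
                // Reset the position so callers read from the start.
                buf.flip();
                return buf;
            }
            // Large block: map it read-only. The mapping stays valid
            // after the channel is closed.
            return channel.map(FileChannel.MapMode.READ_ONLY, 0, length);
        }
    }
}
```

With the default threshold, a call such as `BlockReader.readBlock(path, BlockReader.DEFAULT_THRESHOLD)` would return an ordinary heap buffer for a 100-byte file and a `MappedByteBuffer` for a 10000-byte file.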
Diffstat (limited to 'docs/configuration.md')
-rw-r--r-- docs/configuration.md 9
1 files changed, 9 insertions, 0 deletions
diff --git a/docs/configuration.md b/docs/configuration.md
index 8d3442625b..b078c7c111 100644
--- a/docs/configuration.md
+++ b/docs/configuration.md
@@ -132,6 +132,15 @@ Apart from these, the following properties are also available, and may be useful
</td>
</tr>
<tr>
+ <td>spark.storage.memoryMapThreshold</td>
+ <td>8192</td>
+ <td>
+ Size of a block, in bytes, above which Spark memory maps when reading a block from disk.
+ This prevents Spark from memory mapping very small blocks. In general, memory
+ mapping has high overhead for blocks close to or below the page size of the operating system.
+ </td>
+</tr>
+<tr>
<td>spark.tachyonStore.baseDir</td>
<td>System.getProperty("java.io.tmpdir")</td>
<td>