From 21b5478ed6f7eb1056f40d11f0400076e843c74e Mon Sep 17 00:00:00 2001
From: Neal Wiggins
Date: Wed, 20 Nov 2013 16:19:25 -0800
Subject: Fix Kryo Serializer buffer inconsistency

The documentation here is inconsistent with the coded default and other documentation.
---
 docs/tuning.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

(limited to 'docs/tuning.md')

diff --git a/docs/tuning.md b/docs/tuning.md
index f491ae9b95..f33fda37eb 100644
--- a/docs/tuning.md
+++ b/docs/tuning.md
@@ -67,7 +67,7 @@ The [Kryo documentation](http://code.google.com/p/kryo/) describes more advanced
 registration options, such as adding custom serialization code.
 
 If your objects are large, you may also need to increase the `spark.kryoserializer.buffer.mb`
-system property. The default is 32, but this value needs to be large enough to hold the *largest*
+system property. The default is 2, but this value needs to be large enough to hold the *largest*
 object you will serialize.
 
 Finally, if you don't register your classes, Kryo will still work, but it will have to store the
-- 
cgit v1.2.3

From 08afef37a07c501b1ba14e3d6da445712852ca1e Mon Sep 17 00:00:00 2001
From: Andrew Ash
Date: Mon, 25 Nov 2013 17:08:52 -0800
Subject: Update tuning.md

Clarify when serializer is used based on recent user@ mailing list discussion.
---
 docs/tuning.md | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

(limited to 'docs/tuning.md')

diff --git a/docs/tuning.md b/docs/tuning.md
index f33fda37eb..a4be188169 100644
--- a/docs/tuning.md
+++ b/docs/tuning.md
@@ -39,7 +39,8 @@ in your operations) and performance. It provides two serialization libraries:
 for best performance.
 
 You can switch to using Kryo by calling `System.setProperty("spark.serializer", "org.apache.spark.serializer.KryoSerializer")`
-*before* creating your SparkContext. The only reason it is not the default is because of the custom
+*before* creating your SparkContext. This setting configures the serializer used for not only shuffling data between worker
+nodes but also when serializing RDDs to disk. The only reason Kryo is not the default is because of the custom
 registration requirement, but we recommend trying it in any network-intensive application.
 
 Finally, to register your classes with Kryo, create a public class that extends
-- 
cgit v1.2.3