Diffstat (limited to 'docs/tuning.md')
-rw-r--r-- | docs/tuning.md | 4
1 files changed, 2 insertions, 2 deletions
diff --git a/docs/tuning.md b/docs/tuning.md
index cc069f0e84..78e10770a8 100644
--- a/docs/tuning.md
+++ b/docs/tuning.md
@@ -48,7 +48,7 @@ Spark automatically includes Kryo serializers for the many commonly-used core Sc
 in the AllScalaRegistrar from the [Twitter chill](https://github.com/twitter/chill) library.
 
 To register your own custom classes with Kryo, create a public class that extends
-[`org.apache.spark.serializer.KryoRegistrator`](api/core/index.html#org.apache.spark.serializer.KryoRegistrator) and set the
+[`org.apache.spark.serializer.KryoRegistrator`](api/scala/index.html#org.apache.spark.serializer.KryoRegistrator) and set the
 `spark.kryo.registrator` config property to point to it, as follows:
 
 {% highlight scala %}
@@ -222,7 +222,7 @@ enough. Spark automatically sets the number of "map" tasks to run on each file a
 (though you can control it through optional parameters to `SparkContext.textFile`, etc), and for
 distributed "reduce" operations, such as `groupByKey` and `reduceByKey`, it uses the largest
 parent RDD's number of partitions. You can pass the level of parallelism as a second argument
-(see the [`spark.PairRDDFunctions`](api/core/index.html#org.apache.spark.rdd.PairRDDFunctions) documentation),
+(see the [`spark.PairRDDFunctions`](api/scala/index.html#org.apache.spark.rdd.PairRDDFunctions) documentation),
 or set the config property `spark.default.parallelism` to change the default.
 In general, we recommend 2-3 tasks per CPU core in your cluster.
 