Diffstat (limited to 'docs')
 docs/configuration.md           | 10 +++++-----
 docs/quick-start.md             |  2 +-
 docs/scala-programming-guide.md |  2 +-
 docs/tuning.md                  | 10 +++++-----
 4 files changed, 12 insertions(+), 12 deletions(-)
diff --git a/docs/configuration.md b/docs/configuration.md
index 55df18b6fb..58e9434bdc 100644
--- a/docs/configuration.md
+++ b/docs/configuration.md
@@ -36,13 +36,13 @@ there are at least five properties that you will commonly want to control:
</tr>
<tr>
<td>spark.serializer</td>
- <td>org.apache.spark.JavaSerializer</td>
+ <td>org.apache.spark.serializer.<br />JavaSerializer</td>
<td>
Class to use for serializing objects that will be sent over the network or need to be cached
in serialized form. The default of Java serialization works with any Serializable Java object but is
- quite slow, so we recommend <a href="tuning.html">using <code>org.apache.spark.KryoSerializer</code>
+ quite slow, so we recommend <a href="tuning.html">using <code>org.apache.spark.serializer.KryoSerializer</code>
and configuring Kryo serialization</a> when speed is necessary. Can be any subclass of
- <a href="api/core/index.html#org.apache.spark.Serializer"><code>org.apache.spark.Serializer</code></a>.
+ <a href="api/core/index.html#org.apache.spark.serializer.Serializer"><code>org.apache.spark.Serializer</code></a>.
</td>
</tr>
<tr>
@@ -51,7 +51,7 @@ there are at least five properties that you will commonly want to control:
<td>
If you use Kryo serialization, set this class to register your custom classes with Kryo.
It should be set to a class that extends
- <a href="api/core/index.html#org.apache.spark.KryoRegistrator"><code>KryoRegistrator</code></a>.
+ <a href="api/core/index.html#org.apache.spark.serializer.KryoRegistrator"><code>KryoRegistrator</code></a>.
See the <a href="tuning.html#data-serialization">tuning guide</a> for more details.
</td>
</tr>
@@ -171,7 +171,7 @@ Apart from these, the following properties are also available, and may be useful
</tr>
<tr>
<td>spark.closure.serializer</td>
- <td>org.apache.spark.JavaSerializer</td>
+ <td>org.apache.spark.serializer.<br />JavaSerializer</td>
<td>
Serializer class to use for closures. Generally Java is fine unless your distributed functions
(e.g. map functions) reference large objects in the driver program.
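
To put the two renamed defaults above side by side, here is a minimal sketch (not part of this patch; the master URL and application name are placeholders) of setting them from application code before the SparkContext is created:

{% highlight scala %}
import org.apache.spark.SparkContext

// Both properties now take the fully qualified names under
// org.apache.spark.serializer, and must be set before the context exists.
System.setProperty("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
System.setProperty("spark.closure.serializer", "org.apache.spark.serializer.JavaSerializer")

val sc = new SparkContext("local", "Config Example")
{% endhighlight %}
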
diff --git a/docs/quick-start.md b/docs/quick-start.md
index 8cbc55bed2..70c3df8095 100644
--- a/docs/quick-start.md
+++ b/docs/quick-start.md
@@ -109,7 +109,7 @@ We'll create a very simple Spark job in Scala. So simple, in fact, that it's nam
{% highlight scala %}
/*** SimpleJob.scala ***/
import org.apache.spark.SparkContext
-import SparkContext._
+import org.apache.spark.SparkContext._
object SimpleJob {
def main(args: Array[String]) {
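
For context, a self-contained sketch of how the corrected import is used follows; the job below is illustrative only (the input path, counting logic, and names are not taken from the quick-start guide), but it shows why `org.apache.spark.SparkContext._` matters: it brings in the implicit conversions that make pair-RDD operations such as `reduceByKey` available.

{% highlight scala %}
/*** Illustrative only -- not the guide's SimpleJob ***/
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._   // implicit conversions for pair RDDs

object TinyJob {
  def main(args: Array[String]) {
    val sc = new SparkContext("local", "Tiny Job")
    val counts = sc.textFile("data.txt")            // hypothetical input file
                   .flatMap(_.split(" "))
                   .map(word => (word, 1))
                   .reduceByKey(_ + _)              // needs SparkContext._
    println("Distinct words: " + counts.count())
  }
}
{% endhighlight %}
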
diff --git a/docs/scala-programming-guide.md b/docs/scala-programming-guide.md
index 2cf319a263..f7768e55fc 100644
--- a/docs/scala-programming-guide.md
+++ b/docs/scala-programming-guide.md
@@ -37,7 +37,7 @@ Finally, you need to import some Spark classes and implicit conversions into you
{% highlight scala %}
import org.apache.spark.SparkContext
-import SparkContext._
+import org.apache.spark.SparkContext._
{% endhighlight %}
# Initializing Spark
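
Since the hunk ends at the guide's "Initializing Spark" heading, a hedged sketch of the next step with the corrected imports in scope may help; the master string, application name, and sample computation are placeholders, not values from the guide:

{% highlight scala %}
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._   // implicit conversions (pair RDDs, numeric RDDs, ...)

// Placeholder master and application name.
val sc = new SparkContext("local[2]", "Guide Example")

// The second import is what makes numeric helpers such as sum() available:
val total = sc.parallelize(1 to 100).map(_.toDouble).sum()
{% endhighlight %}
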
diff --git a/docs/tuning.md b/docs/tuning.md
index 3563d110c9..28d88a2659 100644
--- a/docs/tuning.md
+++ b/docs/tuning.md
@@ -38,17 +38,17 @@ in your operations) and performance. It provides two serialization libraries:
`Serializable` types and requires you to *register* the classes you'll use in the program in advance
for best performance.
-You can switch to using Kryo by calling `System.setProperty("spark.serializer", "org.apache.spark.KryoSerializer")`
+You can switch to using Kryo by calling `System.setProperty("spark.serializer", "org.apache.spark.serializer.KryoSerializer")`
*before* creating your SparkContext. The only reason it is not the default is because of the custom
registration requirement, but we recommend trying it in any network-intensive application.
Finally, to register your classes with Kryo, create a public class that extends
-[`org.apache.spark.KryoRegistrator`](api/core/index.html#org.apache.spark.KryoRegistrator) and set the
+[`org.apache.spark.serializer.KryoRegistrator`](api/core/index.html#org.apache.spark.serializer.KryoRegistrator) and set the
`spark.kryo.registrator` system property to point to it, as follows:
{% highlight scala %}
import com.esotericsoftware.kryo.Kryo
-import org.apache.spark.KryoRegistrator
+import org.apache.spark.serializer.KryoRegistrator
class MyRegistrator extends KryoRegistrator {
override def registerClasses(kryo: Kryo) {
@@ -58,7 +58,7 @@ class MyRegistrator extends KryoRegistrator {
}
// Make sure to set these properties *before* creating a SparkContext!
-System.setProperty("spark.serializer", "org.apache.spark.KryoSerializer")
+System.setProperty("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
System.setProperty("spark.kryo.registrator", "mypackage.MyRegistrator")
val sc = new SparkContext(...)
{% endhighlight %}
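
The body of `registerClasses` falls between the two hunks and is not shown here; purely as a hypothetical illustration (the class names below are made up), a registrator typically registers each application class explicitly:

{% highlight scala %}
import com.esotericsoftware.kryo.Kryo
import org.apache.spark.serializer.KryoRegistrator

// Hypothetical application class; substitute the types you actually ship around.
class Point(val x: Double, val y: Double) extends Serializable

class ExampleRegistrator extends KryoRegistrator {
  override def registerClasses(kryo: Kryo) {
    kryo.register(classOf[Point])
    kryo.register(classOf[Array[Point]])
  }
}
{% endhighlight %}
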
@@ -217,7 +217,7 @@ enough. Spark automatically sets the number of "map" tasks to run on each file a
(though you can control it through optional parameters to `SparkContext.textFile`, etc), and for
distributed "reduce" operations, such as `groupByKey` and `reduceByKey`, it uses the largest
parent RDD's number of partitions. You can pass the level of parallelism as a second argument
-(see the [`spark.PairRDDFunctions`](api/core/index.html#org.apache.spark.PairRDDFunctions) documentation),
+(see the [`org.apache.spark.rdd.PairRDDFunctions`](api/core/index.html#org.apache.spark.rdd.PairRDDFunctions) documentation),
or set the system property `spark.default.parallelism` to change the default.
In general, we recommend 2-3 tasks per CPU core in your cluster.
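
As a concrete illustration of the two options mentioned in this hunk, a hedged sketch follows; the input path, the choice of 16 partitions, and the master string are placeholders rather than recommendations from the guide:

{% highlight scala %}
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._

// Placeholder default; must be set before the SparkContext is created.
System.setProperty("spark.default.parallelism", "16")
val sc = new SparkContext("local[4]", "Parallelism Example")

val pairs = sc.textFile("data.txt").map(line => (line.length, 1))
// Or pass the level of parallelism explicitly as a second argument:
val counts = pairs.reduceByKey(_ + _, 16)
println(counts.count())
{% endhighlight %}
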