path: root/docs/tuning.md
author    Andrew Ash <andrew@andrewash.com>  2014-12-10 15:01:15 -0800
committer Patrick Wendell <pwendell@gmail.com>  2014-12-10 15:01:15 -0800
commit    652b781a9b543cb17d7da91f5c3bebe5a02e0478 (patch)
tree      1840a2dacc0226611a8c328e79021267345c7cd7 /docs/tuning.md
parent    36bdb5b748ff670a9bafd787e40c9e142c9a85b9 (diff)
SPARK-3526 Add section about data locality to the tuning guide
cc kayousterhout I have a few outstanding questions from compiling this documentation:

- What's the difference between NO_PREF and ANY? I understand the implications of the ordering but don't know what an example of each would be
- Why is NO_PREF ahead of RACK_LOCAL? I would think it'd be better to schedule rack-local tasks ahead of no preference if you could only do one or the other. Is the idea to wait longer and hope for the rack-local tasks to turn into node-local or better?
- Will there be a datacenter-local locality level in the future? Apache Cassandra for example has this level

Author: Andrew Ash <andrew@andrewash.com>

Closes #2519 from ash211/SPARK-3526 and squashes the following commits:

44cff28 [Andrew Ash] Link to spark.locality parameters rather than copying the list
6d5d966 [Andrew Ash] Stay focused on Spark, no astronaut architecture mumbo-jumbo
20e0e31 [Andrew Ash] SPARK-3526 Add section about data locality to the tuning guide
Diffstat (limited to 'docs/tuning.md')
-rw-r--r--  docs/tuning.md  33
1 file changed, 33 insertions(+), 0 deletions(-)
diff --git a/docs/tuning.md b/docs/tuning.md
index 0e2447dd46..c4ca766328 100644
--- a/docs/tuning.md
+++ b/docs/tuning.md
@@ -233,6 +233,39 @@ Spark prints the serialized size of each task on the master, so you can look at
decide whether your tasks are too large; in general tasks larger than about 20 KB are probably
worth optimizing.
+## Data Locality
+
+Data locality can have a major impact on the performance of Spark jobs. If the data and the code that
+operates on it are together, computation tends to be fast. But if code and data are separated,
+one must move to the other. Typically it is faster to ship serialized code from place to place than
+a chunk of data, because code size is much smaller than data. Spark builds its scheduling around
+this general principle of data locality.
+
+Data locality is how close data is to the code processing it. There are several levels of
+locality based on the data's current location. In order from closest to farthest:
+
+- `PROCESS_LOCAL` data is in the same JVM as the running code. This is the best locality
+ possible
+- `NODE_LOCAL` data is on the same node. Examples might be in HDFS on the same node, or in
+ another executor on the same node. This is a little slower than `PROCESS_LOCAL` because the data
+ has to travel between processes
+- `NO_PREF` data is accessed equally quickly from anywhere and has no locality preference
+- `RACK_LOCAL` data is on the same rack of servers. Data is on a different server on the same rack
+ so needs to be sent over the network, typically through a single switch
+- `ANY` data is elsewhere on the network and not in the same rack
+
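The preference order above can be sketched in code. The following is an illustrative Python model only, not Spark's actual scheduler (which is written in Scala); the `Executor` shape and the `preferred` dict of locations are hypothetical:

```python
from collections import namedtuple

# Locality levels, in order from closest to farthest, as listed above.
LOCALITY_LEVELS = ["PROCESS_LOCAL", "NODE_LOCAL", "NO_PREF", "RACK_LOCAL", "ANY"]

# Hypothetical shape for an executor: a process id, its host, and its rack.
Executor = namedtuple("Executor", ["executor_id", "host", "rack"])

def locality_level(preferred, executor):
    """Return the best locality level at which `executor` could run a task.

    `preferred` maps "executors"/"hosts"/"racks" to the task's preferred
    data locations; an empty dict means the task has no preference.
    """
    if not preferred:
        return "NO_PREF"
    if executor.executor_id in preferred.get("executors", ()):
        return "PROCESS_LOCAL"  # data already in the same JVM
    if executor.host in preferred.get("hosts", ()):
        return "NODE_LOCAL"     # data on the same machine, another process
    if executor.rack in preferred.get("racks", ()):
        return "RACK_LOCAL"     # data reachable through the rack switch
    return "ANY"                # data elsewhere on the network
```

For example, an executor on `host-a` would get `NODE_LOCAL` for a task whose preferred hosts include `host-a`, and only `RACK_LOCAL` if the data is on another host in its rack.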
+Spark prefers to schedule all tasks at the best locality level, but this is not always possible. In
+situations where there is no unprocessed data on any idle executor, Spark switches to lower locality
+levels. There are two options: a) wait until a busy CPU frees up to start a task on data on the same
+server, or b) immediately start a new task in a farther away place that requires moving data there.
+
+What Spark typically does is wait a bit in the hopes that a busy CPU frees up. Once that timeout
+expires, it starts moving the data from far away to the free CPU. The wait timeout for fallback
+between each level can be configured individually or all together in one parameter; see the
+`spark.locality` parameters on the [configuration page](configuration.html#scheduling) for details.
+You should increase these settings if your tasks are long and you see poor locality, but the
+defaults usually work well.
+
# Summary
This has been a short guide to point out the main concerns you should know about when tuning a