author     Shixiong Zhu <shixiong@databricks.com>    2016-01-22 21:20:04 -0800
committer  Reynold Xin <rxin@databricks.com>    2016-01-22 21:20:04 -0800
commit     bc1babd63da4ee56e6d371eb24805a5d714e8295 (patch)
tree       8aec6a20e3d23574f53d818752df61a28c64d635 /docs
parent     d8fefab4d8149f0638282570c75271ef35c65cff (diff)
[SPARK-7997][CORE] Remove Akka from Spark Core and Streaming
- Remove the Akka dependency from core. Note: the streaming-akka project still uses Akka.
- Remove HttpFileServer.
- Remove Akka configs from SparkConf and SSLOptions.
- Rename `spark.akka.frameSize` to `spark.rpc.message.maxSize`. I think it's still worth keeping this config, because the choice between `DirectTaskResult` and `IndirectTaskResult` depends on it.
- Update comments and docs.

Author: Shixiong Zhu <shixiong@databricks.com>

Closes #10854 from zsxwing/remove-akka.
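For jobs that set the old key, only the property name changes. A minimal sketch in Scala (the value `256` is an arbitrary illustration, not a recommendation):

    import org.apache.spark.{SparkConf, SparkContext}

    // The limit is expressed in MB; 128 remains the default after this change.
    // Previously this was configured as spark.akka.frameSize.
    val conf = new SparkConf()
      .setAppName("rpc-message-size-example")
      .set("spark.rpc.message.maxSize", "256") // raise only if map-output messages exceed the limit
    val sc = new SparkContext(conf)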
Diffstat (limited to 'docs')
-rw-r--r--  docs/cluster-overview.md |  2
-rw-r--r--  docs/configuration.md    | 65
-rw-r--r--  docs/security.md         | 30
3 files changed, 9 insertions, 88 deletions
diff --git a/docs/cluster-overview.md b/docs/cluster-overview.md
index 2810112f52..814e4406cf 100644
--- a/docs/cluster-overview.md
+++ b/docs/cluster-overview.md
@@ -35,7 +35,7 @@ There are several useful things to note about this architecture:
processes, and these communicate with each other, it is relatively easy to run it even on a
cluster manager that also supports other applications (e.g. Mesos/YARN).
3. The driver program must listen for and accept incoming connections from its executors throughout
- its lifetime (e.g., see [spark.driver.port and spark.fileserver.port in the network config
+ its lifetime (e.g., see [spark.driver.port in the network config
section](configuration.html#networking)). As such, the driver program must be network
addressable from the worker nodes.
4. Because the driver schedules tasks on the cluster, it should be run close to the worker
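The requirement above is why the driver's endpoint is often pinned in locked-down networks rather than left random. A minimal sketch, assuming a placeholder host name that the worker nodes can resolve:

    import org.apache.spark.SparkConf

    val conf = new SparkConf()
      .set("spark.driver.host", "driver-host.example.com") // placeholder: must be reachable from workers
      .set("spark.driver.port", "35000")                   // fixed port; the default is a random port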
diff --git a/docs/configuration.md b/docs/configuration.md
index acaeb83008..d2a2f10524 100644
--- a/docs/configuration.md
+++ b/docs/configuration.md
@@ -944,52 +944,12 @@ Apart from these, the following properties are also available, and may be useful
<table class="table">
<tr><th>Property Name</th><th>Default</th><th>Meaning</th></tr>
<tr>
- <td><code>spark.akka.frameSize</code></td>
+ <td><code>spark.rpc.message.maxSize</code></td>
<td>128</td>
<td>
Maximum message size (in MB) to allow in "control plane" communication; generally only applies to map
output size information sent between executors and the driver. Increase this if you are running
- jobs with many thousands of map and reduce tasks and see messages about the frame size.
- </td>
-</tr>
-<tr>
- <td><code>spark.akka.heartbeat.interval</code></td>
- <td>1000s</td>
- <td>
- This is set to a larger value to disable the transport failure detector that comes built in to
- Akka. It can be enabled again if you plan to use this feature (not recommended). A larger
- interval value reduces network overhead, and a smaller value (~1 s) might be more
- informative for Akka's failure detector. Tune this in combination with <code>spark.akka.heartbeat.pauses</code>
- if you need to. A likely positive use case for the failure detector would be: a sensitive
- failure detector can help evict rogue executors quickly. However, this is usually not the case,
- as GC pauses and network lags are expected in a real Spark cluster. Apart from that, enabling
- this leads to a lot of heartbeat exchanges between nodes, flooding the network.
- </td>
-</tr>
-<tr>
- <td><code>spark.akka.heartbeat.pauses</code></td>
- <td>6000s</td>
- <td>
- This is set to a larger value to disable the transport failure detector that comes built in to Akka.
- It can be enabled again if you plan to use this feature (not recommended). Acceptable
- heartbeat pause for Akka. This can be used to control sensitivity to GC pauses. Tune
- this along with <code>spark.akka.heartbeat.interval</code> if you need to.
- </td>
-</tr>
-<tr>
- <td><code>spark.akka.threads</code></td>
- <td>4</td>
- <td>
- Number of actor threads to use for communication. Can be useful to increase on large clusters
- when the driver has a lot of CPU cores.
- </td>
-</tr>
-<tr>
- <td><code>spark.akka.timeout</code></td>
- <td>100s</td>
- <td>
- Communication timeout between Spark nodes.
+ jobs with many thousands of map and reduce tasks and see messages about the RPC message size.
</td>
</tr>
<tr>
@@ -1016,27 +976,11 @@ Apart from these, the following properties are also available, and may be useful
</td>
</tr>
<tr>
- <td><code>spark.executor.port</code></td>
- <td>(random)</td>
- <td>
- Port for the executor to listen on. This is used for communicating with the driver.
- This is only relevant when using the Akka RPC backend.
- </td>
-</tr>
-<tr>
- <td><code>spark.fileserver.port</code></td>
- <td>(random)</td>
- <td>
- Port for the driver's HTTP file server to listen on.
- This is only relevant when using the Akka RPC backend.
- </td>
-</tr>
-<tr>
<td><code>spark.network.timeout</code></td>
<td>120s</td>
<td>
Default timeout for all network interactions. This config will be used in place of
- <code>spark.core.connection.ack.wait.timeout</code>, <code>spark.akka.timeout</code>,
+ <code>spark.core.connection.ack.wait.timeout</code>,
<code>spark.storage.blockManagerSlaveTimeoutMs</code>,
<code>spark.shuffle.io.connectionTimeout</code>, <code>spark.rpc.askTimeout</code> or
<code>spark.rpc.lookupTimeout</code> if they are not configured.
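With `spark.akka.timeout` removed from the fallback list, `spark.network.timeout` stands as the single umbrella default. A hedged sketch of raising it, e.g. for clusters with long GC pauses (the `300s` value is illustrative):

    import org.apache.spark.SparkConf

    val conf = new SparkConf()
      // Used wherever spark.rpc.askTimeout, spark.rpc.lookupTimeout,
      // spark.core.connection.ack.wait.timeout, etc. are left unset.
      .set("spark.network.timeout", "300s")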
@@ -1418,8 +1362,7 @@ Apart from these, the following properties are also available, and may be useful
<p>Use <code>spark.ssl.YYY.XXX</code> settings to overwrite the global configuration for a
particular protocol denoted by <code>YYY</code>. Currently <code>YYY</code> can be
- either <code>akka</code> for Akka based connections or <code>fs</code> for file
- server.</p>
+ only <code>fs</code> for file server.</p>
</td>
</tr>
<tr>
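Since `fs` is now the only protocol namespace, the override pattern looks like the sketch below; the keystore path and password are placeholders:

    import org.apache.spark.SparkConf

    val conf = new SparkConf()
      // Global SSL defaults shared by all SSL-enabled components.
      .set("spark.ssl.enabled", "true")
      .set("spark.ssl.keyStore", "/path/to/keystore.jks") // placeholder path
      .set("spark.ssl.keyStorePassword", "changeit")      // placeholder secret
      // Per-protocol override: only the file server ("fs") namespace remains.
      .set("spark.ssl.fs.enabled", "true")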
diff --git a/docs/security.md b/docs/security.md
index a4cc0f42b2..32c33d2857 100644
--- a/docs/security.md
+++ b/docs/security.md
@@ -27,8 +27,7 @@ If your applications are using event logging, the directory where the event logs
## Encryption
-Spark supports SSL for Akka and HTTP protocols. SASL encryption is supported for the block transfer
-service.
+Spark supports SSL for HTTP protocols. SASL encryption is supported for the block transfer service.
Encryption is not yet supported for data stored by Spark in temporary local storage, such as shuffle
files, cached data, and other application files. If encrypting this data is desired, a workaround is
@@ -49,10 +48,6 @@ component-specific configuration namespaces used to override the default setting
<th>Component</th>
</tr>
<tr>
- <td><code>spark.ssl.akka</code></td>
- <td>Akka communication channels</td>
- </tr>
- <tr>
<td><code>spark.ssl.fs</code></td>
<td>HTTP file server and broadcast server</td>
</tr>
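SASL encryption for the block transfer service mentioned above is driven by the authentication settings rather than the `spark.ssl.*` namespace. A minimal sketch (secret distribution is deployment-specific and omitted):

    import org.apache.spark.SparkConf

    val conf = new SparkConf()
      .set("spark.authenticate", "true")                      // prerequisite for SASL
      .set("spark.authenticate.enableSaslEncryption", "true") // encrypt block transfers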
@@ -137,7 +132,7 @@ configure those ports.
<td>7077</td>
<td>Submit job to cluster /<br> Join cluster</td>
<td><code>SPARK_MASTER_PORT</code></td>
- <td>Akka-based. Set to "0" to choose a port randomly. Standalone mode only.</td>
+ <td>Set to "0" to choose a port randomly. Standalone mode only.</td>
</tr>
<tr>
<td>Standalone Master</td>
@@ -145,7 +140,7 @@ configure those ports.
<td>(random)</td>
<td>Schedule executors</td>
<td><code>SPARK_WORKER_PORT</code></td>
- <td>Akka-based. Set to "0" to choose a port randomly. Standalone mode only.</td>
+ <td>Set to "0" to choose a port randomly. Standalone mode only.</td>
</tr>
</table>
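The standalone master port in this table is the one embedded in the master URL that applications use to join the cluster. A small sketch, with `master-host` as a placeholder:

    import org.apache.spark.SparkConf

    // 7077 is the default SPARK_MASTER_PORT listed above.
    val conf = new SparkConf()
      .setMaster("spark://master-host:7077") // placeholder host
      .setAppName("standalone-example")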
@@ -178,24 +173,7 @@ configure those ports.
<td>(random)</td>
<td>Connect to application /<br> Notify executor state changes</td>
<td><code>spark.driver.port</code></td>
- <td>Akka-based. Set to "0" to choose a port randomly.</td>
- </tr>
- <tr>
- <td>Driver</td>
- <td>Executor</td>
- <td>(random)</td>
- <td>Schedule tasks</td>
- <td><code>spark.executor.port</code></td>
- <td>Akka-based. Set to "0" to choose a port randomly. Only used if Akka RPC backend is
- configured.</td>
- </tr>
- <tr>
- <td>Executor</td>
- <td>Driver</td>
- <td>(random)</td>
- <td>File server for files and jars</td>
- <td><code>spark.fileserver.port</code></td>
- <td>Jetty-based. Only used if Akka RPC backend is configured.</td>
+ <td>Set to "0" to choose a port randomly.</td>
</tr>
<tr>
<td>Executor / Driver</td>