authorSun Rui <rui.sun@intel.com>2015-10-23 21:38:04 -0700
committerShivaram Venkataraman <shivaram@cs.berkeley.edu>2015-10-23 21:38:04 -0700
commit2462dbcce89d657bca17ae311c99c2a4bee4a5fa (patch)
tree5b0930e12edaa40510f2dc9fd7b4f0a92a944ccb /docs
parent4725cb988b98f367c07214c4c3cfd1206fb2b5c2 (diff)
[SPARK-10971][SPARKR] RRunner should allow setting path to Rscript.
Add a new Spark conf option `spark.r.driver.command` to specify the executable for an R script in client modes. The existing conf option `spark.r.command` specifies the executable for an R script in cluster modes, for both the driver and the workers. See also [launch R worker script](https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/api/r/RRDD.scala#L395).

Note that the [environment variable `SPARKR_DRIVER_R`](https://github.com/apache/spark/blob/master/launcher/src/main/java/org/apache/spark/launcher/SparkSubmitCommandBuilder.java#L275) is used to locate the R shell on the local host.

For reference, PySpark has two environment variables serving a similar purpose:

- `PYSPARK_PYTHON`: Python binary executable to use for PySpark in both driver and workers (default is `python`).
- `PYSPARK_DRIVER_PYTHON`: Python binary executable to use for PySpark in the driver only (default is `PYSPARK_PYTHON`).

PySpark uses the code [here](https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/deploy/PythonRunner.scala#L41) to determine the Python executable for a Python script.

Author: Sun Rui <rui.sun@intel.com>

Closes #9179 from sun-rui/SPARK-10971.
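As a sketch of how the two options described above fit together (the interpreter paths and script name here are hypothetical, not from this patch), a client-mode submission could point the driver at a specific `Rscript` while leaving the workers on the default:

```shell
# Hypothetical example: in client mode, the driver uses the Rscript named by
# spark.r.driver.command, while workers use spark.r.command. In cluster mode,
# spark.r.driver.command is ignored and spark.r.command applies everywhere.
spark-submit \
  --deploy-mode client \
  --conf spark.r.driver.command=/opt/R/bin/Rscript \
  --conf spark.r.command=Rscript \
  my_analysis.R
```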
Diffstat (limited to 'docs')
-rw-r--r--  docs/configuration.md | 18
1 file changed, 18 insertions(+), 0 deletions(-)
diff --git a/docs/configuration.md b/docs/configuration.md
index be9c36bdfe..682384d424 100644
--- a/docs/configuration.md
+++ b/docs/configuration.md
@@ -1589,6 +1589,20 @@ Apart from these, the following properties are also available, and may be useful
Number of threads used by RBackend to handle RPC calls from SparkR package.
</td>
</tr>
+<tr>
+ <td><code>spark.r.command</code></td>
+ <td>Rscript</td>
+ <td>
+ Executable for executing R scripts in cluster modes for both driver and workers.
+ </td>
+</tr>
+<tr>
+ <td><code>spark.r.driver.command</code></td>
+ <td>spark.r.command</td>
+ <td>
+    Executable for executing R scripts in client modes for the driver. Ignored in cluster modes.
+ </td>
+</tr>
</table>
#### Cluster Managers
@@ -1629,6 +1643,10 @@ The following variables can be set in `spark-env.sh`:
<td>Python binary executable to use for PySpark in driver only (default is <code>PYSPARK_PYTHON</code>).</td>
</tr>
<tr>
+ <td><code>SPARKR_DRIVER_R</code></td>
+ <td>R binary executable to use for SparkR shell (default is <code>R</code>).</td>
+ </tr>
+ <tr>
<td><code>SPARK_LOCAL_IP</code></td>
<td>IP address of the machine to bind to.</td>
</tr>
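The environment variables added and documented in this hunk can be set together in `spark-env.sh`. A minimal sketch (the paths here are hypothetical):

```shell
# spark-env.sh -- hypothetical values, adjust to your installation
export SPARKR_DRIVER_R=/usr/local/bin/R     # R binary used by the SparkR shell
export PYSPARK_PYTHON=python3               # Python for PySpark driver and workers
export PYSPARK_DRIVER_PYTHON=ipython        # Python for the PySpark driver only
export SPARK_LOCAL_IP=192.168.1.10          # address the machine binds to
```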