author: hyukjinkwon <gurwls223@gmail.com> 2016-05-23 17:20:29 -0700
committer: Shivaram Venkataraman <shivaram@cs.berkeley.edu> 2016-05-23 17:20:29 -0700
commit: a8e97d17b91684e68290d9f18a43622232aa94e7
tree: 6eda498a8c24f59e3863bd14a31ef85df4677cf1
parent: 03c7b7c4b9374f0cb6a29aeaf495bd21c2563de4
[MINOR][SPARKR][DOC] Add a description for running unit tests in Windows
## What changes were proposed in this pull request?

This PR adds the description for running unit tests in Windows.

## How was this patch tested?

On a bare machine (Windows 7, 32-bit), this was manually built and tested.

Author: hyukjinkwon <gurwls223@gmail.com>

Closes #13217 from HyukjinKwon/minor-r-doc.
Diffstat (limited to 'R/README.md')
-rw-r--r-- R/README.md | 8
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/R/README.md b/R/README.md
index 810bfc14e9..044f95312a 100644
--- a/R/README.md
+++ b/R/README.md
@@ -1,11 +1,13 @@
# R on Spark
SparkR is an R package that provides a light-weight frontend to use Spark from R.
+
### Installing sparkR
Libraries of sparkR need to be created in `$SPARK_HOME/R/lib`. This can be done by running the script `$SPARK_HOME/R/install-dev.sh`.
By default the above script uses the system-wide installation of R. However, this can be changed to any user-installed location of R by setting the environment variable `R_HOME` to the full path of the base directory where R is installed, before running the install-dev.sh script.
Example:
+
```
# where /home/username/R is where R is installed and /home/username/R/bin contains the files R and Rscript
export R_HOME=/home/username/R
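# the hunk is cut off here; the natural next step, per the text above, is to run
# the install script itself (shown as a reminder, not part of this diff)
./install-dev.sh
```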
@@ -17,6 +19,7 @@ export R_HOME=/home/username/R
#### Build Spark
Build Spark with [Maven](http://spark.apache.org/docs/latest/building-spark.html#building-with-buildmvn) and include the `-Psparkr` profile to build the R package. For example, to use the default Hadoop versions you can run:
+
```
build/mvn -DskipTests -Psparkr package
```
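Once the build finishes, a quick smoke test (not part of this diff) is to launch the interactive SparkR shell that ships with Spark:

```
# starts an R session with SparkR loaded, using a local master by default
./bin/sparkR
```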
@@ -38,6 +41,7 @@ To set other options like driver memory, executor memory etc. you can pass in th
#### Using SparkR from RStudio
If you wish to use SparkR from RStudio or other R frontends, you will need to set some environment variables which point SparkR to your Spark installation. For example:
+
```
# Set this to where Spark is installed
Sys.setenv(SPARK_HOME="/Users/username/spark")
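# The hunk cuts the example off here; the lines below are a minimal sketch of
# the usual follow-up, assuming the Spark 1.x/2.0-preview-era SparkR API
.libPaths(c(file.path(Sys.getenv("SPARK_HOME"), "R", "lib"), .libPaths()))
library(SparkR)
sc <- sparkR.init(master = "local")
```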
@@ -64,13 +68,15 @@ To run one of them, use `./bin/spark-submit <filename> <args>`. For example:
```
./bin/spark-submit examples/src/main/r/dataframe.R
```
-You can also run the unit-tests for SparkR by running (you need to install the [testthat](http://cran.r-project.org/web/packages/testthat/index.html) package first):
+You can also run the unit tests for SparkR by running the commands below. You need to install the [testthat](http://cran.r-project.org/web/packages/testthat/index.html) package first:
```
R -e 'install.packages("testthat", repos="http://cran.us.r-project.org")'
./R/run-tests.sh
```
### Running on YARN
+
The `./bin/spark-submit` script can also be used to submit jobs to YARN clusters. You will need to set the YARN conf dir before doing so. For example, on CDH you can run:
+
```
export YARN_CONF_DIR=/etc/hadoop/conf
./bin/spark-submit --master yarn examples/src/main/r/dataframe.R
```
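On a stock Apache Hadoop layout rather than CDH, the YARN configuration normally lives under the Hadoop installation itself; a hedged variant of the same submission:

```
# assuming $HADOOP_HOME points at a standard Apache Hadoop 2.x install;
# adjust YARN_CONF_DIR to wherever your yarn-site.xml actually lives
export YARN_CONF_DIR=$HADOOP_HOME/etc/hadoop
./bin/spark-submit --master yarn examples/src/main/r/dataframe.R
```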