author: Timothy Hunter <timhunter@databricks.com> 2016-04-28 22:42:48 -0700
committer: Xiangrui Meng <meng@databricks.com> 2016-04-28 22:42:48 -0700
commit: 769a909d1357766a441ff69e6e98c22c51b12c93 (patch)
tree: d176f05a13eec69224cf9e084706dd4fac9da1e8 /R/pkg/inst/tests/testthat/test_context.R
parent: 4607f6e7f7b174c62700f1fe542f77af3203b096 (diff)
[SPARK-7264][ML] Parallel lapply for sparkR
## What changes were proposed in this pull request?

This PR adds a new function in SparkR called `sparkLapply(list, function)`. This function implements a distributed version of `lapply` using Spark as a backend.

TODO:
- [x] check documentation
- [ ] check tests

Trivial example in SparkR:
```R
sparkLapply(1:5, function(x) { 2 * x })
```

Output:
```
[[1]]
[1] 2

[[2]]
[1] 4

[[3]]
[1] 6

[[4]]
[1] 8

[[5]]
[1] 10
```

Here is a slightly more complex example to perform distributed training of multiple models. Under the hood, Spark broadcasts the dataset.
```R
library("MASS")
data(menarche)
families <- c("gaussian", "poisson")
train <- function(family){glm(Menarche ~ Age , family=family, data=menarche)}
results <- sparkLapply(families, train)
```

## How was this patch tested?

This PR was tested in SparkR. I am unfamiliar with R and SparkR, so any feedback on style, testing, etc. will be much appreciated.

cc falaki davies

Author: Timothy Hunter <timhunter@databricks.com>

Closes #12426 from thunterdb/7264.
Diffstat (limited to 'R/pkg/inst/tests/testthat/test_context.R')
-rw-r--r-- R/pkg/inst/tests/testthat/test_context.R | 6
1 files changed, 6 insertions, 0 deletions
diff --git a/R/pkg/inst/tests/testthat/test_context.R b/R/pkg/inst/tests/testthat/test_context.R
index ffa067eb5e..ca04342cd5 100644
--- a/R/pkg/inst/tests/testthat/test_context.R
+++ b/R/pkg/inst/tests/testthat/test_context.R
@@ -141,3 +141,9 @@ test_that("sparkJars sparkPackages as comma-separated strings", {
   expect_that(processSparkJars(f), not(gives_warning()))
   expect_match(processSparkJars(f), f)
 })
+
+test_that("spark.lapply should perform simple transforms", {
+  sc <- sparkR.init()
+  doubled <- spark.lapply(sc, 1:10, function(x) { 2 * x })
+  expect_equal(doubled, as.list(2 * 1:10))
+})
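
As a local point of comparison for the test added above, the following sketch shows the list result the test expects, using base R's `lapply` rather than Spark. It is only an illustration: no Spark context is created, `double_it` and `local_result` are hypothetical names that do not appear in the commit, and it assumes (as the test's `expect_equal` call suggests) that the distributed version is meant to return the same list a local `lapply` would.

```R
# Local reference only: base R lapply, not spark.lapply, so nothing runs on Spark.
# `double_it` is a hypothetical helper mirroring the anonymous function in the test.
double_it <- function(x) { 2 * x }

# Apply the function to each element and collect the results as a list.
local_result <- lapply(1:10, double_it)

# The test in the diff compares its result against as.list(2 * 1:10);
# the local lapply result has exactly that shape and content.
stopifnot(isTRUE(all.equal(local_result, as.list(2 * 1:10))))
```

Comparing against a plain `lapply` result like this is one way to sanity-check the expected value without starting a Spark session.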