author      Daoyuan Wang <daoyuan.wang@intel.com>    2016-05-23 23:29:15 -0700
committer   Andrew Or <andrew@databricks.com>        2016-05-23 23:29:15 -0700
commit      d642b273544bb77ef7f584326aa2d214649ac61b (patch)
tree        e2bf63cd2c378d285165a7bf5f829dad93322efe /R/pkg/inst
parent      de726b0d533158d3ca08841bd6976bcfa26ca79d (diff)
[SPARK-15397][SQL] fix string udf locate as hive
## What changes were proposed in this pull request?

In Hive, `locate("aa", "aaa", 0)` yields 0, `locate("aa", "aaa", 1)` yields 1, and `locate("aa", "aaa", 2)` yields 2, while in Spark, `locate("aa", "aaa", 0)` yields 1, `locate("aa", "aaa", 1)` yields 2, and `locate("aa", "aaa", 2)` yields 0. The difference comes from how the two systems interpret the third parameter of the `locate` UDF: it is a 1-based starting index, so passing 0 should always return 0.

## How was this patch tested?

Tested with the modified `StringExpressionsSuite` and `StringFunctionsSuite`.

Author: Daoyuan Wang <daoyuan.wang@intel.com>

Closes #13186 from adrian-wang/locate.
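For reference, a minimal SparkR sketch (not part of this commit) of the Hive-compatible semantics the fix targets, assuming a SparkR session with an initialized `sqlContext` as in the test file below:

```r
# Hive-style locate(substr, str, pos): pos is a 1-based start index,
# so pos = 0 can never match and should return 0 after this fix.
library(SparkR)

df <- createDataFrame(sqlContext, list(list(a = "aaa")))

collect(select(df, locate("aa", df$a, 0)))[1, 1]  # expected 0
collect(select(df, locate("aa", df$a, 1)))[1, 1]  # expected 1
collect(select(df, locate("aa", df$a, 2)))[1, 1]  # expected 2
```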
Diffstat (limited to 'R/pkg/inst')
-rw-r--r--  R/pkg/inst/tests/testthat/test_sparkSQL.R | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/R/pkg/inst/tests/testthat/test_sparkSQL.R b/R/pkg/inst/tests/testthat/test_sparkSQL.R
index 6a99b43e5a..b2d769f2ac 100644
--- a/R/pkg/inst/tests/testthat/test_sparkSQL.R
+++ b/R/pkg/inst/tests/testthat/test_sparkSQL.R
@@ -1152,7 +1152,7 @@ test_that("string operators", {
l2 <- list(list(a = "aaads"))
df2 <- createDataFrame(sqlContext, l2)
expect_equal(collect(select(df2, locate("aa", df2$a)))[1, 1], 1)
- expect_equal(collect(select(df2, locate("aa", df2$a, 1)))[1, 1], 2)
+ expect_equal(collect(select(df2, locate("aa", df2$a, 2)))[1, 1], 2)
expect_equal(collect(select(df2, lpad(df2$a, 8, "#")))[1, 1], "###aaads") # nolint
expect_equal(collect(select(df2, rpad(df2$a, 8, "#")))[1, 1], "aaads###") # nolint