author | Oscar D. Lara Yejas <odlaraye@oscars-mbp.usca.ibm.com> | 2016-04-27 15:47:54 -0700 |
---|---|---|
committer | Shivaram Venkataraman <shivaram@cs.berkeley.edu> | 2016-04-27 15:47:54 -0700 |
commit | e4bfb4aa7382cb9c5e4eb7e2211551d5da716a61 (patch) | |
tree | 58d4303824aca6fec6f9f6311f2dcf7a3cb1bd4e /sbin/stop-mesos-dispatcher.sh | |
parent | 37575115b98fdc9ebadb2ebcbcd9907a3af1076c (diff) | |
[SPARK-13436][SPARKR] Added parameter drop to subsetting operator [
Added the parameter `drop` to the subsetting operator `[`. This is useful for getting a Column from a DataFrame given its name, which base R already supports.
In R:
```
> name <- "Sepal.Length"
> class(iris[, name])
[1] "numeric"
```
Currently, in SparkR:
```
> name <- "Sepal_Length"
> class(irisDF[, name])
[1] "DataFrame"
```
The previous code returns a DataFrame, which is inconsistent with R's behavior; SparkR should return a Column instead. Currently, one way for a user to get a Column from a column name stored in a character variable is `eval(parse(text = x))`, where x is the string `"irisDF$Sepal_Length"`, which is pretty hacky. `SparkR:::getColumn()` is another choice, but I don't see why this method should be externalized. Instead, following R's way of doing things, the proposed implementation allows this:
```
> name <- "Sepal_Length"
> class(irisDF[, name, drop=T])
[1] "Column"
> class(irisDF[, name, drop=F])
[1] "DataFrame"
```
This is consistent with R:
```
> name <- "Sepal.Length"
> class(iris[, name])
[1] "numeric"
> class(iris[, name, drop=F])
[1] "data.frame"
```
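The base R behavior that this change mirrors can be checked without a SparkR session. The following is a minimal sketch using only the built-in `iris` data frame (whose column is spelled `Sepal.Length`; the underscore form appears on the SparkR side). The variable names `col_default` and `col_kept` are illustrative and not part of the patch.

```r
# Base R only: illustrates the `drop` semantics that the SparkR change follows.
name <- "Sepal.Length"

# With a single selected column, `drop` defaults to TRUE and the result
# is simplified to a plain vector.
col_default <- iris[, name]

# With drop = FALSE the data.frame structure is preserved.
col_kept <- iris[, name, drop = FALSE]

print(class(col_default))  # "numeric"
print(class(col_kept))     # "data.frame"
print(dim(col_kept))       # 150 rows, 1 column
```

The SparkR analogue described above returns a Column in place of the vector and a DataFrame in place of the data.frame.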
Author: Oscar D. Lara Yejas <odlaraye@oscars-mbp.usca.ibm.com>
Author: Oscar D. Lara Yejas <odlaraye@oscars-mbp.attlocal.net>
Closes #11318 from olarayej/SPARK-13436.