[SPARK-13436][SPARKR] Added parameter drop to subsetting operator [ - spark

author	Oscar D. Lara Yejas <odlaraye@oscars-mbp.usca.ibm.com>	2016-04-27 15:47:54 -0700
committer	Shivaram Venkataraman <shivaram@cs.berkeley.edu>	2016-04-27 15:47:54 -0700
commit	e4bfb4aa7382cb9c5e4eb7e2211551d5da716a61 (patch)
tree	58d4303824aca6fec6f9f6311f2dcf7a3cb1bd4e /sbin/stop-mesos-dispatcher.sh
parent	37575115b98fdc9ebadb2ebcbcd9907a3af1076c (diff)
download	spark-e4bfb4aa7382cb9c5e4eb7e2211551d5da716a61.tar.gz spark-e4bfb4aa7382cb9c5e4eb7e2211551d5da716a61.tar.bz2 spark-e4bfb4aa7382cb9c5e4eb7e2211551d5da716a61.zip

Added parameter drop to subsetting operator [. This is useful to get a Column from a DataFrame, given its name. R supports it.

In R (where the `iris` column is named `Sepal.Length`; SparkR renames it to `Sepal_Length` when converting `iris` to a DataFrame):
```
> name <- "Sepal.Length"
> class(iris[, name])
[1] "numeric"
```
Currently, in SparkR:
```
> name <- "Sepal_Length"
> class(irisDF[, name])
[1] "DataFrame"
```

Previous code returns a DataFrame, which is inconsistent with R's behavior. SparkR should return a Column instead. Currently, a user who wants a Column given a column name held in a character variable has to go through `eval(parse(text = x))`, where x is the string `"irisDF$Sepal_Length"`. That itself is pretty hacky. `SparkR:::getColumn()` is another choice, but I don't see why this method should be externalized. Instead, following R's way to do things, the proposed implementation allows this:

```
> name <- "Sepal_Length"
> class(irisDF[, name, drop=T])
[1] "Column"
> class(irisDF[, name, drop=F])
[1] "DataFrame"
```

This is consistent with R:

```
> name <- "Sepal.Length"
> class(iris[, name])
[1] "numeric"
> class(iris[, name, drop=F])
[1] "data.frame"
```

Author: Oscar D. Lara Yejas <odlaraye@oscars-mbp.usca.ibm.com>
Author: Oscar D. Lara Yejas <odlaraye@oscars-mbp.attlocal.net>

Closes #11318 from olarayej/SPARK-13436.
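The `drop` semantics that this change mirrors can be tried in base R without a Spark session. The sketch below uses only the built-in `iris` data frame, whose column is named `Sepal.Length` in base R. The helper `get_column` is hypothetical and exists only to illustrate looking up a column by a name held in a variable. The SparkR calls appear as comments because they need a running Spark session and assume the `irisDF` from the commit message.

```r
# Base R: `[` on a data.frame drops a single selected column to a plain
# vector by default; drop = FALSE keeps the data.frame wrapper.
name <- "Sepal.Length"

as_vector <- iris[, name]                # one column, dropped to a vector
as_frame  <- iris[, name, drop = FALSE]  # stays a one-column data.frame

print(class(as_vector))
print(class(as_frame))

# Hypothetical helper (not part of R or SparkR): fetch a column by a name
# stored in a variable, without resorting to eval(parse(text = ...)).
get_column <- function(df, col_name, drop = TRUE) {
  stopifnot(is.character(col_name), length(col_name) == 1L)
  df[, col_name, drop = drop]
}

# Per the commit message, the SparkR analogue after this change would be:
#   irisDF[, "Sepal_Length", drop = TRUE]   # a Column
#   irisDF[, "Sepal_Length", drop = FALSE]  # a DataFrame
```

Passing the column name as a plain string keeps the lookup in ordinary indexing, so no code has to be built and evaluated from text.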

Diffstat (limited to 'sbin/stop-mesos-dispatcher.sh')

0 files changed, 0 insertions, 0 deletions

