author anabranch <wac.chambers@gmail.com> 2016-11-17 11:34:55 +0000
committer Sean Owen <sowen@cloudera.com> 2016-11-17 11:34:55 +0000
commit 49b6f456aca350e9e2c170782aa5cc75e7822680
tree 3a13f932b73feeab6b01f1d039728758203edcf0 /R/pkg
parent a3cac7bd86a6fe8e9b42da1bf580aaeb59378304
[SPARK-18365][DOCS] Improve Sample Method Documentation
## What changes were proposed in this pull request?

I found the documentation for the sample method to be confusing; this adds more clarification across all languages.

- [x] Scala
- [x] Python
- [x] R
- [x] RDD Scala
- [ ] RDD Python with SEED
- [X] RDD Java
- [x] RDD Java with SEED
- [x] RDD Python

## How was this patch tested?

NA

Please review https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark before opening a pull request.

Author: anabranch <wac.chambers@gmail.com>
Author: Bill Chambers <bill@databricks.com>

Closes #15815 from anabranch/SPARK-18365.
Diffstat (limited to 'R/pkg')
-rw-r--r-- R/pkg/R/DataFrame.R 4
1 files changed, 3 insertions, 1 deletions
diff --git a/R/pkg/R/DataFrame.R b/R/pkg/R/DataFrame.R
index 1cf9b38ea6..4e3d97bb3a 100644
--- a/R/pkg/R/DataFrame.R
+++ b/R/pkg/R/DataFrame.R
@@ -936,7 +936,9 @@ setMethod("unique",
#' Sample
#'
-#' Return a sampled subset of this SparkDataFrame using a random seed.
+#' Return a sampled subset of this SparkDataFrame using a random seed.
+#' Note: this is not guaranteed to provide exactly the fraction specified
+#' of the total count of the given SparkDataFrame.
#'
#' @param x A SparkDataFrame
#' @param withReplacement Sampling with replacement or not