author | Dongjoon Hyun <dongjoon@apache.org> | 2016-06-20 13:41:03 -0700 |
---|---|---|
committer | Shivaram Venkataraman <shivaram@cs.berkeley.edu> | 2016-06-20 13:41:03 -0700 |
commit | b0f2fb5b9729b38744bf784f2072f5ee52314f87 (patch) | |
tree | bbe026f28c48cdd9741016d0ac19d6abb1639df8 /docs/mllib-ensembles.md | |
parent | aee1420eca64dfc145f31b8c653388fafc5ccd8f (diff) | |
[SPARK-16053][R] Add `spark_partition_id` in SparkR
## What changes were proposed in this pull request?
This PR adds the `spark_partition_id` virtual column function to SparkR for API parity.
The following example illustrates SparkR usage on a partitioned Parquet table created by `spark.range(10).write.mode("overwrite").parquet("/tmp/t1")`.
```r
> collect(select(read.parquet('/tmp/t1'), c('id', spark_partition_id())))
   id SPARK_PARTITION_ID()
1   3                    0
2   4                    0
3   8                    1
4   9                    1
5   0                    2
6   1                    3
7   2                    4
8   5                    5
9   6                    6
10  7                    7
```
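As a further illustration, the following is a minimal sketch (not part of this patch) of using the partition id to count rows per partition. It assumes a local Spark 2.x installation with SparkR on the library path; the column name `pid` and the use of `createDataFrame` plus `write.df` to recreate `/tmp/t1` are choices made for this example. Partition assignment depends on the environment, so no output is shown.

```r
library(SparkR)

# Assumption: Spark 2.x, where sparkR.session() starts or reuses a session.
sparkR.session()

# Recreate a small Parquet table with ids 0..9 (illustrative setup only).
df <- createDataFrame(data.frame(id = 0:9))
write.df(df, path = "/tmp/t1", source = "parquet", mode = "overwrite")

# Tag every row with the id of the partition it was read from.
# "pid" is a hypothetical column name chosen for this sketch.
t1 <- read.parquet("/tmp/t1")
tagged <- withColumn(t1, "pid", spark_partition_id())

# Count rows per partition and bring the small result back to the driver.
per_partition <- count(groupBy(tagged, "pid"))
print(collect(arrange(per_partition, "pid")))

sparkR.session.stop()
```

Adding the id as a named column with `withColumn` avoids the generated `SPARK_PARTITION_ID()` column name seen in the session above, which is awkward to reference in later expressions.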
## How was this patch tested?
Pass the Jenkins tests (including a new test case).
Author: Dongjoon Hyun <dongjoon@apache.org>
Closes #13768 from dongjoon-hyun/SPARK-16053.
Diffstat (limited to 'docs/mllib-ensembles.md')
0 files changed, 0 insertions, 0 deletions