[SPARK-16053][R] Add `spark_partition_id` in SparkR - spark


author	Dongjoon Hyun <dongjoon@apache.org>	2016-06-20 13:41:03 -0700
committer	Shivaram Venkataraman <shivaram@cs.berkeley.edu>	2016-06-20 13:41:03 -0700
commit	b0f2fb5b9729b38744bf784f2072f5ee52314f87 (patch)
tree	bbe026f28c48cdd9741016d0ac19d6abb1639df8 /docs/mllib-ensembles.md
parent	aee1420eca64dfc145f31b8c653388fafc5ccd8f (diff)

[SPARK-16053][R] Add `spark_partition_id` in SparkR

## What changes were proposed in this pull request?

This PR adds `spark_partition_id` virtual column function in SparkR for API parity.

The following is just an example to illustrate a SparkR usage on a partitioned parquet table created by `spark.range(10).write.mode("overwrite").parquet("/tmp/t1")`.

```r
> collect(select(read.parquet('/tmp/t1'), c('id', spark_partition_id())))
   id SPARK_PARTITION_ID()
1   3                    0
2   4                    0
3   8                    1
4   9                    1
5   0                    2
6   1                    3
7   2                    4
8   5                    5
9   6                    6
10  7                    7
```

## How was this patch tested?

Pass the Jenkins tests (including new testcase).

Author: Dongjoon Hyun <dongjoon@apache.org>

Closes #13768 from dongjoon-hyun/SPARK-16053.


