From b0f2fb5b9729b38744bf784f2072f5ee52314f87 Mon Sep 17 00:00:00 2001
From: Dongjoon Hyun <dongjoon@apache.org>
Date: Mon, 20 Jun 2016 13:41:03 -0700
Subject: [SPARK-16053][R] Add `spark_partition_id` in SparkR

## What changes were proposed in this pull request?

This PR adds the `spark_partition_id` virtual column function to SparkR for API parity.

The following example illustrates SparkR usage on a partitioned Parquet table created by `spark.range(10).write.mode("overwrite").parquet("/tmp/t1")`:
```r
> collect(select(read.parquet('/tmp/t1'), c('id', spark_partition_id())))
   id SPARK_PARTITION_ID()
1   3                    0
2   4                    0
3   8                    1
4   9                    1
5   0                    2
6   1                    3
7   2                    4
8   5                    5
9   6                    6
10  7                    7
```
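
As a further illustration, the function can also be used to count rows per partition. This is only a minimal sketch: it assumes a local Spark 2.x installation with a session started via `sparkR.session()`, and the column name `pid` is hypothetical. The resulting counts depend on how the Parquet files were written and read, so no output is shown.
```r
library(SparkR)

# Assumption: Spark 2.x is installed locally and SparkR can start a session.
sparkR.session()

# Read the table written by the Scala snippet above.
df <- read.parquet("/tmp/t1")

# Tag every row with the ID of the partition it belongs to.
# "pid" is a hypothetical column name chosen for this sketch.
tagged <- withColumn(df, "pid", spark_partition_id())

# Count the rows in each partition and return them sorted by partition ID.
counts <- count(groupBy(tagged, "pid"))
collect(arrange(counts, "pid"))
```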

## How was this patch tested?

Passes the Jenkins tests (including a new test case).

Author: Dongjoon Hyun <dongjoon@apache.org>

Closes #13768 from dongjoon-hyun/SPARK-16053.
---
 R/pkg/NAMESPACE | 1 +
 1 file changed, 1 insertion(+)

(limited to 'R/pkg/NAMESPACE')

diff --git a/R/pkg/NAMESPACE b/R/pkg/NAMESPACE
index aaeab665a4..45663f4c2c 100644
--- a/R/pkg/NAMESPACE
+++ b/R/pkg/NAMESPACE
@@ -260,6 +260,7 @@ exportMethods("%in%",
               "skewness",
               "sort_array",
               "soundex",
+              "spark_partition_id",
               "stddev",
               "stddev_pop",
               "stddev_samp",
-- 
cgit v1.2.3