author: Brennon York <brennon.york@capitalone.com> 2015-03-13 18:48:31 +0000
committer: Sean Owen <sowen@cloudera.com> 2015-03-13 18:48:31 +0000
commit: b943f5d907df0607ecffb729f2bccfa436438d7e
tree: 8f420c83bd960b8ee0befb66fc71efd698122b25 /graphx
parent: 7f13434a5c52b815c584ec773ab0e5df1a35ea86
[SPARK-4600][GraphX]: org.apache.spark.graphx.VertexRDD.diff does not work

Turns out, per the [convo on the JIRA](https://issues.apache.org/jira/browse/SPARK-4600), `diff` is acting exactly as it should. The confusion arose because I assumed it meant set difference, when in fact it does not. To that end I merely updated the `diff` documentation to, hopefully, better reflect its actual behavior going forward.

Author: Brennon York <brennon.york@capitalone.com>

Closes #5015 from brennonyork/SPARK-4600 and squashes the following commits:

1e1d1e5 [Brennon York] reverted internal diff docs
92288f7 [Brennon York] reverted both the test suite and the diff function back to its origin functionality
f428623 [Brennon York] updated diff documentation to better represent its function
cc16d65 [Brennon York] Merge remote-tracking branch 'upstream/master' into SPARK-4600
66818b9 [Brennon York] added small secondary diff test
99ad412 [Brennon York] Merge remote-tracking branch 'upstream/master' into SPARK-4600
74b8c95 [Brennon York] corrected method by leveraging bitmask operations to correctly return only the portions of that are different from the calling VertexRDD
9717120 [Brennon York] updated diff impl to cause fewer objects to be created
710a21c [Brennon York] working diff given test case
aa57f83 [Brennon York] updated to set ShortestPaths to run 'forward' rather than 'backward'

Diffstat (limited to 'graphx')
-rw-r--r-- graphx/src/main/scala/org/apache/spark/graphx/VertexRDD.scala | 7
1 files changed, 5 insertions, 2 deletions
diff --git a/graphx/src/main/scala/org/apache/spark/graphx/VertexRDD.scala b/graphx/src/main/scala/org/apache/spark/graphx/VertexRDD.scala
index 09ae3f9f6c..40ecff7107 100644
--- a/graphx/src/main/scala/org/apache/spark/graphx/VertexRDD.scala
+++ b/graphx/src/main/scala/org/apache/spark/graphx/VertexRDD.scala
@@ -122,8 +122,11 @@ abstract class VertexRDD[VD](
def mapValues[VD2: ClassTag](f: (VertexId, VD) => VD2): VertexRDD[VD2]
/**
- * Hides vertices that are the same between `this` and `other`; for vertices that are different,
- * keeps the values from `other`.
+ * For each vertex present in both `this` and `other`, `diff` returns only those vertices with
+ * differing values; for values that are different, keeps the values from `other`. This is
+ * only guaranteed to work if the VertexRDDs share a common ancestor.
+ *
+ * @param other the other VertexRDD with which to diff against.
*/
def diff(other: VertexRDD[VD]): VertexRDD[VD]
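
For readers who, like the original reporter, expected set difference: the updated doc comment can be modelled on plain Scala `Map`s. The sketch below is not the GraphX implementation and does not use Spark. `DiffSemantics`, `diffModel`, and `setDifference` are hypothetical names introduced only for illustration, and the behavior shown is an assumption based on the documentation wording in this patch.

```scala
object DiffSemantics {
  type VertexId = Long

  // Model of the documented `diff` behavior (an assumption drawn from the
  // doc comment, not the real VertexRDD code): for ids present on both
  // sides, keep the entry only when the values differ, and take the value
  // from `other`. Ids present on only one side are dropped.
  def diffModel[VD](self: Map[VertexId, VD], other: Map[VertexId, VD]): Map[VertexId, VD] =
    other.filter { case (id, value) => self.get(id).exists(_ != value) }

  // Set difference on vertex ids, shown only for contrast with diffModel.
  def setDifference[VD](self: Map[VertexId, VD], other: Map[VertexId, VD]): Map[VertexId, VD] =
    self.filter { case (id, _) => !other.contains(id) }

  def main(args: Array[String]): Unit = {
    val a = Map(1L -> "a", 2L -> "b", 3L -> "c")
    val b = Map(2L -> "b", 3L -> "C", 4L -> "d")

    // Vertex 2 is unchanged, vertex 3 changed, vertices 1 and 4 are unshared.
    println(diffModel(a, b))     // Map(3 -> C)
    println(setDifference(a, b)) // Map(1 -> a)
  }
}
```

Under this model, vertex 4 never appears in the result even though it exists only in `other`, which is the behavior the JIRA discussion concluded was intended.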