author: Ankur Dave <ankurdave@gmail.com> 2014-11-11 23:38:27 -0800
committer: Reynold Xin <rxin@databricks.com> 2014-11-11 23:38:27 -0800
commit: faeb41de215d3ac567ce72a43ab242ad433ca93e
tree: 36f408ec2e7a014ff07a2337e6939eb67ee7387c
parent: 2ef016b130a48869cf81fe6cf147ef2b1e79d674
[SPARK-3936] Add aggregateMessages, which supersedes mapReduceTriplets
aggregateMessages enables neighborhood computation similarly to mapReduceTriplets, but it introduces two API improvements:

1. Messages are sent using an imperative interface based on EdgeContext rather than by returning an iterator of messages.
2. Rather than attempting bytecode inspection, the required triplet fields must be explicitly specified by the user by passing a TripletFields object. This fixes SPARK-3936.

Additionally, this PR includes the following optimizations for aggregateMessages and EdgePartition:

1. EdgePartition now stores local vertex ids instead of global ids. This avoids hash lookups when looking up vertex attributes and aggregating messages.
2. Internal iterators in aggregateMessages are inlined into a while loop.

In total, these optimizations were tested to provide a 37% speedup on PageRank (uk-2007-05 graph, 10 iterations, 16 r3.2xlarge machines, sped up from 513 s to 322 s).

Subsumes apache/spark#2815. Also fixes SPARK-4173.

Author: Ankur Dave <ankurdave@gmail.com>

Closes #3100 from ankurdave/aggregateMessages and squashes the following commits:

f5b65d0 [Ankur Dave] Address @rxin comments on apache/spark#3054 and apache/spark#3100
1e80aca [Ankur Dave] Add aggregateMessages, which supersedes mapReduceTriplets
194a2df [Ankur Dave] Test triplet iterator in EdgePartition serialization test
e0f8ecc [Ankur Dave] Take activeSet in ExistingEdgePartitionBuilder
c85076d [Ankur Dave] Readability improvements
b567be2 [Ankur Dave] iter.foreach -> while loop
4a566dc [Ankur Dave] Optimizations for mapReduceTriplets and EdgePartition
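
To illustrate the two API improvements described in the message, here is a minimal sketch of computing in-degrees with aggregateMessages. It is not taken from this commit: the object name AggregateMessagesSketch, the helper inDegrees, and the three-edge sample graph are hypothetical, and the sketch assumes a Spark build with GraphX on the classpath and a local master.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, Graph, TripletFields, VertexId}

// Hypothetical example object; not part of the Spark code base.
object AggregateMessagesSketch {

  // Count incoming edges per vertex.
  // sendMsg uses the imperative EdgeContext interface (ctx.sendToDst) instead
  // of returning an iterator of messages; TripletFields.None states explicitly
  // that no vertex or edge attributes are read.
  def inDegrees(graph: Graph[Int, Int]): Map[VertexId, Int] =
    graph.aggregateMessages[Int](
      ctx => ctx.sendToDst(1), // send the message 1 to each edge's destination
      _ + _,                   // merge messages arriving at the same vertex
      TripletFields.None
    ).collect().toMap

  def main(args: Array[String]): Unit = {
    // Assumption: local mode is sufficient for this illustration.
    val conf = new SparkConf().setAppName("aggregateMessages-sketch").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      // Hypothetical sample graph: 1 -> 2, 1 -> 3, 2 -> 3.
      val edges = sc.parallelize(Seq(Edge(1L, 2L, 1), Edge(1L, 3L, 1), Edge(2L, 3L, 1)))
      val graph = Graph.fromEdges(edges, defaultValue = 0)
      inDegrees(graph).toSeq.sortBy(_._1).foreach(println)
    } finally {
      sc.stop()
    }
  }
}
```

Vertices that receive no messages (vertex 1 in the sample graph) do not appear in the result. If sendMsg read ctx.srcAttr or ctx.dstAttr, the third argument would need to be TripletFields.Src, TripletFields.Dst, or TripletFields.All so that those fields are actually shipped to the edge partitions.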
Diffstat (limited to 'network')
0 files changed, 0 insertions, 0 deletions