path: root/graphx/src
Commit message   Author   Age   Files   Lines
...
* Do not re-use objects in the EdgePartition/EdgeTriplet iterators.   Daniel Darabos   2014-04-02   4   -10/+58
|
| This avoids a silent data corruption issue
| (https://spark-project.atlassian.net/browse/SPARK-1188) and has no
| performance impact by my measurements. It also simplifies the code. As far
| as I can tell the object re-use was nothing but premature optimization.
|
| I did actual benchmarks for all the included changes, and there is no
| performance difference. I am not sure where to put the benchmarks. Does
| Spark not have a benchmark suite?
|
| This is an example benchmark I did:
|
|     test("benchmark") {
|       val builder = new EdgePartitionBuilder[Int]
|       for (i <- (1 to 10000000)) {
|         builder.add(i.toLong, i.toLong, i)
|       }
|       val p = builder.toEdgePartition
|       p.map(_.attr + 1).iterator.toList
|     }
|
| It ran for 10 seconds both before and after this change.
|
| Author: Daniel Darabos <darabos.daniel@gmail.com>
|
| Closes #276 from darabos/spark-1188 and squashes the following commits:
|
| 574302b [Daniel Darabos] Restore "manual" copying in EdgePartition.map(Iterator). Add comment to discourage novices like myself from trying to simplify the code.
| 4117a64 [Daniel Darabos] Revert EdgePartitionSuite.
| 4955697 [Daniel Darabos] Create a copy of the Edge objects in EdgeRDD.compute(). This avoids exposing the object re-use, while still enables the more efficient behavior for internal code.
| 4ec77f8 [Daniel Darabos] Add comments about object re-use to the affected functions.
| 2da5e87 [Daniel Darabos] Restore object re-use in EdgePartition.
| 0182f2b [Daniel Darabos] Do not re-use objects in the EdgePartition/EdgeTriplet iterators. This avoids a silent data corruption issue (SPARK-1188) and has no performance impact in my measurements. It also simplifies the code.
| c55f52f [Daniel Darabos] Tests that reproduce the problems from SPARK-1188.
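The hazard this commit describes can be reproduced without Spark. The sketch below is only an illustration under stated assumptions: `MutableEdge` and `IteratorReuseDemo` are hypothetical names, not GraphX classes, and the code does not reproduce the actual `EdgePartition` implementation. It shows how an iterator that hands out one mutable object corrupts any caller that buffers the results, and how copying at the boundary (the approach the squashed commit 4955697 describes for `EdgeRDD.compute()`) avoids exposing the re-use.

```scala
// Hypothetical stand-in for a mutable edge; NOT the GraphX Edge class.
final class MutableEdge(var srcId: Long, var dstId: Long, var attr: Int)

object IteratorReuseDemo {
  private val n = 3

  // Re-uses a single object for every element: cheap, but unsafe to buffer,
  // because every reference the caller holds points at the same object.
  def reusingIterator: Iterator[MutableEdge] = {
    val edge = new MutableEdge(0L, 0L, 0)
    Iterator.range(0, n).map { i =>
      edge.srcId = i.toLong
      edge.dstId = i.toLong
      edge.attr = i
      edge
    }
  }

  // Copies each element as it is produced, so internal code can keep the
  // re-using iterator while external callers only ever see fresh objects.
  def copyingIterator: Iterator[MutableEdge] =
    reusingIterator.map(e => new MutableEdge(e.srcId, e.dstId, e.attr))

  def main(args: Array[String]): Unit = {
    println(reusingIterator.toList.map(_.attr)) // List(2, 2, 2): silently corrupted
    println(copyingIterator.toList.map(_.attr)) // List(0, 1, 2)
  }
}
```

Streaming consumers that finish with each element before requesting the next are unaffected by the re-use; only code that retains references (for example `toList` or a cache) observes the corruption.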
* SPARK-1352 - Comment style single space before ending */ check.   Prashant Sharma   2014-03-30   1   -1/+1
|
| Author: Prashant Sharma <prashant.s@imaginea.com>
|
| Closes #261 from ScrapCodes/comment-style-check2 and squashes the following commits:
|
| 6cde61e [Prashant Sharma] comment style space before ending */ check.
* SPARK-1096, a space after comment start style checker.   Prashant Sharma   2014-03-28   6   -9/+7
|
| Author: Prashant Sharma <prashant.s@imaginea.com>
|
| Closes #124 from ScrapCodes/SPARK-1096/scalastyle-comment-check and squashes the following commits:
|
| 214135a [Prashant Sharma] Review feedback.
| 5eba88c [Prashant Sharma] Fixed style checks for ///+ comments.
| e54b2f8 [Prashant Sharma] improved message, work around.
| 83e7144 [Prashant Sharma] removed dependency on scalastyle in plugin, since scalastyle sbt plugin already depends on the right version. Incase we update the plugin we will have to adjust our spark-style project to depend on right scalastyle version.
| 810a1d6 [Prashant Sharma] SPARK-1096, a space after comment style checker.
| ba33193 [Prashant Sharma] scala style as a project
* Spark 1095 : Adding explicit return types to all public methods   NirmalReddy   2014-03-26   3   -3/+4
|
| Excluded those that are self-evident and the cases that are discussed in
| the mailing list.
|
| Author: NirmalReddy <nirmal_reddy2000@yahoo.com>
| Author: NirmalReddy <nirmal.reddy@imaginea.com>
|
| Closes #168 from NirmalReddy/Spark-1095 and squashes the following commits:
|
| ac54b29 [NirmalReddy] import misplaced
| 8c5ff3e [NirmalReddy] Changed syntax of unit returning methods
| 02d0778 [NirmalReddy] fixed explicit types in all the other packages
| 1c17773 [NirmalReddy] fixed explicit types in core package
* SPARK-1255: Allow user to pass Serializer object instead of class name for shuffle.   Reynold Xin   2014-03-16   4   -32/+26
|
| This is more general than simply passing a string name and leaves more
| room for performance optimizations.
|
| Note that this is technically an API breaking change in the following two
| ways:
| 1. The shuffle serializer specification in ShuffleDependency now require an
|    object instead of a String (of the class name), but I suspect nobody
|    else in this world has used this API other than me in GraphX and Shark.
| 2. Serializer's in Spark from now on are required to be serializable.
|
| Author: Reynold Xin <rxin@apache.org>
|
| Closes #149 from rxin/serializer and squashes the following commits:
|
| 5acaccd [Reynold Xin] Properly call serializer's constructors.
| 2a8d75a [Reynold Xin] Added more documentation for the serializer option in ShuffleDependency.
| 7420185 [Reynold Xin] Allow user to pass Serializer object instead of class name for shuffle.
* SPARK-782 Clean up for ASM dependency.   Patrick Wendell   2014-03-09   1   -2/+2
|
| This makes two changes.
|
| 1) Spark uses the shaded version of asm that is (conveniently) published
|    with Kryo.
| 2) Existing exclude rules around asm are updated to reflect the new groupId
|    of `org.ow2.asm`. This made all of the old rules not work with newer
|    Hadoop versions that pull in new asm versions.
|
| Author: Patrick Wendell <pwendell@gmail.com>
|
| Closes #100 from pwendell/asm and squashes the following commits:
|
| 9235f3f [Patrick Wendell] SPARK-782 Clean up for ASM dependency.
* Graph primitives2   Semih Salihoglu   2014-02-24   2   -10/+183
|
| Hi guys,
|
| I'm following Joey and Ankur's suggestions to add collectEdges and
| pickRandomVertex. I'm also adding the tests for collectEdges and
| refactoring one method getCycleGraph in GraphOpsSuite.scala.
|
| Thank you,
|
| semih
|
| Author: Semih Salihoglu <semihsalihoglu@gmail.com>
|
| Closes #580 from semihsalihoglu/GraphPrimitives2 and squashes the following commits:
|
| 937d3ec [Semih Salihoglu] - Fixed the scalastyle errors.
| a69a152 [Semih Salihoglu] - Adding collectEdges and pickRandomVertices. - Adding tests for collectEdges. - Refactoring a getCycle utility function for GraphOpsSuite.scala.
| 41265a6 [Semih Salihoglu] - Adding collectEdges and pickRandomVertex. - Adding tests for collectEdges. - Recycling a getCycle utility test file.
* Merge pull request #567 from ScrapCodes/style2.   Prashant Sharma   2014-02-09   1   -2/+1
|
| SPARK-1058, Fix Style Errors and Add Scala Style to Spark Build. Pt 2
|
| Continuation of PR #557
|
| With this all scala style errors are fixed across the code base !!
|
| The reason for creating a separate PR was to not interrupt an already
| reviewed and ready to merge PR. Hope this gets reviewed soon and merged too.
|
| Author: Prashant Sharma <prashant.s@imaginea.com>
|
| Closes #567 and squashes the following commits:
|
| 3b1ec30 [Prashant Sharma] scala style fixes
* Merge pull request #557 from ScrapCodes/style. Closes #557.   Patrick Wendell   2014-02-09   6   -44/+52
|
| SPARK-1058, Fix Style Errors and Add Scala Style to Spark Build.
|
| Author: Patrick Wendell <pwendell@gmail.com>
| Author: Prashant Sharma <scrapcodes@gmail.com>
|
| == Merge branch commits ==
|
| commit 1a8bd1c059b842cb95cc246aaea74a79fec684f4
| Author: Prashant Sharma <scrapcodes@gmail.com>
| Date:   Sun Feb 9 17:39:07 2014 +0530
|
|     scala style fixes
|
| commit f91709887a8e0b608c5c2b282db19b8a44d53a43
| Author: Patrick Wendell <pwendell@gmail.com>
| Date:   Fri Jan 24 11:22:53 2014 -0800
|
|     Adding scalastyle snapshot
* Replace commons-math with jblas   Jianping J Wang   2014-01-23   1   -32/+36
|
* Depend on Commons Math explicitly instead of accidentally getting it from Hadoop (which stops working in 2.2.x) and also use the newer commons-math3   Sean Owen   2014-01-22   1   -1/+1
|
* Merge pull request #436 from ankurdave/VertexId-case   Reynold Xin   2014-01-14   32   -209/+209
|\
| | Rename VertexID -> VertexId in GraphX
| * VertexID -> VertexId   Ankur Dave   2014-01-14   32   -209/+209
| |
* | Fixed SVDPlusPlusSuite in Maven build.   Reynold Xin   2014-01-14   2   -7/+19
|/
* Add missing header files   Patrick Wendell   2014-01-14   43   -0/+731
|
* Adding minimal additional functionality to EdgeRDD   Joseph E. Gonzalez   2014-01-13   1   -0/+17
|
* Fix bug in GraphLoader.edgeListFile that caused srcId > dstId   Ankur Dave   2014-01-13   1   -1/+1
|
* Edge object must be public for Edge case class   Ankur Dave   2014-01-13   1   -2/+2
|
* Improve scaladoc links   Ankur Dave   2014-01-13   2   -6/+6
|
* Fix infinite loop in GraphGenerators.generateRandomEdges   Ankur Dave   2014-01-13   1   -8/+1
|
| The loop occurred when numEdges < numVertices. This commit fixes it by
| allowing generateRandomEdges to generate a multigraph.
* Make Graph{,Impl,Ops} serializable to work around capture   Ankur Dave   2014-01-13   3   -3/+3
|
* Remove Graph.statistics and GraphImpl.printLineage   Ankur Dave   2014-01-13   3   -77/+1
|
* Updated doc for PageRank.   Reynold Xin   2014-01-13   1   -47/+39
|
* More cleanup.   Reynold Xin   2014-01-13   4   -9/+10
|
* Moved SVDPlusPlusConf into SVDPlusPlus object itself.   Reynold Xin   2014-01-13   2   -15/+17
|
* Moved PartitionStrategy's into an object.   Reynold Xin   2014-01-13   4   -81/+85
|
* Updated GraphGenerator.   Reynold Xin   2014-01-13   1   -30/+30
|
* Made more things private.   Reynold Xin   2014-01-13   10   -12/+26
|
* Merge branch 'graphx' of github.com:ankurdave/incubator-spark into graphx   Reynold Xin   2014-01-13   12   -137/+70
|\
| | Conflicts:
| |     graphx/src/main/scala/org/apache/spark/graphx/Pregel.scala
| * Improvements in example code for the programming guide as well as adding serialization support for GraphImpl to address issues with failed closure capture.   Joseph E. Gonzalez   2014-01-13   1   -0/+3
| |
| * Add EdgeDirection.Either and use it to fix CC bug   Ankur Dave   2014-01-13   12   -54/+64
| |
| | The bug was due to a misunderstanding of the activeSetOpt parameter to
| | Graph.mapReduceTriplets. Passing EdgeDirection.Both causes
| | mapReduceTriplets to run only on edges with *both* vertices in the active
| | set. This commit adds EdgeDirection.Either, which causes mapReduceTriplets
| | to run on edges with *either* vertex in the active set. This is what
| | connected components needed.
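The difference between the two active-set modes can be sketched in plain Scala. Everything here is an assumption made for illustration: `ActiveSetDemo`, `ActiveMode`, `BothActive`, `EitherActive`, and `activeEdges` are hypothetical names that only model the semantics described in the commit message; they are not the GraphX `EdgeDirection` or `mapReduceTriplets` API.

```scala
object ActiveSetDemo {
  // Hypothetical modes modelling the semantics described above.
  sealed trait ActiveMode
  case object BothActive extends ActiveMode   // models EdgeDirection.Both
  case object EitherActive extends ActiveMode // models EdgeDirection.Either

  // Returns the (src, dst) edges a triplet computation would visit for the
  // given set of active vertex ids.
  def activeEdges(
      edges: Seq[(Long, Long)],
      active: Set[Long],
      mode: ActiveMode): Seq[(Long, Long)] =
    edges.filter { case (src, dst) =>
      mode match {
        case BothActive   => active(src) && active(dst)
        case EitherActive => active(src) || active(dst)
      }
    }

  def main(args: Array[String]): Unit = {
    val path = Seq((1L, 2L), (2L, 3L), (3L, 4L))
    val active = Set(2L) // only vertex 2 changed in the previous round
    println(activeEdges(path, active, BothActive))   // List()
    println(activeEdges(path, active, EitherActive)) // List((1,2), (2,3))
  }
}
```

With a single active vertex, the "both" mode visits no edges, so a label update on vertex 2 would never reach its neighbours; the "either" mode visits every edge touching vertex 2, which matches what the commit message says connected components needed.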
| * Remove aggregateNeighbors   Ankur Dave   2014-01-13   2   -85/+5
| |
* | Miscel doc update.   Reynold Xin   2014-01-13   17   -143/+158
|/
* Merge pull request #2 from jegonzal/GraphXCCIssue   Ankur Dave   2014-01-13   3   -16/+62
|\
| | Improving documentation and identifying potential bug in CC calculation.
| * Improving documentation and identifying potential bug in CC calculation.   Joseph E. Gonzalez   2014-01-13   3   -16/+62
| |
* | Improve EdgeRDD scaladoc   Ankur Dave   2014-01-13   1   -2/+11
| |
* | Further improve VertexRDD scaladocs   Ankur Dave   2014-01-13   1   -14/+25
|/
* Move algorithms to GraphOps   Ankur Dave   2014-01-12   4   -78/+51
|
* Add TriangleCount example   Ankur Dave   2014-01-12   1   -3/+2
|
* adding Pregel as an operator in GraphOps and cleaning up documentation of GraphOps   Joseph E. Gonzalez   2014-01-12   2   -22/+74
|
* Add PageRank example and data   Ankur Dave   2014-01-12   1   -1/+1
|
* Link methods in programming guide; document VertexID   Ankur Dave   2014-01-12   1   -0/+4
|
* Make EdgeDirection val instead of case object for Java compat.   Ankur Dave   2014-01-11   3   -5/+15
|
* Use SparkConf in GraphX tests (via LocalSparkContext)   Ankur Dave   2014-01-11   1   -5/+5
|
* One-line Scaladoc comments in Edge and EdgeDirection   Ankur Dave   2014-01-11   2   -22/+10
|
* Fix indent and use SparkConf in Analytics   Ankur Dave   2014-01-11   1   -130/+115
|
* Remove GraphLab   Ankur Dave   2014-01-11   3   -163/+34
|
* Make nullValue and VertexSet package-private   Ankur Dave   2014-01-11   1   -3/+3
|
* algorithms -> lib   Ankur Dave   2014-01-11   14   -22/+22
|
* Optimize Edge.lexicographicOrdering   Ankur Dave   2014-01-11   1   -1/+1
|