path: root/docs/graphx-programming-guide.md
Commit message (Author, Age, Files, Lines)
* [SPARK-6510][GraphX]: Add Graph#minus method to act as Set#difference (Brennon York, 2015-03-26, 1 file, -0/+2)
|
|   Adds a `Graph#minus` method which will return only unique `VertexId`'s from the calling `VertexRDD`.
|
|   To demonstrate a basic example with pseudocode:
|
|   ```
|   Set((0L,0),(1L,1)).minus(Set((1L,1),(2L,2)))
|   > Set((0L,0))
|   ```
|
|   Author: Brennon York <brennon.york@capitalone.com>
|
|   Closes #5175 from brennonyork/SPARK-6510 and squashes the following commits:
|
|   248d5c8 [Brennon York] added minus(VertexRDD[VD]) method to avoid createUsingIndex and updated the mask operations to simplify with andNot call
|   3fb7cce [Brennon York] updated graphx doc to reflect the addition of minus method
|   6575d92 [Brennon York] updated mima exclude
|   aaa030b [Brennon York] completed graph#minus functionality
|   7227c0f [Brennon York] beginning work on minus functionality
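
The pseudocode in the SPARK-6510 commit message can be sketched in plain Scala without a Spark dependency. This is only an illustration of the described set-difference behaviour, assuming the difference is taken on the vertex id alone; `MinusSketch` and `minusById` are hypothetical names and are not part of the GraphX API.

```scala
object MinusSketch {
  // Hypothetical helper (not GraphX): keep the entries of `self`
  // whose vertex id does not occur in `other`.
  def minusById[A](self: Map[Long, A], other: Map[Long, A]): Map[Long, A] =
    self.filter { case (id, _) => !other.contains(id) }

  def main(args: Array[String]): Unit = {
    val left  = Map(0L -> 0, 1L -> 1)
    val right = Map(1L -> 1, 2L -> 2)
    // Only vertex 0 survives, mirroring the pseudocode in the commit message.
    println(minusById(left, right))
  }
}
```

Modelling the vertex sets as `Map[Long, A]` keeps one attribute per id, which is the property the commit message relies on when it speaks of unique `VertexId`'s.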
* aggregateMessages example in graphX doc (DEBORAH SIEGEL, 2015-03-02, 1 file, -2/+2)
|
|   Examples illustrating difference between legacy mapReduceTriplets usage and aggregateMessages usage has type issues on the reduce for both operators. Being just an example- changed example to reduce the message String by concatenation. Although non-optimal for performance.
|
|   Author: DEBORAH SIEGEL <deborahsiegel@DEBORAHs-MacBook-Pro.local>
|
|   Closes #4853 from d3borah/master and squashes the following commits:
|
|   db54173 [DEBORAH SIEGEL] fixed aggregateMessages example in graphX doc
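
The send-then-merge pattern that this commit message refers to, with String messages reduced by concatenation, can be modelled in plain Scala. This is a rough sketch under stated assumptions, not the example from the guide and not the GraphX API: `AggregateSketch`, `Edge`, and `aggregate` are hypothetical names, and each edge is assumed to send its source id to its destination.

```scala
object AggregateSketch {
  // Hypothetical edge type for illustration; GraphX has its own Edge class.
  final case class Edge(src: Long, dst: Long)

  // Each edge "sends" a String message to its destination vertex;
  // messages arriving at the same vertex are merged by concatenation.
  def aggregate(edges: Seq[Edge]): Map[Long, String] =
    edges
      .map(e => e.dst -> s"${e.src};")                 // send step
      .groupBy { case (dst, _) => dst }                // collect per destination
      .map { case (dst, msgs) =>
        dst -> msgs.map { case (_, m) => m }.reduce(_ + _) // merge step
      }

  def main(args: Array[String]): Unit = {
    val edges = Seq(Edge(1L, 3L), Edge(2L, 3L), Edge(3L, 1L))
    println(aggregate(edges))
  }
}
```

Because both the message and the merged result are `String`, the reduce function has the shape `(String, String) => String`, which sidesteps the type mismatch the commit message mentions. Concatenation builds a new string on every merge, which is why the message calls it non-optimal for performance.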
* [GraphX] fixing 3 typos in the graphx programming guide (Benedikt Linse, 2015-02-25, 1 file, -4/+4)
|
|   Corrected 3 Typos in the GraphX programming guide. I hope this is the correct way to contribute.
|
|   Author: Benedikt Linse <benedikt.linse@gmail.com>
|
|   Closes #4766 from 1123/master and squashes the following commits:
|
|   8a63812 [Benedikt Linse] fixing 3 typos in the graphx programming guide
* [SPARK-5608] Improve SEO of Spark documentation pages (Matei Zaharia, 2015-02-05, 1 file, -1/+3)
|
|   - Add meta description tags on some of the most important doc pages
|   - Shorten the titles of some pages to have more relevant keywords; for example there's no reason to have "Spark SQL Programming Guide - Spark 1.2.0 documentation", we can just say "Spark SQL - Spark 1.2.0 documentation".
|
|   Author: Matei Zaharia <matei@databricks.com>
|
|   Closes #4381 from mateiz/docs-seo and squashes the following commits:
|
|   4940563 [Matei Zaharia] [SPARK-5608] Improve SEO of Spark documentation pages
* [Doc][GraphX] Remove Motivation section and did some minor update. (Reynold Xin, 2014-11-21, 1 file, -70/+7)
|
* Updating GraphX programming guide and documentation (Joseph E. Gonzalez, 2014-11-19, 1 file, -144/+216)
|
|   This pull request revises the programming guide to reflect changes in the GraphX API as well as the deprecated mapReduceTriplets operator.
|
|   Author: Joseph E. Gonzalez <joseph.e.gonzalez@gmail.com>
|
|   Closes #3359 from jegonzal/GraphXProgrammingGuide and squashes the following commits:
|
|   4421964 [Joseph E. Gonzalez] updating documentation for graphx
* [SPARK-1566] consolidate programming guide, and general doc updates (Matei Zaharia, 2014-05-30, 1 file, -3/+5)
|
|   This is a fairly large PR to clean up and update the docs for 1.0. The major changes are:
|
|   * A unified programming guide for all languages replaces language-specific ones and shows language-specific info in tabs
|   * New programming guide sections on key-value pairs, unit testing, input formats beyond text, migrating from 0.9, and passing functions to Spark
|   * Spark-submit guide moved to a separate page and expanded slightly
|   * Various cleanups of the menu system, security docs, and others
|   * Updated look of title bar to differentiate the docs from previous Spark versions
|
|   You can find the updated docs at http://people.apache.org/~matei/1.0-docs/_site/ and in particular http://people.apache.org/~matei/1.0-docs/_site/programming-guide.html.
|
|   Author: Matei Zaharia <matei@databricks.com>
|
|   Closes #896 from mateiz/1.0-docs and squashes the following commits:
|
|   03e6853 [Matei Zaharia] Some tweaks to configuration and YARN docs
|   0779508 [Matei Zaharia] tweak
|   ef671d4 [Matei Zaharia] Keep frames in JavaDoc links, and other small tweaks
|   1bf4112 [Matei Zaharia] Review comments
|   4414f88 [Matei Zaharia] tweaks
|   d04e979 [Matei Zaharia] Fix some old links to Java guide
|   a34ed33 [Matei Zaharia] tweak
|   541bb3b [Matei Zaharia] miscellaneous changes
|   fcefdec [Matei Zaharia] Moved submitting apps to separate doc
|   61d72b4 [Matei Zaharia] stuff
|   181f217 [Matei Zaharia] migration guide, remove old language guides
|   e11a0da [Matei Zaharia] Add more API functions
|   6a030a9 [Matei Zaharia] tweaks
|   8db0ae3 [Matei Zaharia] Added key-value pairs section
|   318d2c9 [Matei Zaharia] tweaks
|   1c81477 [Matei Zaharia] New section on basics and function syntax
|   e38f559 [Matei Zaharia] Actually added programming guide to Git
|   a33d6fe [Matei Zaharia] First pass at updating programming guide to support all languages, plus other tweaks throughout
|   3b6a876 [Matei Zaharia] More CSS tweaks
|   01ec8bf [Matei Zaharia] More CSS tweaks
|   e6d252e [Matei Zaharia] Change color of doc title bar to differentiate from 0.9.0
* Unify GraphImpl RDDs + other graph load optimizations (Ankur Dave, 2014-05-10, 1 file, -8/+14)
|
|   This PR makes the following changes, primarily in e4fbd329aef85fe2c38b0167255d2a712893d683:
|
|   1. *Unify RDDs to avoid zipPartitions.* A graph used to be four RDDs: vertices, edges, routing table, and triplet view. This commit merges them down to two: vertices (with routing table), and edges (with replicated vertices).
|   2. *Avoid duplicate shuffle in graph building.* We used to do two shuffles when building a graph: one to extract routing information from the edges and move it to the vertices, and another to find nonexistent vertices referred to by edges. With this commit, the latter is done as a side effect of the former.
|   3. *Avoid no-op shuffle when joins are fully eliminated.* This is a side effect of unifying the edges and the triplet view.
|   4. *Join elimination for mapTriplets.*
|   5. *Ship only the needed vertex attributes when upgrading the triplet view.* If the triplet view already contains source attributes, and we now need both attributes, only ship destination attributes rather than re-shipping both. This is done in `ReplicatedVertexView#upgrade`.
|
|   Author: Ankur Dave <ankurdave@gmail.com>
|
|   Closes #497 from ankurdave/unify-rdds and squashes the following commits:
|
|   332ab43 [Ankur Dave] Merge remote-tracking branch 'apache-spark/master' into unify-rdds
|   4933e2e [Ankur Dave] Exclude RoutingTable from binary compatibility check
|   5ba8789 [Ankur Dave] Add GraphX upgrade guide from Spark 0.9.1
|   13ac845 [Ankur Dave] Merge remote-tracking branch 'apache-spark/master' into unify-rdds
|   a04765c [Ankur Dave] Remove unnecessary toOps call
|   57202e8 [Ankur Dave] Replace case with pair parameter
|   75af062 [Ankur Dave] Add explicit return types
|   04d3ae5 [Ankur Dave] Convert implicit parameter to context bound
|   c88b269 [Ankur Dave] Revert upgradeIterator to if-in-a-loop
|   0d3584c [Ankur Dave] EdgePartition.size should be val
|   2a928b2 [Ankur Dave] Set locality wait
|   10b3596 [Ankur Dave] Clean up public API
|   ae36110 [Ankur Dave] Fix style errors
|   e4fbd32 [Ankur Dave] Unify GraphImpl RDDs + other graph load optimizations
|   d6d60e2 [Ankur Dave] In GraphLoader, coalesce to minEdgePartitions
|   62c7b78 [Ankur Dave] In Analytics, take PageRank numIter
|   d64e8d4 [Ankur Dave] Log current Pregel iteration
* [SPARK-1439, SPARK-1440] Generate unified Scaladoc across projects and Javadocs (Matei Zaharia, 2014-04-21, 1 file, -31/+31)
|
|   I used the sbt-unidoc plugin (https://github.com/sbt/sbt-unidoc) to create a unified Scaladoc of our public packages, and generate Javadocs as well. One limitation is that I haven't found an easy way to exclude packages in the Javadoc; there is a SBT task that identifies Java sources to run javadoc on, but it's been very difficult to modify it from outside to change what is set in the unidoc package. Some SBT-savvy people should help with this. The Javadoc site also lacks package-level descriptions and things like that, so we may want to look into that. We may decide not to post these right now if it's too limited compared to the Scala one.
|
|   Example of the built doc site: http://people.csail.mit.edu/matei/spark-unified-docs/
|
|   Author: Matei Zaharia <matei@databricks.com>
|
|   This patch had conflicts when merged, resolved by
|   Committer: Patrick Wendell <pwendell@gmail.com>
|
|   Closes #457 from mateiz/better-docs and squashes the following commits:
|
|   a63d4a3 [Matei Zaharia] Skip Java/Scala API docs for Python package
|   5ea1f43 [Matei Zaharia] Fix links to Java classes in Java guide, fix some JS for scrolling to anchors on page load
|   f05abc0 [Matei Zaharia] Don't include java.lang package names
|   995e992 [Matei Zaharia] Skip internal packages and class names with $ in JavaDoc
|   a14a93c [Matei Zaharia] typo
|   76ce64d [Matei Zaharia] Add groups to Javadoc index page, and a first package-info.java
|   ed6f994 [Matei Zaharia] Generate JavaDoc as well, add titles, update doc site to use unified docs
|   acb993d [Matei Zaharia] Add Unidoc plugin for the projects we want Unidoced
* SPARK-1183. Don't use "worker" to mean executor (Sandy Ryza, 2014-03-13, 1 file, -1/+1)
|
|   Author: Sandy Ryza <sandy@cloudera.com>
|
|   Closes #120 from sryza/sandy-spark-1183 and squashes the following commits:
|
|   5066a4a [Sandy Ryza] Remove "worker" in a couple comments
|   0bd1e46 [Sandy Ryza] Remove --am-class from usage
|   bfc8fe0 [Sandy Ryza] Remove am-class from doc and fix yarn-alpha
|   607539f [Sandy Ryza] Address review comments
|   74d087a [Sandy Ryza] SPARK-1183. Don't use "worker" to mean executor
* Merge pull request #436 from ankurdave/VertexId-case (Reynold Xin, 2014-01-14, 1 file, -35/+35)
|\
| |   Rename VertexID -> VertexId in GraphX
| |
| * VertexID -> VertexId (Ankur Dave, 2014-01-14, 1 file, -35/+35)
| |
* | Merge pull request #424 from jegonzal/GraphXProgrammingGuide (Reynold Xin, 2014-01-14, 1 file, -52/+121)
|\ \
| |/
|/|
| |   Additional edits for clarity in the graphx programming guide. Added an overview of the Graph and GraphOps functions and fixed numerous typos.
| |
| * Additional edits for clarity in the graphx programming guide. (Joseph E. Gonzalez, 2014-01-14, 1 file, -52/+121)
| |
* | Describe GraphX caching and uncaching in guide (Ankur Dave, 2014-01-14, 1 file, -1/+10)
|/
* Improving the graphx-programming-guide. (Joseph E. Gonzalez, 2014-01-14, 1 file, -26/+37)
|
* adding documentation about EdgeRDD (Joseph E. Gonzalez, 2014-01-13, 1 file, -2/+40)
|
* Fix all code examples in guide (Ankur Dave, 2014-01-13, 1 file, -23/+23)
|
* Finish 6f6f8c928ce493357d4d32e46971c5e401682ea8 (Ankur Dave, 2014-01-13, 1 file, -2/+4)
|
* Wrap methods in the appropriate class/object declaration (Ankur Dave, 2014-01-13, 1 file, -64/+85)
|
* Write Graph Builders section in guide (Ankur Dave, 2014-01-13, 1 file, -5/+49)
|
* Remove K-Core and LDA sections from guide; they are unimplemented (Ankur Dave, 2014-01-13, 1 file, -4/+0)
|
* Fix Pregel SSSP example in programming guide (Ankur Dave, 2014-01-13, 1 file, -8/+14)
|
* Finished documenting vertexrdd. (Joseph E. Gonzalez, 2014-01-13, 1 file, -0/+53)
|
* Finished second pass on pregel docs. (Joseph E. Gonzalez, 2014-01-13, 1 file, -12/+33)
|
* Minor changes in graphx programming guide. (Joseph E. Gonzalez, 2014-01-13, 1 file, -3/+2)
|
* Improvements in example code for the programming guide as well as adding serialization support for GraphImpl to address issues with failed closure capture. (Joseph E. Gonzalez, 2014-01-13, 1 file, -17/+22)
|
* Remove aggregateNeighbors (Ankur Dave, 2014-01-13, 1 file, -17/+0)
|
* Merge pull request #2 from jegonzal/GraphXCCIssue (Ankur Dave, 2014-01-13, 1 file, -6/+27)
|\
| |   Improving documentation and identifying potential bug in CC calculation.
| |
| * Improving documentation and identifying potential bug in CC calculation. (Joseph E. Gonzalez, 2014-01-13, 1 file, -6/+27)
| |
* | Add graph loader links to doc (Ankur Dave, 2014-01-13, 1 file, -0/+13)
| |
* | Fix mapReduceTriplets links in doc (Ankur Dave, 2014-01-13, 1 file, -4/+4)
|/
* Tested and corrected all examples up to mask in the graphx-programming-guide. (Joseph E. Gonzalez, 2014-01-12, 1 file, -17/+20)
|
* Use GraphLoader for algorithms examples in doc (Ankur Dave, 2014-01-12, 1 file, -17/+19)
|
* Move algorithms to GraphOps (Ankur Dave, 2014-01-12, 1 file, -9/+3)
|
* Add TriangleCount example (Ankur Dave, 2014-01-12, 1 file, -4/+27)
|
* Documenting Pregel API (Joseph E. Gonzalez, 2014-01-12, 1 file, -1/+198)
|
* Add connected components example to doc (Ankur Dave, 2014-01-12, 1 file, -1/+19)
|
* Add PageRank example and data (Ankur Dave, 2014-01-12, 1 file, -1/+31)
|
* Link methods in programming guide; document VertexID (Ankur Dave, 2014-01-12, 1 file, -69/+86)
|
* Correcting typos in documentation. (Joseph E. Gonzalez, 2014-01-11, 1 file, -66/+79)
|
* Finished docummenting join operators and revised some of the initial presentation. (Joseph E. Gonzalez, 2014-01-11, 1 file, -37/+82)
|
* Remove GraphLab (Ankur Dave, 2014-01-11, 1 file, -7/+6)
|
* Finished documenting structural operators and starting join operators. (Joseph E. Gonzalez, 2014-01-11, 1 file, -18/+72)
|
* starting structural operator discussion. (Joseph E. Gonzalez, 2014-01-11, 1 file, -2/+30)
|
* More organizational changes and dropping the benchmark plot. (Joseph E. Gonzalez, 2014-01-11, 1 file, -12/+20)
|
* More edits. (Joseph E. Gonzalez, 2014-01-10, 1 file, -16/+215)
|
* Add back Bagel links to docs, but mark them superseded (Ankur Dave, 2014-01-10, 1 file, -7/+7)
|
* WIP. Updating figures and cleaning up initial skeleton for GraphX Programming guide. (Joseph E. Gonzalez, 2014-01-10, 1 file, -151/+126)
|
* Start fixing formatting of graphx-programming-guide (Ankur Dave, 2014-01-09, 1 file, -7/+6)
|