path: root/bagel/src/main
Commit message (Author, Date, Files, Lines)
* [SPARK-2661][bagel]unpersist old processed rdd (Daoyuan, 2014-07-24, 1 file, -0/+5)
|
|   Unpersist useless rdd during bagel iteration to make full use of memory.
|
|   Author: Daoyuan <daoyuan.wang@intel.com>
|
|   Closes #1519 from adrian-wang/bagelunpersist and squashes the following commits:
|
|   182c9dd [Daoyuan] rename var nextUseless to lastRDD
|   87fd3a4 [Daoyuan] bagel unpersist old processed rdd
* Package docs (Prashant Sharma, 2014-05-14, 2 files, -0/+44)
|
|   This is a few changes based on the original patch by @scrapcodes.
|
|   Author: Prashant Sharma <prashant.s@imaginea.com>
|   Author: Patrick Wendell <pwendell@gmail.com>
|
|   Closes #785 from pwendell/package-docs and squashes the following commits:
|
|   c32b731 [Patrick Wendell] Changes based on Prashant's patch
|   c0463d3 [Prashant Sharma] added eof new line
|   ce8bf73 [Prashant Sharma] Added eof new line to all files.
|   4c35f2e [Prashant Sharma] SPARK-1563 Add package-info.java and package.scala files for all packages that appear in docs
* Spark 1271: Co-Group and Group-By should pass Iterable[X] (Holden Karau, 2014-04-08, 1 file, -8/+12)
|
|   Author: Holden Karau <holden@pigscanfly.ca>
|
|   Closes #242 from holdenk/spark-1320-cogroupandgroupshouldpassiterator and squashes the following commits:
|
|   f289536 [Holden Karau] Fix bad merge, should have been Iterable rather than Iterator
|   77048f8 [Holden Karau] Fix merge up to master
|   d3fe909 [Holden Karau] use toSeq instead
|   7a092a3 [Holden Karau] switch resultitr to resultiterable
|   eb06216 [Holden Karau] maybe I should have had a coffee first. use correct import for guava iterables
|   c5075aa [Holden Karau] If guava 14 had iterables
|   2d06e10 [Holden Karau] Fix Java 8 cogroup tests for the new API
|   11e730c [Holden Karau] Fix streaming tests
|   66b583d [Holden Karau] Fix the core test suite to compile
|   4ed579b [Holden Karau] Refactor from iterator to iterable
|   d052c07 [Holden Karau] Python tests now pass with iterator pandas
|   3bcd81d [Holden Karau] Revert "Try and make pickling list iterators work"
|   cd1e81c [Holden Karau] Try and make pickling list iterators work
|   c60233a [Holden Karau] Start investigating moving to iterators for python API like the Java/Scala one. tl;dr: We will have to write our own iterator since the default one doesn't pickle well
|   88a5cef [Holden Karau] Fix cogroup test in JavaAPISuite for streaming
|   a5ee714 [Holden Karau] oops, was checking wrong iterator
|   e687f21 [Holden Karau] Fix groupbykey test in JavaAPISuite of streaming
|   ec8cc3e [Holden Karau] Fix test issues!
|   4b0eeb9 [Holden Karau] Switch cast in PairDStreamFunctions
|   fa395c9 [Holden Karau] Revert "Add a join based on the problem in SVD"
|   ec99e32 [Holden Karau] Revert "Revert this but for now put things in list pandas"
|   b692868 [Holden Karau] Revert
|   7e533f7 [Holden Karau] Fix the bug
|   8a5153a [Holden Karau] Revert me, but we have some stuff to debug
|   b4e86a9 [Holden Karau] Add a join based on the problem in SVD
|   c4510e2 [Holden Karau] Revert this but for now put things in list pandas
|   b4e0b1d [Holden Karau] Fix style issues
|   71e8b9f [Holden Karau] I really need to stop calling size on iterators, it is the path of sadness.
|   b1ae51a [Holden Karau] Fix some of the types in the streaming JavaAPI suite. Probably still needs more work
|   37888ec [Holden Karau] core/tests now pass
|   249abde [Holden Karau] org.apache.spark.rdd.PairRDDFunctionsSuite passes
|   6698186 [Holden Karau] Revert "I think this might be a bad rabbit hole. Started work to make CoGroupedRDD use iterator and then went crazy"
|   fe992fe [Holden Karau] hmmm try and fix up basic operation suite
|   172705c [Holden Karau] Fix Java API suite
|   caafa63 [Holden Karau] I think this might be a bad rabbit hole. Started work to make CoGroupedRDD use iterator and then went crazy
|   88b3329 [Holden Karau] Fix groupbykey to actually give back an iterator
|   4991af6 [Holden Karau] Fix some tests
|   be50246 [Holden Karau] Calling size on an iterator is not so good if we want to use it after
|   687ffbc [Holden Karau] This is the it compiles point of replacing Seq with Iterator and JList with JIterator in the groupby and cogroup signatures
* SPARK 1084.1 (resubmitted) (Sean Owen, 2014-02-27, 1 file, -7/+7)
|
|   (Ported from https://github.com/apache/incubator-spark/pull/637 )
|
|   Author: Sean Owen <sowen@cloudera.com>
|
|   Closes #31 from srowen/SPARK-1084.1 and squashes the following commits:
|
|   6c4a32c [Sean Owen] Suppress warnings about legitimate unchecked array creations, or change code to avoid it
|   f35b833 [Sean Owen] Fix two misc javadoc problems
|   254e8ef [Sean Owen] Fix one new style error introduced in scaladoc warning commit
|   5b2fce2 [Sean Owen] Fix scaladoc invocation warning, and enable javac warnings properly, with plugin config updates
|   007762b [Sean Owen] Remove dead scaladoc links
|   b8ff8cb [Sean Owen] Replace deprecated Ant <tasks> with <target>
* Merge pull request #567 from ScrapCodes/style2. (Prashant Sharma, 2014-02-09, 1 file, -1/+2)
|
|   SPARK-1058, Fix Style Errors and Add Scala Style to Spark Build. Pt 2
|
|   Continuation of PR #557
|
|   With this all scala style errors are fixed across the code base !!
|
|   The reason for creating a separate PR was to not interrupt an already reviewed and ready to merge PR. Hope this gets reviewed soon and merged too.
|
|   Author: Prashant Sharma <prashant.s@imaginea.com>
|
|   Closes #567 and squashes the following commits:
|
|   3b1ec30 [Prashant Sharma] scala style fixes
* Merge pull request #557 from ScrapCodes/style. Closes #557. (Patrick Wendell, 2014-02-09, 1 file, -23/+32)
|
|   SPARK-1058, Fix Style Errors and Add Scala Style to Spark Build.
|
|   Author: Patrick Wendell <pwendell@gmail.com>
|   Author: Prashant Sharma <scrapcodes@gmail.com>
|
|   == Merge branch commits ==
|
|   commit 1a8bd1c059b842cb95cc246aaea74a79fec684f4
|   Author: Prashant Sharma <scrapcodes@gmail.com>
|   Date: Sun Feb 9 17:39:07 2014 +0530
|
|       scala style fixes
|
|   commit f91709887a8e0b608c5c2b282db19b8a44d53a43
|   Author: Patrick Wendell <pwendell@gmail.com>
|   Date: Fri Jan 24 11:22:53 2014 -0800
|
|       Adding scalastyle snapshot
* Move some classes to more appropriate packages: (Matei Zaharia, 2013-09-01, 1 file, -1/+1)
|
|   * RDD, *RDDFunctions -> org.apache.spark.rdd
|   * Utils, ClosureCleaner, SizeEstimator -> org.apache.spark.util
|   * JavaSerializer, KryoSerializer -> org.apache.spark.serializer
* Initial work to rename package to org.apache.spark (Matei Zaharia, 2013-09-01, 1 file, -18/+17)
|
* Change build and run instructions to use assemblies (Matei Zaharia, 2013-08-29, 3 files, -447/+0)
|
|   This commit makes Spark invocation saner by using an assembly JAR to find all of Spark's dependencies instead of adding all the JARs in lib_managed. It also packages the examples into an assembly and uses that as SPARK_EXAMPLES_JAR. Finally, it replaces the old "run" script with two better-named scripts: "run-examples" for examples, and "spark-class" for Spark internal classes (e.g. REPL, master, etc). This is also designed to minimize the confusion people have in trying to use "run" to run their own classes; it's not meant to do that, but now at least if they look at it, they can modify run-examples to do a decent job for them.
|
|   As part of this, Bagel's examples are also now properly moved to the examples package instead of bagel.
* Add Apache license headers and LICENSE and NOTICE files (Matei Zaharia, 2013-07-16, 4 files, -0/+68)
|
* Fix passing of superstep in Bagel to avoid seeing new values of the (Matei Zaharia, 2013-04-08, 1 file, -3/+3)
|   superstep value upon recomputation, and set the default storage level in Bagel to MEMORY_AND_DISK
* Fix doc style (Nick Pentreath, 2013-03-11, 1 file, -7/+13)
|
* Added choice of persitance level to Bagel. Also added documentation. (Nick Pentreath, 2013-03-09, 1 file, -8/+83)
|
* Renamed "splits" to "partitions" (Matei Zaharia, 2013-02-17, 3 files, -14/+14)
|
* Formatting fixes (Matei Zaharia, 2013-02-11, 1 file, -13/+9)
|
* Some doc fixes, including showing version number in nav bar again (Matei Zaharia, 2012-10-13, 1 file, -0/+5)
|
* More doc updates, and moved Serializer to a subpackage. (Matei Zaharia, 2012-10-12, 1 file, -5/+6)
|
* Changed the way tasks' dependency files are sent to workers so that (Matei Zaharia, 2012-09-28, 1 file, -1/+4)
|   custom serializers or Kryo registrators can be loaded.
* Merge branch 'master' into dev (Matei Zaharia, 2012-06-15, 2 files, -3/+1)
|\
| * Performance improvements to shuffle operations: in particular, preserve (Matei Zaharia, 2012-06-09, 2 files, -3/+1)
| |   RDD partitioning in more cases where it's possible, and use iterators instead of materializing collections when doing joins.
* | Merge in engine improvements from the Spark Streaming project, developed (Matei Zaharia, 2012-06-07, 1 file, -5/+6)
|/
|   jointly with Tathagata Das and Haoyuan Li. This commit imports the changes and ports them to Mesos 0.9, but does not yet pass unit tests due to various classes not supporting a graceful stop() yet.
* Added an option (spark.closure.serializer) to specify the serializer for (Reynold Xin, 2012-04-09, 1 file, -0/+4)
|   closures. This enables using Kryo as the closure serializer.
* Implement standalone WikipediaPageRank with custom serializer (Ankur Dave, 2011-10-09, 1 file, -0/+198)
|
* Update WikipediaPageRank to reflect Bagel API changes (Ankur Dave, 2011-10-09, 2 files, -100/+129)
|
* Remove ShortestPath for now (Ankur Dave, 2011-10-09, 1 file, -95/+0)
|
* Simplify and genericize type parameters in Bagel (Ankur Dave, 2011-10-09, 1 file, -85/+129)
|
* Fix issue #65: Change @serializable to extends Serializable in 2.9 branch (Ismael Juma, 2011-08-02, 3 files, -21/+16)
|
|   Note that we use scala.Serializable introduced in Scala 2.9 instead of java.io.Serializable. Also, case classes inherit from scala.Serializable by default.
* Cleaned up a few issues to do with default parallelism levels. Also (Matei Zaharia, 2011-07-14, 1 file, -1/+1)
|   renamed HadoopFileWriter to HadoopWriter (since it's not only for files) and fixed a bug for lookup().
* Rename bagel to spark.bagel and Pregel to Bagel (Ankur Dave, 2011-05-09, 3 files, -14/+14)
|
* Move shortest path and PageRank to bagel.examples (Ankur Dave, 2011-05-03, 2 files, -2/+7)
|
* Refactor and add aggregator support (Ankur Dave, 2011-05-03, 3 files, -40/+82)
|
|   Refactored out the agg() and comp() methods from Pregel.run. Defined an implicit conversion to allow applications that don't use aggregators to avoid including a null argument for the result of the aggregator in the compute function.
* Package combiner functions into a trait (Ankur Dave, 2011-05-03, 3 files, -58/+58)
|
* Add Bagel test suite (Ankur Dave, 2011-05-03, 1 file, -0/+8)
|
|   Note: This test suite currently fails for the same reason that the Spark Core test suite fails: Spark currently seems to have a bug where any test after the first one fails.
* Clean up Bagel source and interface (Ankur Dave, 2011-05-03, 3 files, -124/+99)
|
* Update ShortestPath to work with controllable partitioning (Ankur Dave, 2011-05-03, 1 file, -9/+5)
|
* Clean up Pregel.run, add logging (Ankur Dave, 2011-05-03, 1 file, -26/+23)
|
* Add Bagel, an implementation of Pregel on Spark (Ankur Dave, 2011-05-03, 3 files, -0/+390)
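
Note on the model behind these commits: the log describes Bagel as an implementation of Pregel on Spark and mentions supersteps, combiner functions, and a ShortestPath example. The sketch below illustrates that general superstep-and-combiner pattern in plain, single-process Python. It is not the Bagel API and uses no Spark or RDDs; the names run_supersteps, shortest_path_compute, and the tuple returned by compute are hypothetical and chosen only for illustration.

```python
INF = float("inf")

def run_supersteps(vertices, edges, compute, combine, max_supersteps=30):
    """Run a Pregel-style loop (hypothetical helper, not Bagel's API).

    vertices: dict mapping vertex id -> current value
    edges:    dict mapping vertex id -> list of (target id, weight)
    compute:  fn(vid, value, msg, out_edges, superstep)
              -> (new_value, [(target, message)], stay_active)
    combine:  fn(msg_a, msg_b) -> one message, merging messages per target
    """
    vertices = dict(vertices)      # work on a copy, leave the input untouched
    inbox = {}                     # combined message per target vertex
    active = set(vertices)         # every vertex starts active
    superstep = 0
    while superstep < max_supersteps and (active or inbox):
        outbox, next_active = {}, set()
        for vid in vertices:
            # A halted vertex only wakes up when a message arrives for it.
            if vid not in active and vid not in inbox:
                continue
            new_value, outgoing, stay_active = compute(
                vid, vertices[vid], inbox.get(vid), edges.get(vid, []), superstep)
            vertices[vid] = new_value
            if stay_active:
                next_active.add(vid)
            for target, msg in outgoing:
                # The combiner keeps a single message per target.
                outbox[target] = msg if target not in outbox else combine(outbox[target], msg)
        inbox, active = outbox, next_active
        superstep += 1
    return vertices

def shortest_path_compute(vid, dist, msg, out_edges, superstep):
    """Single-source shortest path: relax on a better offer, then halt."""
    improved = msg is not None and msg < dist
    new_dist = msg if improved else dist
    # The source (distance 0) seeds the computation in superstep 0.
    send = improved or (superstep == 0 and dist == 0)
    outgoing = [(t, new_dist + w) for t, w in out_edges] if send else []
    return new_dist, outgoing, False

graph = {"a": [("b", 1), ("c", 4)], "b": [("c", 2), ("d", 7)], "c": [("d", 1)], "d": []}
start = {"a": 0, "b": INF, "c": INF, "d": INF}
print(run_supersteps(start, graph, shortest_path_compute, min))
```

Here min acts as the combiner: when b and c both send a distance to d in the same superstep, only the smaller one is delivered. The loop stops once no vertex is active and no messages are in flight.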