spark - Mirror of Apache Spark

	Commit message	Author	Age	Files	Lines
*	[SPARK-2661][bagel]unpersist old processed rdd	Daoyuan	2014-07-24	1	-0/+5
\|	Unpersist useless rdd during bagel iteration to make full use of memory.
\|	Author: Daoyuan <daoyuan.wang@intel.com>
\|	Closes #1519 from adrian-wang/bagelunpersist and squashes the following commits:
\|	182c9dd [Daoyuan] rename var nextUseless to lastRDD
\|	87fd3a4 [Daoyuan] bagel unpersist old processed rdd
*	Package docs	Prashant Sharma	2014-05-14	2	-0/+44
\|	This is a few changes based on the original patch by @scrapcodes.
\|	Author: Prashant Sharma <prashant.s@imaginea.com>
\|	Author: Patrick Wendell <pwendell@gmail.com>
\|	Closes #785 from pwendell/package-docs and squashes the following commits:
\|	c32b731 [Patrick Wendell] Changes based on Prashant's patch
\|	c0463d3 [Prashant Sharma] added eof new line
\|	ce8bf73 [Prashant Sharma] Added eof new line to all files.
\|	4c35f2e [Prashant Sharma] SPARK-1563 Add package-info.java and package.scala files for all packages that appear in docs
*	Spark 1271: Co-Group and Group-By should pass Iterable[X]	Holden Karau	2014-04-08	1	-8/+12
\|	Author: Holden Karau <holden@pigscanfly.ca>
\|	Closes #242 from holdenk/spark-1320-cogroupandgroupshouldpassiterator and squashes the following commits:
\|	f289536 [Holden Karau] Fix bad merge, should have been Iterable rather than Iterator
\|	77048f8 [Holden Karau] Fix merge up to master
\|	d3fe909 [Holden Karau] use toSeq instead
\|	7a092a3 [Holden Karau] switch resultitr to resultiterable
\|	eb06216 [Holden Karau] maybe I should have had a coffee first. use correct import for guava iterables
\|	c5075aa [Holden Karau] If guava 14 had iterables
\|	2d06e10 [Holden Karau] Fix Java 8 cogroup tests for the new API
\|	11e730c [Holden Karau] Fix streaming tests
\|	66b583d [Holden Karau] Fix the core test suite to compile
\|	4ed579b [Holden Karau] Refactor from iterator to iterable
\|	d052c07 [Holden Karau] Python tests now pass with iterator pandas
\|	3bcd81d [Holden Karau] Revert "Try and make pickling list iterators work"
\|	cd1e81c [Holden Karau] Try and make pickling list iterators work
\|	c60233a [Holden Karau] Start investigating moving to iterators for python API like the Java/Scala one. tl;dr: We will have to write our own iterator since the default one doesn't pickle well
\|	88a5cef [Holden Karau] Fix cogroup test in JavaAPISuite for streaming
\|	a5ee714 [Holden Karau] oops, was checking wrong iterator
\|	e687f21 [Holden Karau] Fix groupbykey test in JavaAPISuite of streaming
\|	ec8cc3e [Holden Karau] Fix test issues!
\|	4b0eeb9 [Holden Karau] Switch cast in PairDStreamFunctions
\|	fa395c9 [Holden Karau] Revert "Add a join based on the problem in SVD"
\|	ec99e32 [Holden Karau] Revert "Revert this but for now put things in list pandas"
\|	b692868 [Holden Karau] Revert
\|	7e533f7 [Holden Karau] Fix the bug
\|	8a5153a [Holden Karau] Revert me, but we have some stuff to debug
\|	b4e86a9 [Holden Karau] Add a join based on the problem in SVD
\|	c4510e2 [Holden Karau] Revert this but for now put things in list pandas
\|	b4e0b1d [Holden Karau] Fix style issues
\|	71e8b9f [Holden Karau] I really need to stop calling size on iterators, it is the path of sadness.
\|	b1ae51a [Holden Karau] Fix some of the types in the streaming JavaAPI suite. Probably still needs more work
\|	37888ec [Holden Karau] core/tests now pass
\|	249abde [Holden Karau] org.apache.spark.rdd.PairRDDFunctionsSuite passes
\|	6698186 [Holden Karau] Revert "I think this might be a bad rabbit hole. Started work to make CoGroupedRDD use iterator and then went crazy"
\|	fe992fe [Holden Karau] hmmm try and fix up basic operation suite
\|	172705c [Holden Karau] Fix Java API suite
\|	caafa63 [Holden Karau] I think this might be a bad rabbit hole. Started work to make CoGroupedRDD use iterator and then went crazy
\|	88b3329 [Holden Karau] Fix groupbykey to actually give back an iterator
\|	4991af6 [Holden Karau] Fix some tests
\|	be50246 [Holden Karau] Calling size on an iterator is not so good if we want to use it after
\|	687ffbc [Holden Karau] This is the it compiles point of replacing Seq with Iterator and JList with JIterator in the groupby and cogroup signatures
*	SPARK 1084.1 (resubmitted)	Sean Owen	2014-02-27	1	-7/+7
\|	(Ported from https://github.com/apache/incubator-spark/pull/637 )
\|	Author: Sean Owen <sowen@cloudera.com>
\|	Closes #31 from srowen/SPARK-1084.1 and squashes the following commits:
\|	6c4a32c [Sean Owen] Suppress warnings about legitimate unchecked array creations, or change code to avoid it
\|	f35b833 [Sean Owen] Fix two misc javadoc problems
\|	254e8ef [Sean Owen] Fix one new style error introduced in scaladoc warning commit
\|	5b2fce2 [Sean Owen] Fix scaladoc invocation warning, and enable javac warnings properly, with plugin config updates
\|	007762b [Sean Owen] Remove dead scaladoc links
\|	b8ff8cb [Sean Owen] Replace deprecated Ant <tasks> with <target>
*	Merge pull request #567 from ScrapCodes/style2.	Prashant Sharma	2014-02-09	1	-1/+2
\|	SPARK-1058, Fix Style Errors and Add Scala Style to Spark Build. Pt 2
\|	Continuation of PR #557
\|	With this all scala style errors are fixed across the code base !!
\|	The reason for creating a separate PR was to not interrupt an already reviewed and ready to merge PR. Hope this gets reviewed soon and merged too.
\|	Author: Prashant Sharma <prashant.s@imaginea.com>
\|	Closes #567 and squashes the following commits:
\|	3b1ec30 [Prashant Sharma] scala style fixes
*	Merge pull request #557 from ScrapCodes/style. Closes #557.	Patrick Wendell	2014-02-09	1	-23/+32
\|	SPARK-1058, Fix Style Errors and Add Scala Style to Spark Build.
\|	Author: Patrick Wendell <pwendell@gmail.com>
\|	Author: Prashant Sharma <scrapcodes@gmail.com>
\|	== Merge branch commits ==
\|	commit 1a8bd1c059b842cb95cc246aaea74a79fec684f4
\|	Author: Prashant Sharma <scrapcodes@gmail.com>
\|	Date: Sun Feb 9 17:39:07 2014 +0530
\|	scala style fixes
\|	commit f91709887a8e0b608c5c2b282db19b8a44d53a43
\|	Author: Patrick Wendell <pwendell@gmail.com>
\|	Date: Fri Jan 24 11:22:53 2014 -0800
\|	Adding scalastyle snapshot
*	Move some classes to more appropriate packages:	Matei Zaharia	2013-09-01	1	-1/+1
\|	* RDD, RDDFunctions -> org.apache.spark.rdd
\|	* Utils, ClosureCleaner, SizeEstimator -> org.apache.spark.util
\|	* JavaSerializer, KryoSerializer -> org.apache.spark.serializer
*	Initial work to rename package to org.apache.spark	Matei Zaharia	2013-09-01	1	-18/+17
\|
*	Change build and run instructions to use assemblies	Matei Zaharia	2013-08-29	3	-447/+0
\|	This commit makes Spark invocation saner by using an assembly JAR to find all of Spark's dependencies instead of adding all the JARs in lib_managed. It also packages the examples into an assembly and uses that as SPARK_EXAMPLES_JAR.
\|	Finally, it replaces the old "run" script with two better-named scripts: "run-examples" for examples, and "spark-class" for Spark internal classes (e.g. REPL, master, etc). This is also designed to minimize the confusion people have in trying to use "run" to run their own classes; it's not meant to do that, but now at least if they look at it, they can modify run-examples to do a decent job for them.
\|	As part of this, Bagel's examples are also now properly moved to the examples package instead of bagel.
*	Add Apache license headers and LICENSE and NOTICE files	Matei Zaharia	2013-07-16	4	-0/+68
\|
*	Fix passing of superstep in Bagel to avoid seeing new values of the	Matei Zaharia	2013-04-08	1	-3/+3
\|	superstep value upon recomputation, and set the default storage level in Bagel to MEMORY_AND_DISK
*	Fix doc style	Nick Pentreath	2013-03-11	1	-7/+13
\|
*	Added choice of persitance level to Bagel. Also added documentation.	Nick Pentreath	2013-03-09	1	-8/+83
\|
*	Renamed "splits" to "partitions"	Matei Zaharia	2013-02-17	3	-14/+14
\|
*	Formatting fixes	Matei Zaharia	2013-02-11	1	-13/+9
\|
*	Some doc fixes, including showing version number in nav bar again	Matei Zaharia	2012-10-13	1	-0/+5
\|
*	More doc updates, and moved Serializer to a subpackage.	Matei Zaharia	2012-10-12	1	-5/+6
\|
*	Changed the way tasks' dependency files are sent to workers so that	Matei Zaharia	2012-09-28	1	-1/+4
\|	custom serializers or Kryo registrators can be loaded.
*	Merge branch 'master' into dev	Matei Zaharia	2012-06-15	2	-3/+1
\|\
\| *	Performance improvements to shuffle operations: in particular, preserve	Matei Zaharia	2012-06-09	2	-3/+1
\| \|	RDD partitioning in more cases where it's possible, and use iterators instead of materializing collections when doing joins.
* \|	Merge in engine improvements from the Spark Streaming project, developed	Matei Zaharia	2012-06-07	1	-5/+6
\|/	jointly with Tathagata Das and Haoyuan Li. This commit imports the changes and ports them to Mesos 0.9, but does not yet pass unit tests due to various classes not supporting a graceful stop() yet.
*	Added an option (spark.closure.serializer) to specify the serializer for	Reynold Xin	2012-04-09	1	-0/+4
\|	closures. This enables using Kryo as the closure serializer.
*	Implement standalone WikipediaPageRank with custom serializer	Ankur Dave	2011-10-09	1	-0/+198
\|
*	Update WikipediaPageRank to reflect Bagel API changes	Ankur Dave	2011-10-09	2	-100/+129
\|
*	Remove ShortestPath for now	Ankur Dave	2011-10-09	1	-95/+0
\|
*	Simplify and genericize type parameters in Bagel	Ankur Dave	2011-10-09	1	-85/+129
\|
*	Fix issue #65: Change @serializable to extends Serializable in 2.9 branch	Ismael Juma	2011-08-02	3	-21/+16
\|	Note that we use scala.Serializable introduced in Scala 2.9 instead of java.io.Serializable. Also, case classes inherit from scala.Serializable by default.
*	Cleaned up a few issues to do with default parallelism levels. Also	Matei Zaharia	2011-07-14	1	-1/+1
\|	renamed HadoopFileWriter to HadoopWriter (since it's not only for files) and fixed a bug for lookup().
*	Rename bagel to spark.bagel and Pregel to Bagel	Ankur Dave	2011-05-09	3	-14/+14
\|
*	Move shortest path and PageRank to bagel.examples	Ankur Dave	2011-05-03	2	-2/+7
\|
*	Refactor and add aggregator support	Ankur Dave	2011-05-03	3	-40/+82
\|	Refactored out the agg() and comp() methods from Pregel.run. Defined an implicit conversion to allow applications that don't use aggregators to avoid including a null argument for the result of the aggregator in the compute function.
*	Package combiner functions into a trait	Ankur Dave	2011-05-03	3	-58/+58
\|
*	Add Bagel test suite	Ankur Dave	2011-05-03	1	-0/+8
\|	Note: This test suite currently fails for the same reason that the Spark Core test suite fails: Spark currently seems to have a bug where any test after the first one fails.
*	Clean up Bagel source and interface	Ankur Dave	2011-05-03	3	-124/+99
\|
*	Update ShortestPath to work with controllable partitioning	Ankur Dave	2011-05-03	1	-9/+5
\|
*	Clean up Pregel.run, add logging	Ankur Dave	2011-05-03	1	-26/+23
\|
*	Add Bagel, an implementation of Pregel on Spark	Ankur Dave	2011-05-03	3	-0/+390