| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
| |
Unpersist useless rdd during bagel iteration to make full use of memory.
Author: Daoyuan <daoyuan.wang@intel.com>
Closes #1519 from adrian-wang/bagelunpersist and squashes the following commits:
182c9dd [Daoyuan] rename var nextUseless to lastRDD
87fd3a4 [Daoyuan] bagel unpersist old processed rdd
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a few changes based on the original patch by @scrapcodes.
Author: Prashant Sharma <prashant.s@imaginea.com>
Author: Patrick Wendell <pwendell@gmail.com>
Closes #785 from pwendell/package-docs and squashes the following commits:
c32b731 [Patrick Wendell] Changes based on Prashant's patch
c0463d3 [Prashant Sharma] added eof new line
ce8bf73 [Prashant Sharma] Added eof new line to all files.
4c35f2e [Prashant Sharma] SPARK-1563 Add package-info.java and package.scala files for all packages that appear in docs
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Author: Holden Karau <holden@pigscanfly.ca>
Closes #242 from holdenk/spark-1320-cogroupandgroupshouldpassiterator and squashes the following commits:
f289536 [Holden Karau] Fix bad merge, should have been Iterable rather than Iterator
77048f8 [Holden Karau] Fix merge up to master
d3fe909 [Holden Karau] use toSeq instead
7a092a3 [Holden Karau] switch resultitr to resultiterable
eb06216 [Holden Karau] maybe I should have had a coffee first. use correct import for guava iterables
c5075aa [Holden Karau] If guava 14 had iterables
2d06e10 [Holden Karau] Fix Java 8 cogroup tests for the new API
11e730c [Holden Karau] Fix streaming tests
66b583d [Holden Karau] Fix the core test suite to compile
4ed579b [Holden Karau] Refactor from iterator to iterable
d052c07 [Holden Karau] Python tests now pass with iterator pandas
3bcd81d [Holden Karau] Revert "Try and make pickling list iterators work"
cd1e81c [Holden Karau] Try and make pickling list iterators work
c60233a [Holden Karau] Start investigating moving to iterators for python API like the Java/Scala one. tl;dr: We will have to write our own iterator since the default one doesn't pickle well
88a5cef [Holden Karau] Fix cogroup test in JavaAPISuite for streaming
a5ee714 [Holden Karau] oops, was checking wrong iterator
e687f21 [Holden Karau] Fix groupbykey test in JavaAPISuite of streaming
ec8cc3e [Holden Karau] Fix test issues\!
4b0eeb9 [Holden Karau] Switch cast in PairDStreamFunctions
fa395c9 [Holden Karau] Revert "Add a join based on the problem in SVD"
ec99e32 [Holden Karau] Revert "Revert this but for now put things in list pandas"
b692868 [Holden Karau] Revert
7e533f7 [Holden Karau] Fix the bug
8a5153a [Holden Karau] Revert me, but we have some stuff to debug
b4e86a9 [Holden Karau] Add a join based on the problem in SVD
c4510e2 [Holden Karau] Revert this but for now put things in list pandas
b4e0b1d [Holden Karau] Fix style issues
71e8b9f [Holden Karau] I really need to stop calling size on iterators, it is the path of sadness.
b1ae51a [Holden Karau] Fix some of the types in the streaming JavaAPI suite. Probably still needs more work
37888ec [Holden Karau] core/tests now pass
249abde [Holden Karau] org.apache.spark.rdd.PairRDDFunctionsSuite passes
6698186 [Holden Karau] Revert "I think this might be a bad rabbit hole. Started work to make CoGroupedRDD use iterator and then went crazy"
fe992fe [Holden Karau] hmmm try and fix up basic operation suite
172705c [Holden Karau] Fix Java API suite
caafa63 [Holden Karau] I think this might be a bad rabbit hole. Started work to make CoGroupedRDD use iterator and then went crazy
88b3329 [Holden Karau] Fix groupbykey to actually give back an iterator
4991af6 [Holden Karau] Fix some tests
be50246 [Holden Karau] Calling size on an iterator is not so good if we want to use it after
687ffbc [Holden Karau] This is the it compiles point of replacing Seq with Iterator and JList with JIterator in the groupby and cogroup signatures
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
(Ported from https://github.com/apache/incubator-spark/pull/637 )
Author: Sean Owen <sowen@cloudera.com>
Closes #31 from srowen/SPARK-1084.1 and squashes the following commits:
6c4a32c [Sean Owen] Suppress warnings about legitimate unchecked array creations, or change code to avoid it
f35b833 [Sean Owen] Fix two misc javadoc problems
254e8ef [Sean Owen] Fix one new style error introduced in scaladoc warning commit
5b2fce2 [Sean Owen] Fix scaladoc invocation warning, and enable javac warnings properly, with plugin config updates
007762b [Sean Owen] Remove dead scaladoc links
b8ff8cb [Sean Owen] Replace deprecated Ant <tasks> with <target>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
SPARK-1058, Fix Style Errors and Add Scala Style to Spark Build. Pt 2
Continuation of PR #557
With this all scala style errors are fixed across the code base !!
The reason for creating a separate PR was to not interrupt an already reviewed and ready to merge PR. Hope this gets reviewed soon and merged too.
Author: Prashant Sharma <prashant.s@imaginea.com>
Closes #567 and squashes the following commits:
3b1ec30 [Prashant Sharma] scala style fixes
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
SPARK-1058, Fix Style Errors and Add Scala Style to Spark Build.
Author: Patrick Wendell <pwendell@gmail.com>
Author: Prashant Sharma <scrapcodes@gmail.com>
== Merge branch commits ==
commit 1a8bd1c059b842cb95cc246aaea74a79fec684f4
Author: Prashant Sharma <scrapcodes@gmail.com>
Date: Sun Feb 9 17:39:07 2014 +0530
scala style fixes
commit f91709887a8e0b608c5c2b282db19b8a44d53a43
Author: Patrick Wendell <pwendell@gmail.com>
Date: Fri Jan 24 11:22:53 2014 -0800
Adding scalastyle snapshot
|
|
|
|
|
|
| |
* RDD, *RDDFunctions -> org.apache.spark.rdd
* Utils, ClosureCleaner, SizeEstimator -> org.apache.spark.util
* JavaSerializer, KryoSerializer -> org.apache.spark.serializer
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commit makes Spark invocation saner by using an assembly JAR to
find all of Spark's dependencies instead of adding all the JARs in
lib_managed. It also packages the examples into an assembly and uses
that as SPARK_EXAMPLES_JAR. Finally, it replaces the old "run" script
with two better-named scripts: "run-examples" for examples, and
"spark-class" for Spark internal classes (e.g. REPL, master, etc). This
is also designed to minimize the confusion people have in trying to use
"run" to run their own classes; it's not meant to do that, but now at
least if they look at it, they can modify run-examples to do a decent
job for them.
As part of this, Bagel's examples are also now properly moved to the
examples package instead of bagel.
|
| |
|
|
|
|
|
| |
superstep value upon recomputation, and set the default storage level in
Bagel to MEMORY_AND_DISK
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
|
|
|
| |
custom serializers or Kryo registrators can be loaded.
|
|\ |
|
| |
| |
| |
| |
| | |
RDD partitioning in more cases where it's possible, and use iterators
instead of materializing collections when doing joins.
|
|/
|
|
|
|
| |
jointly with Tathagata Das and Haoyuan Li. This commit imports the changes
and ports them to Mesos 0.9, but does not yet pass unit tests due to
various classes not supporting a graceful stop() yet.
|
|
|
|
| |
closures. This enables using Kryo as the closure serializer.
|
| |
|
| |
|
| |
|
| |
|
|
|
|
|
|
| |
Note that we use scala.Serializable introduced in Scala 2.9 instead of
java.io.Serializable. Also, case classes inherit from scala.Serializable by
default.
|
|
|
|
|
| |
renamed HadoopFileWriter to HadoopWriter (since it's not only for files)
and fixed a bug for lookup().
|
| |
|
| |
|
|
|
|
|
|
|
|
| |
Refactored out the agg() and comp() methods from Pregel.run.
Defined an implicit conversion to allow applications that don't use
aggregators to avoid including a null argument for the result of the
aggregator in the compute function.
|
| |
|
|
|
|
|
|
| |
Note: This test suite currently fails for the same reason that the
Spark Core test suite fails: Spark currently seems to have a bug where
any test after the first one fails.
|
| |
|
| |
|
| |
|
|
|