spark - Mirror of Apache Spark

	Commit message	Author	Age	Files	Lines
*	[SPARK-6758]block the right jetty package in log	WangTaoTheTonic	2015-04-09	1	-1/+1
\|	https://issues.apache.org/jira/browse/SPARK-6758
\|	I am not sure if it is ok to block them in test resources too (as we shade jetty in assembly?).
\|	Author: WangTaoTheTonic <wangtao111@huawei.com>
\|	Closes #5406 from WangTaoTheTonic/SPARK-6758 and squashes the following commits:
\|	e09605b [WangTaoTheTonic] block the right jetty package
*	SPARK-4159 [CORE] Maven build doesn't run JUnit test suites	Sean Owen	2015-01-06	1	-2/+2
\|	This PR:
\|	- Reenables `surefire`, and copies config from `scalatest` (which is itself an old fork of `surefire`, so similar)
\|	- Tells `surefire` to test only Java tests
\|	- Enables `surefire` and `scalatest` for all children, and in turn eliminates some duplication.
\|	For me this causes the Scala and Java tests to be run once each, it seems, as desired. It doesn't affect the SBT build but works for Maven. I still need to verify that all of the Scala tests and Java tests are being run.
\|	Author: Sean Owen <sowen@cloudera.com>
\|	Closes #3651 from srowen/SPARK-4159 and squashes the following commits:
\|	2e8a0af [Sean Owen] Remove specialized SPARK_HOME setting for REPL, YARN tests as it appears to be obsolete
\|	12e4558 [Sean Owen] Append to unit-test.log instead of overwriting, so that both surefire and scalatest output is preserved. Also standardize/correct comments a bit.
\|	e6f8601 [Sean Owen] Reenable Java tests by reenabling surefire with config cloned from scalatest; centralize test config in the parent
*	[SPARK-3748] Log thread name in unit test logs	Reynold Xin	2014-10-01	1	-1/+1
\|	Thread names are useful for correlating failures.
\|	Author: Reynold Xin <rxin@apache.org>
\|	Closes #2600 from rxin/log4j and squashes the following commits:
\|	83ffe88 [Reynold Xin] [SPARK-3748] Log thread name in unit test logs
*	SPARK-2482: Resolve sbt warnings during build	witgo	2014-09-11	1	-2/+0
\|	Importing `scala.language.postfixOps` and `org.scalatest.time.SpanSugar._` at the same time causes `scala.language.postfixOps` to stop working.
\|	Author: witgo <witgo@qq.com>
\|	Closes #1330 from witgo/sbt_warnings3 and squashes the following commits:
\|	179ba61 [witgo] Resolve sbt warnings during build
*	[SPARK-2661][bagel]unpersist old processed rdd	Daoyuan	2014-07-24	1	-0/+5
\|	Unpersist useless rdd during bagel iteration to make full use of memory.
\|	Author: Daoyuan <daoyuan.wang@intel.com>
\|	Closes #1519 from adrian-wang/bagelunpersist and squashes the following commits:
\|	182c9dd [Daoyuan] rename var nextUseless to lastRDD
\|	87fd3a4 [Daoyuan] bagel unpersist old processed rdd
*	HOTFIX: Increase time limit for Bagel test	Ankur Dave	2014-06-10	1	-2/+2
\|	The test was timing out on some slow EC2 workers.
\|	Author: Ankur Dave <ankurdave@gmail.com>
\|	Closes #1037 from ankurdave/bagel-test-time-limit and squashes the following commits:
\|	67fd487 [Ankur Dave] Increase time limit for Bagel test
*	[SPARK-1942] Stop clearing spark.driver.port in unit tests	Syed Hashmi	2014-06-03	1	-2/+0
\|	stop resetting spark.driver.port in unit tests (scala, java and python).
\|	Author: Syed Hashmi <shashmi@cloudera.com>
\|	Author: CodingCat <zhunansjtu@gmail.com>
\|	Closes #943 from syedhashmi/master and squashes the following commits:
\|	885f210 [Syed Hashmi] Removing unnecessary file (created by mergetool)
\|	b8bd4b5 [Syed Hashmi] Merge remote-tracking branch 'upstream/master'
\|	b895e59 [Syed Hashmi] Revert "[SPARK-1784] Add a new partitioner"
\|	57b6587 [Syed Hashmi] Revert "[SPARK-1784] Add a balanced partitioner"
\|	1574769 [Syed Hashmi] [SPARK-1942] Stop clearing spark.driver.port in unit tests
\|	4354836 [Syed Hashmi] Revert "SPARK-1686: keep schedule() calling in the main thread"
\|	fd36542 [Syed Hashmi] [SPARK-1784] Add a balanced partitioner
\|	6668015 [CodingCat] SPARK-1686: keep schedule() calling in the main thread
\|	4ca94cc [Syed Hashmi] [SPARK-1784] Add a new partitioner
*	Package docs	Prashant Sharma	2014-05-14	2	-0/+44
\|	This is a few changes based on the original patch by @scrapcodes.
\|	Author: Prashant Sharma <prashant.s@imaginea.com>
\|	Author: Patrick Wendell <pwendell@gmail.com>
\|	Closes #785 from pwendell/package-docs and squashes the following commits:
\|	c32b731 [Patrick Wendell] Changes based on Prashant's patch
\|	c0463d3 [Prashant Sharma] added eof new line
\|	ce8bf73 [Prashant Sharma] Added eof new line to all files.
\|	4c35f2e [Prashant Sharma] SPARK-1563 Add package-info.java and package.scala files for all packages that appear in docs
*	SPARK-1798. Tests should clean up temp files	Sean Owen	2014-05-12	1	-1/+1
\|	Three issues related to temp files that tests generate – these should be touched up for hygiene but are not urgent.
\|	Modules have a log4j.properties which directs the unit-test.log output file to a directory like `[module]/target/unit-test.log`. But this ends up creating `[module]/[module]/target/unit-test.log` instead of the former.
\|	The `work/` directory is not deleted by "mvn clean", in the parent and in modules. Neither is the `checkpoint/` directory created under the various external modules.
\|	Many tests create a temp directory, which is not usually deleted. This can be largely resolved by calling `deleteOnExit()` at creation and trying to call `Utils.deleteRecursively` consistently to clean up, sometimes in an `@After` method.
\|	_If anyone seconds the motion, I can create a more significant change that introduces a new test trait along the lines of `LocalSparkContext`, which provides management of temp directories for subclasses to take advantage of._
\|	Author: Sean Owen <sowen@cloudera.com>
\|	Closes #732 from srowen/SPARK-1798 and squashes the following commits:
\|	5af578e [Sean Owen] Try to consistently delete test temp dirs and files, and set deleteOnExit() for each
\|	b21b356 [Sean Owen] Remove work/ and checkpoint/ dirs with mvn clean
\|	bdd0f41 [Sean Owen] Remove duplicate module dir in log4j.properties output path for tests
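The temp-directory cleanup described in the SPARK-1798 message can be sketched in a few lines. This is only an illustration in Python, not the Scala/Java code from the commit: `create_temp_dir` and `delete_recursively` are hypothetical helper names, with an `atexit` hook standing in for the JVM's `deleteOnExit()` and `shutil.rmtree` standing in for `Utils.deleteRecursively`.

```python
import atexit
import os
import shutil
import tempfile


def create_temp_dir(prefix="spark-test-"):
    """Create a temp directory and register a best-effort cleanup at exit.

    Hypothetical helper: the atexit hook is a safety net in case the test
    never cleans up itself (the role deleteOnExit() plays on the JVM).
    """
    path = tempfile.mkdtemp(prefix=prefix)
    atexit.register(shutil.rmtree, path, ignore_errors=True)
    return path


def delete_recursively(path):
    """Delete a directory tree immediately (assumed analogue of Utils.deleteRecursively)."""
    shutil.rmtree(path, ignore_errors=True)


# Usage in a test: create, use, then clean up explicitly (e.g. in a teardown method).
temp_dir = create_temp_dir()
with open(os.path.join(temp_dir, "part-00000"), "w") as f:
    f.write("data")
existed_before = os.path.isdir(temp_dir)
delete_recursively(temp_dir)
exists_after = os.path.exists(temp_dir)
```

Registering the exit hook at creation time and still deleting explicitly in teardown gives two chances to clean up, which matches the "at creation" plus "consistently clean up" approach the message describes.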
*	SPARK-1488. Resolve scalac feature warnings during build	Sean Owen	2014-04-14	1	-0/+2
\|	For your consideration: scalac currently notes a number of feature warnings during compilation:
\|	```
\|	[warn] there were 65 feature warning(s); re-run with -feature for details
\|	```
\|	Warnings are like:
\|	```
\|	[warn] /Users/srowen/Documents/spark/core/src/main/scala/org/apache/spark/SparkContext.scala:1261: implicit conversion method rddToPairRDDFunctions should be enabled
\|	[warn] by making the implicit value scala.language.implicitConversions visible.
\|	[warn] This can be achieved by adding the import clause 'import scala.language.implicitConversions'
\|	[warn] or by setting the compiler option -language:implicitConversions.
\|	[warn] See the Scala docs for value scala.language.implicitConversions for a discussion
\|	[warn] why the feature should be explicitly enabled.
\|	[warn] implicit def rddToPairRDDFunctions[K: ClassTag, V: ClassTag](rdd: RDD[(K, V)]) =
\|	[warn] ^
\|	```
\|	scalac is suggesting that it's just best practice to explicitly enable certain language features by importing them where used. This PR simply adds the imports it suggests (and squashes one other Java warning along the way). This leaves just deprecation warnings in the build.
\|	Author: Sean Owen <sowen@cloudera.com>
\|	Closes #404 from srowen/SPARK-1488 and squashes the following commits:
\|	8598980 [Sean Owen] Quiet scalac warnings about language features by explicitly importing language features.
\|	39bc831 [Sean Owen] Enable -feature in scalac to emit language feature warnings
*	Remove Unnecessary Whitespace's	Sandeep	2014-04-10	1	-2/+2
\|	stack these together in a commit else they show up chunk by chunk in different commits.
\|	Author: Sandeep <sandeep@techaddict.me>
\|	Closes #380 from techaddict/white_space and squashes the following commits:
\|	b58f294 [Sandeep] Remove Unnecessary Whitespace's
*	Spark 1271: Co-Group and Group-By should pass Iterable[X]	Holden Karau	2014-04-08	1	-8/+12
\|	Author: Holden Karau <holden@pigscanfly.ca>
\|	Closes #242 from holdenk/spark-1320-cogroupandgroupshouldpassiterator and squashes the following commits:
\|	f289536 [Holden Karau] Fix bad merge, should have been Iterable rather than Iterator
\|	77048f8 [Holden Karau] Fix merge up to master
\|	d3fe909 [Holden Karau] use toSeq instead
\|	7a092a3 [Holden Karau] switch resultitr to resultiterable
\|	eb06216 [Holden Karau] maybe I should have had a coffee first. use correct import for guava iterables
\|	c5075aa [Holden Karau] If guava 14 had iterables
\|	2d06e10 [Holden Karau] Fix Java 8 cogroup tests for the new API
\|	11e730c [Holden Karau] Fix streaming tests
\|	66b583d [Holden Karau] Fix the core test suite to compile
\|	4ed579b [Holden Karau] Refactor from iterator to iterable
\|	d052c07 [Holden Karau] Python tests now pass with iterator pandas
\|	3bcd81d [Holden Karau] Revert "Try and make pickling list iterators work"
\|	cd1e81c [Holden Karau] Try and make pickling list iterators work
\|	c60233a [Holden Karau] Start investigating moving to iterators for python API like the Java/Scala one. tl;dr: We will have to write our own iterator since the default one doesn't pickle well
\|	88a5cef [Holden Karau] Fix cogroup test in JavaAPISuite for streaming
\|	a5ee714 [Holden Karau] oops, was checking wrong iterator
\|	e687f21 [Holden Karau] Fix groupbykey test in JavaAPISuite of streaming
\|	ec8cc3e [Holden Karau] Fix test issues!
\|	4b0eeb9 [Holden Karau] Switch cast in PairDStreamFunctions
\|	fa395c9 [Holden Karau] Revert "Add a join based on the problem in SVD"
\|	ec99e32 [Holden Karau] Revert "Revert this but for now put things in list pandas"
\|	b692868 [Holden Karau] Revert
\|	7e533f7 [Holden Karau] Fix the bug
\|	8a5153a [Holden Karau] Revert me, but we have some stuff to debug
\|	b4e86a9 [Holden Karau] Add a join based on the problem in SVD
\|	c4510e2 [Holden Karau] Revert this but for now put things in list pandas
\|	b4e0b1d [Holden Karau] Fix style issues
\|	71e8b9f [Holden Karau] I really need to stop calling size on iterators, it is the path of sadness.
\|	b1ae51a [Holden Karau] Fix some of the types in the streaming JavaAPI suite. Probably still needs more work
\|	37888ec [Holden Karau] core/tests now pass
\|	249abde [Holden Karau] org.apache.spark.rdd.PairRDDFunctionsSuite passes
\|	6698186 [Holden Karau] Revert "I think this might be a bad rabbit hole. Started work to make CoGroupedRDD use iterator and then went crazy"
\|	fe992fe [Holden Karau] hmmm try and fix up basic operation suite
\|	172705c [Holden Karau] Fix Java API suite
\|	caafa63 [Holden Karau] I think this might be a bad rabbit hole. Started work to make CoGroupedRDD use iterator and then went crazy
\|	88b3329 [Holden Karau] Fix groupbykey to actually give back an iterator
\|	4991af6 [Holden Karau] Fix some tests
\|	be50246 [Holden Karau] Calling size on an iterator is not so good if we want to use it after
\|	687ffbc [Holden Karau] This is the it compiles point of replacing Seq with Iterator and JList with JIterator in the groupby and cogroup signatures
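Several of the squashed commits above note that taking the size of an iterator consumes it, which is one reason the change settled on `Iterable` rather than `Iterator`. The sketch below illustrates that difference in plain Python; `group_by_key` is a hypothetical stand-in, not the Spark `groupByKey` implementation.

```python
def group_by_key(pairs):
    """Group (key, value) pairs; each group is a list, i.e. a re-iterable collection.

    Hypothetical stand-in for a grouping operation, not Spark's implementation.
    """
    groups = {}
    for key, value in pairs:
        groups.setdefault(key, []).append(value)
    return groups


pairs = [("a", 1), ("b", 2), ("a", 3)]
grouped = group_by_key(pairs)

# A one-shot iterator: counting its elements consumes it.
one_shot = iter(grouped["a"])
count = sum(1 for _ in one_shot)
leftover = list(one_shot)  # nothing is left after counting

# An iterable can be traversed again after its size has been taken.
values = grouped["a"]
size = len(values)
again = list(values)
```

Returning something re-iterable lets callers check the size and still read the values afterwards, at the cost of not being able to stream each group lazily.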
*	SPARK 1084.1 (resubmitted)	Sean Owen	2014-02-27	1	-7/+7
\|	(Ported from https://github.com/apache/incubator-spark/pull/637 )
\|	Author: Sean Owen <sowen@cloudera.com>
\|	Closes #31 from srowen/SPARK-1084.1 and squashes the following commits:
\|	6c4a32c [Sean Owen] Suppress warnings about legitimate unchecked array creations, or change code to avoid it
\|	f35b833 [Sean Owen] Fix two misc javadoc problems
\|	254e8ef [Sean Owen] Fix one new style error introduced in scaladoc warning commit
\|	5b2fce2 [Sean Owen] Fix scaladoc invocation warning, and enable javac warnings properly, with plugin config updates
\|	007762b [Sean Owen] Remove dead scaladoc links
\|	b8ff8cb [Sean Owen] Replace deprecated Ant <tasks> with <target>
*	Merge pull request #567 from ScrapCodes/style2.	Prashant Sharma	2014-02-09	1	-1/+2
\|	SPARK-1058, Fix Style Errors and Add Scala Style to Spark Build. Pt 2
\|	Continuation of PR #557
\|	With this all scala style errors are fixed across the code base!! The reason for creating a separate PR was to not interrupt an already reviewed and ready to merge PR. Hope this gets reviewed soon and merged too.
\|	Author: Prashant Sharma <prashant.s@imaginea.com>
\|	Closes #567 and squashes the following commits:
\|	3b1ec30 [Prashant Sharma] scala style fixes
*	Merge pull request #557 from ScrapCodes/style. Closes #557.	Patrick Wendell	2014-02-09	1	-23/+32
\|	SPARK-1058, Fix Style Errors and Add Scala Style to Spark Build.
\|	Author: Patrick Wendell <pwendell@gmail.com>
\|	Author: Prashant Sharma <scrapcodes@gmail.com>
\|	== Merge branch commits ==
\|	commit 1a8bd1c059b842cb95cc246aaea74a79fec684f4
\|	Author: Prashant Sharma <scrapcodes@gmail.com>
\|	Date: Sun Feb 9 17:39:07 2014 +0530
\|	scala style fixes
\|	commit f91709887a8e0b608c5c2b282db19b8a44d53a43
\|	Author: Patrick Wendell <pwendell@gmail.com>
\|	Date: Fri Jan 24 11:22:53 2014 -0800
\|	Adding scalastyle snapshot
*	Removing mentions in tests	Patrick Wendell	2014-01-12	1	-1/+0
\|
*	Move some classes to more appropriate packages:	Matei Zaharia	2013-09-01	1	-1/+1
\|	* RDD, RDDFunctions -> org.apache.spark.rdd
\|	* Utils, ClosureCleaner, SizeEstimator -> org.apache.spark.util
\|	* JavaSerializer, KryoSerializer -> org.apache.spark.serializer
*	Initial work to rename package to org.apache.spark	Matei Zaharia	2013-09-01	2	-24/+21
\|
*	Change build and run instructions to use assemblies	Matei Zaharia	2013-08-29	3	-447/+0
\|	This commit makes Spark invocation saner by using an assembly JAR to find all of Spark's dependencies instead of adding all the JARs in lib_managed. It also packages the examples into an assembly and uses that as SPARK_EXAMPLES_JAR. Finally, it replaces the old "run" script with two better-named scripts: "run-examples" for examples, and "spark-class" for Spark internal classes (e.g. REPL, master, etc).
\|	This is also designed to minimize the confusion people have in trying to use "run" to run their own classes; it's not meant to do that, but now at least if they look at it, they can modify run-examples to do a decent job for them.
\|	As part of this, Bagel's examples are also now properly moved to the examples package instead of bagel.
*	Add Apache license headers and LICENSE and NOTICE files	Matei Zaharia	2013-07-16	6	-1/+103
\|
*	Attempt to fix streaming test failures after yarn branch merge	Mridul Muralidharan	2013-04-28	1	-0/+1
\|
*	Fix passing of superstep in Bagel to avoid seeing new values of the	Matei Zaharia	2013-04-08	1	-3/+3
\|	superstep value upon recomputation, and set the default storage level in Bagel to MEMORY_AND_DISK
*	Fix doc style	Nick Pentreath	2013-03-11	1	-7/+13
\|
*	Adding test for non-default persistence level	Nick Pentreath	2013-03-09	1	-0/+18
\|
*	Added choice of persistence level to Bagel. Also added documentation.	Nick Pentreath	2013-03-09	1	-8/+83
\|
*	Renamed "splits" to "partitions"	Matei Zaharia	2013-02-17	3	-14/+14
\|
*	Formatting fixes	Matei Zaharia	2013-02-11	1	-13/+9
\|
*	Fixed an exponential recursion that could happen with doCheckpoint due	Matei Zaharia	2013-02-11	1	-8/+27
\|	to lack of memoization
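The effect of the missing memoization can be illustrated with a small sketch. This is not Spark's `doCheckpoint` code: `Node`, `visit_naive`, `visit_memoized`, and `build_chain` are hypothetical names, and the graph shape (each node depending twice on its predecessor) is an assumed worst case chosen to make the blow-up visible.

```python
class Node:
    """Hypothetical stand-in for an RDD: a node in a dependency DAG."""

    def __init__(self, name, parents=()):
        self.name = name
        self.parents = list(parents)
        self.visited = False  # memoization flag


def visit_naive(node, counter):
    """Recurse into every parent each time; shared ancestors are revisited."""
    counter[0] += 1
    for parent in node.parents:
        visit_naive(parent, counter)


def visit_memoized(node, counter):
    """Skip nodes that were already processed, so each node is handled once."""
    if node.visited:
        return
    node.visited = True
    counter[0] += 1
    for parent in node.parents:
        visit_memoized(parent, counter)


def build_chain(depth):
    """Build a chain where each node depends twice on the previous one."""
    node = Node("n0")
    for i in range(1, depth + 1):
        node = Node(f"n{i}", parents=[node, node])
    return node


naive_calls = [0]
visit_naive(build_chain(10), naive_calls)

memo_calls = [0]
visit_memoized(build_chain(10), memo_calls)
```

With a depth of 10 the naive traversal makes 2^11 - 1 = 2047 visits, while the memoized one visits each of the 11 nodes once.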
*	Replace old 'master' term with 'driver'.	Stephen Haberman	2013-01-25	1	-1/+1
\|
*	Changed locations for unit test logs.	Tathagata Das	2013-01-07	1	-2/+2
\|
*	Some doc fixes, including showing version number in nav bar again	Matei Zaharia	2012-10-13	1	-0/+5
\|
*	More doc updates, and moved Serializer to a subpackage.	Matei Zaharia	2012-10-12	1	-5/+6
\|
*	Removed the need to sleep in tests due to waiting for Akka to shut down	Matei Zaharia	2012-10-07	1	-0/+2
\|
*	Write all unit test output to a file	Matei Zaharia	2012-10-01	1	-4/+6
\|
*	Changed the way tasks' dependency files are sent to workers so that	Matei Zaharia	2012-09-28	1	-1/+4
\|	custom serializers or Kryo registrators can be loaded.
*	Set log level in tests to WARN	Matei Zaharia	2012-08-23	1	-0/+8
\|
*	Fix further issues with tests and broadcast.	Matei Zaharia	2012-08-23	1	-1/+4
\|	The broadcast fix is to store values as MEMORY_ONLY_DESER instead of MEMORY_ONLY, which will save substantial time on serialization.
*	Stylistic changes	Denny	2012-07-23	1	-2/+2
\|	Conflicts:
\|	core/src/test/scala/spark/MesosSchedulerSuite.scala
*	Always destroy SparkContext in after block for the unit tests.	Denny	2012-07-23	1	-6/+11
\|	Conflicts:
\|	core/src/test/scala/spark/ShuffleSuite.scala
*	Merge branch 'master' into dev	Matei Zaharia	2012-06-15	2	-3/+1
\|\
\| *	Performance improvements to shuffle operations: in particular, preserve	Matei Zaharia	2012-06-09	2	-3/+1
\| \| \| \| \| \| \| \| \| \|	RDD partitioning in more cases where it's possible, and use iterators instead of materializing collections when doing joins.
* \|	Merge in engine improvements from the Spark Streaming project, developed	Matei Zaharia	2012-06-07	1	-5/+6
\|/ \| \| \| \| \|	jointly with Tathagata Das and Haoyuan Li. This commit imports the changes and ports them to Mesos 0.9, but does not yet pass unit tests due to various classes not supporting a graceful stop() yet.
*	Added an option (spark.closure.serializer) to specify the serializer for	Reynold Xin	2012-04-09	1	-0/+4
\|	closures. This enables using Kryo as the closure serializer.
*	Update Bagel unit tests to reflect API change	Ankur Dave	2011-11-08	1	-23/+21
\|
*	Implement standalone WikipediaPageRank with custom serializer	Ankur Dave	2011-10-09	1	-0/+198
\|
*	Update WikipediaPageRank to reflect Bagel API changes	Ankur Dave	2011-10-09	2	-100/+129
\|
*	Remove ShortestPath for now	Ankur Dave	2011-10-09	1	-95/+0
\|
*	Simplify and genericize type parameters in Bagel	Ankur Dave	2011-10-09	1	-85/+129
\|
*	Fix issue #65: Change @serializable to extends Serializable in 2.9 branch	Ismael Juma	2011-08-02	4	-23/+18
\|	Note that we use scala.Serializable introduced in Scala 2.9 instead of java.io.Serializable. Also, case classes inherit from scala.Serializable by default.
*	Cleaned up a few issues to do with default parallelism levels. Also	Matei Zaharia	2011-07-14	1	-1/+1
\|	renamed HadoopFileWriter to HadoopWriter (since it's not only for files) and fixed a bug for lookup().