Commit message | Author | Age | Files | Lines
* Added some stuff to .gitignore | Matei Zaharia | 2010-11-13 | 1 | -0/+3
|
* Undo JLine fix that turns out to only be needed when buildr is running on JRuby. | Matei Zaharia | 2010-11-13 | 2 | -2/+3
|     This is quite ugly: JRuby has its own version of JLine which is older than Scala's, and JLine changed API in such a way that code written for the new version won't compile with the old one and vice versa. Sadly, this might be a reason to drop buildr, unless we can package a JRuby with it that uses the right version, or we ask people to only use the C Ruby version of buildr (which doesn't work on OS X right now!)
* Modified project structure to work with buildr | Matei Zaharia | 2010-11-13 | 52 | -7/+47
|
* Added a shuffle test with negative hash codes for some keys (this was a bug earlier) | Matei Zaharia | 2010-11-12 | 1 | -0/+11
|
* Unit tests for shuffle operations. Fixes #33. | Matei Zaharia | 2010-11-12 | 1 | -0/+119
|
* Added options for using an external HTTP server with LocalFileShuffle | Matei Zaharia | 2010-11-09 | 3 | -19/+41
|
* Removed unnecessary collectAsMap | Matei Zaharia | 2010-11-08 | 1 | -4/+2
|
* Made shuffle algorithm pluggable and added LocalFileShuffle. | Matei Zaharia | 2010-11-08 | 6 | -56/+237
|
* Create output files one by one instead of at the same time in the map phase of DfsShuffle. | Matei Zaharia | 2010-11-06 | 1 | -12/+11
|
* Merge branch 'matei-shuffle' of github.com:mesos/spark into matei-shuffle | Matei Zaharia | 2010-11-04 | 0 | -0/+0
|\
| * Fixed a small bug in DFS shuffle -- the number of reduce tasks was not being set based on numOutputSplits | root | 2010-11-04 | 1 | -1/+2
| |
* | Properly set the number of output splits in DFS shuffle | Matei Zaharia | 2010-11-04 | 1 | -1/+2
|/
* Added groupBy function in RDD | Matei Zaharia | 2010-11-03 | 1 | -1/+9
|
* Added reduceByKey, groupByKey and join operations based on combine, as well as versions of the shuffle operations that set the number of splits automatically. | Matei Zaharia | 2010-11-03 | 4 | -56/+115
|
* Fixed a bug with negative hashcodes | Matei Zaharia | 2010-11-03 | 1 | -1/+4
|
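The commit above does not say how the negative-hashcode bug was fixed. As a minimal sketch of the usual pitfall, assuming keys are assigned to output splits by `hashCode % numSplits`: on the JVM the `%` operator can return a negative remainder, which is not a valid split index. The names `HashPartitioning` and `bucketFor` are hypothetical and are not taken from the Spark source.

```scala
object HashPartitioning {
  // Hypothetical helper: map a non-null key to a split index in [0, numSplits).
  def bucketFor(key: Any, numSplits: Int): Int = {
    require(numSplits > 0, "numSplits must be positive")
    // The remainder takes the sign of the dividend, so it can be negative.
    val mod = key.hashCode % numSplits
    // Shift negative remainders into the valid range.
    if (mod < 0) mod + numSplits else mod
  }
}
```

Taking `math.abs` of the hash code first is not a safe alternative, because `math.abs(Int.MinValue)` is still negative.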
* Made DFS shuffle's "reduce tasks" fetch inputs in a random order so they don't all hit the same nodes at the same time. | Matei Zaharia | 2010-11-03 | 2 | -5/+21
|
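The idea in the commit above can be sketched as follows, assuming each reduce task must fetch one input per map-side split and that the inputs are identified by index. If every reduce task walked the indices in the same order, they would all contact the same node at the same time; shuffling the order per task spreads the load. The names `FetchOrder` and `shuffledInputs` are hypothetical, and this is not the code from the commit.

```scala
import scala.util.Random

object FetchOrder {
  // Hypothetical helper: return the input indices 0 until numInputSplits
  // in a random order, one independent permutation per call.
  def shuffledInputs(numInputSplits: Int, rng: Random = new Random): Seq[Int] =
    rng.shuffle((0 until numInputSplits).toList)
}
```

A reduce task would then iterate over `FetchOrder.shuffledInputs(n)` instead of `0 until n` when fetching its inputs.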
* Initial work towards a simple HDFS-based shuffle. | Matei Zaharia | 2010-11-03 | 3 | -1/+155
|
* Made alltests write test output as XML in build/test_results | Matei Zaharia | 2010-11-02 | 1 | -1/+6
|
* 'Running on Mesos' test is now only run when MESOS_HOME is set | Matei Zaharia | 2010-11-02 | 1 | -4/+2
|
* Added initial attempt at a BoundedMemoryCache | Matei Zaharia | 2010-10-24 | 1 | -0/+69
|
* Added SizeEstimator class for use by caches | Matei Zaharia | 2010-10-24 | 1 | -0/+160
|
* Made caching pluggable and added soft reference and weak reference caches. | Matei Zaharia | 2010-10-23 | 7 | -12/+98
|
* Renamed aggregateSplit() to splitRdd(), plus some style fixes | Matei Zaharia | 2010-10-23 | 1 | -7/+14
|
* Fixed a bug with scheduling of tasks that have no locality preferences. | Matei Zaharia | 2010-10-19 | 1 | -9/+26
|     These tasks were being subjected to delay scheduling but then counted as having been launched on a preferred node. The solution is to have a separate queue for them and treat them as preferred during scheduling.
* Undid some changes that Mosharaf inadvertently committed to master. | Matei Zaharia | 2010-10-19 | 3 | -3/+2
|
* Merge branch 'master' of git@github.com:mesos/spark | Mosharaf Chowdhury | 2010-10-18 | 23 | -508/+922
|\    Conflicts: src/scala/spark/SparkContext.scala
| |   Using the latest one from Matei.
| * Less hacky way of preventing config files from being overwritten when a template file changes | Matei Zaharia | 2010-10-16 | 1 | -2/+2
| |
| * Changed the config files that were included in git to templates which are used to create an initial copy of each config file if the user does not have one. | Matei Zaharia | 2010-10-16 | 4 | -3/+10
| |   This way, users won't accidentally commit their changes to config files to git.
| * Fixed some whitespace | Matei Zaharia | 2010-10-16 | 3 | -14/+14
| |
| * Added support for generic Hadoop InputFormats and refactored textFile to use this. Closes #12. | Matei Zaharia | 2010-10-16 | 2 | -28/+111
| |
| * Renamed HdfsFile to HadoopFile | Matei Zaharia | 2010-10-16 | 2 | -8/+9
| |
| * Simplified UnionRDD slightly and added a SparkContext.union method for efficiently union-ing a large number of RDDs | Matei Zaharia | 2010-10-16 | 2 | -28/+22
| |
| * Removed setSparkHome method on SparkContext in favor of having an optional constructor parameter, so that the scheduler is guaranteed that a Spark home has been set when it first builds its executor arg. | Matei Zaharia | 2010-10-16 | 2 | -16/+7
| |
| * Added the ability to specify a list of JAR files when creating a SparkContext and have the master node serve those to workers. | Matei Zaharia | 2010-10-16 | 7 | -117/+247
| |
| * Set absolute path for SPARK_HOME | Matei Zaharia | 2010-10-16 | 2 | -2/+2
| |
| * Keep track of tasks in each job so that they can be removed when the job exits | Matei Zaharia | 2010-10-16 | 1 | -6/+12
| |
| * Further clarified some code | Matei Zaharia | 2010-10-16 | 2 | -10/+22
| |
| * Fixed some log messages | Matei Zaharia | 2010-10-16 | 1 | -2/+2
| |
| * Bug fixes and improvements for MesosScheduler and SimpleJob | Matei Zaharia | 2010-10-16 | 3 | -25/+46
| |
| * Moved Spark home detection to SparkContext and added a setSparkHome method for setting it programmatically. | Matei Zaharia | 2010-10-16 | 2 | -51/+81
| |
| * Bug fix in passing env vars to executors | Matei Zaharia | 2010-10-16 | 1 | -1/+1
| |
| * Added code so that Spark jobs can be launched from outside the Spark directory by setting SPARK_HOME and locating the executor relative to that. | Matei Zaharia | 2010-10-15 | 2 | -6/+39
| |   Entries on SPARK_CLASSPATH and SPARK_LIBRARY_PATH are also passed along to worker nodes.
| * Moved ClassServer out of repl package and renamed it to HttpServer. | Matei Zaharia | 2010-10-15 | 2 | -12/+12
| |
| * Increased default memory for alltests | Matei Zaharia | 2010-10-15 | 1 | -0/+3
| |
| * Abort jobs if a task fails more than a limited number of times | Matei Zaharia | 2010-10-15 | 3 | -23/+44
| |
| * Updated guava to version r07 | Matei Zaharia | 2010-10-15 | 6 | -2/+2
| |
| * A couple of improvements to ReplSuite: | Matei Zaharia | 2010-10-15 | 1 | -26/+30
| |   - Use collect instead of toArray
| |   - Disable the "running on Mesos" test when MESOS_HOME is not set
| * Made locality scheduling constant-time and added support for changing CPU and memory requested per task. | Matei Zaharia | 2010-10-15 | 1 | -24/+79
| |
| * Moved Job and SimpleJob to new files | Matei Zaharia | 2010-10-07 | 3 | -183/+206
| |
| * Merge branch 'master' into matei-scheduling | Matei Zaharia | 2010-10-07 | 4 | -11/+23
| |\