path: root/core
Commit message [Author, Date, Files, Lines]
* Renamed 'priority' to 'jobId' and assorted minor changes [Mark Hamstra, 2013-08-20, 5 files, -59/+60]
|
* Merge pull request #828 from mateiz/sched-improvements [Matei Zaharia, 2013-08-19, 41 files, -965/+1034]
|\    Scheduler fixes and improvements
| * Added unit tests for ClusterTaskSetManager, and fixed a bug found with resetting locality level after a non-local launch [Matei Zaharia, 2013-08-18, 11 files, -28/+396]
| |
| * Added some comments on threading in scheduler code [Matei Zaharia, 2013-08-18, 3 files, -6/+35]
| |
| * Address some review comments: [Matei Zaharia, 2013-08-18, 6 files, -21/+40]
| |   - When a resourceOffers() call has multiple offers, force the TaskSets to consider them in increasing order of locality levels so that they get a chance to launch tasks locally across all offers
| |   - Simplify ClusterScheduler.prioritizeContainers
| |   - Add docs on the new configuration options
| * Comment cleanup (via Kay) and some debug messages [Matei Zaharia, 2013-08-18, 4 files, -23/+16]
| |
| * More scheduling fixes: [Matei Zaharia, 2013-08-18, 11 files, -190/+117]
| |   - Added periodic revival of offers in StandaloneSchedulerBackend
| |   - Replaced task scheduling aggression with multi-level delay scheduling in ClusterTaskSetManager
| |   - Fixed ZippedRDD preferred locations because they can't currently be process-local
| |   - Fixed some uses of hostPort
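The multi-level delay scheduling mentioned above can be illustrated with a small Python sketch. This is not the ClusterTaskSetManager code from the commit: the level names, the `allowed_level` helper, and the wait values are all assumptions chosen for illustration. The idea shown is only that a task set starts at the most local level and falls back one level each time it has waited long enough without launching a task.

```python
from typing import List

# Locality levels ordered from most to least local (illustrative names).
LEVELS = ["PROCESS_LOCAL", "NODE_LOCAL", "RACK_LOCAL", "ANY"]


def allowed_level(idle_ms: int, waits_ms: List[int]) -> str:
    """Return the least-local level permitted after idle_ms without a launch.

    waits_ms[i] is how long to wait at LEVELS[i] before falling back to
    LEVELS[i + 1]; it is assumed to have len(LEVELS) - 1 entries.
    """
    level = 0
    remaining = idle_ms
    # Consume the wait budget level by level until it runs out.
    while level < len(waits_ms) and remaining >= waits_ms[level]:
        remaining -= waits_ms[level]
        level += 1
    return LEVELS[level]


# Hypothetical configuration: wait 3 seconds at each level.
waits = [3000, 3000, 3000]
fresh = allowed_level(0, waits)
after_one_wait = allowed_level(3000, waits)
after_long_wait = allowed_level(100000, waits)
```

With these assumed waits, a task set that just launched a task may only use process-local offers, while one that has been idle for a long time accepts any offer.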
| * Initial work towards scheduler refactoring: [Matei Zaharia, 2013-08-18, 27 files, -751/+484]
| |   - Replace use of hostPort vs host in Task.preferredLocations with a TaskLocation class that contains either an executorId and a host or just a host. This is part of a bigger effort to eliminate hostPort-based data structures and just use executorId, since the hostPort vs host distinction is confusing (and not checkable with static typing, leading to ugly debug code), and hostPorts are not provided by Mesos.
| |   - Replaced most hostPort-based data structures and fields as above.
| |   - Simplified ClusterTaskSetManager to deal with preferred locations in a more concise way and generally be more concise.
| |   - Updated the way ClusterTaskSetManager handles racks: instead of enqueueing a task to a separate queue for all the hosts in the rack, which would create lots of large queues, have one queue per rack name.
| |   - Removed non-local fallback logic in ClusterScheduler that tried to launch less-local tasks on a node once the local ones were all assigned. This change didn't work because many cluster schedulers send offers for just one node at a time (even the standalone and YARN ones do so as nodes join the cluster one by one). Thus, lots of non-local tasks would be assigned even though a node with locality for them would be able to receive tasks just a short time later.
| |   - Renamed MapOutputTracker "generations" to "epochs".
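The TaskLocation idea described above (either an executorId plus a host, or just a host) can be modelled as a single small value type. The following Python sketch is an assumption-laden illustration, not the Scala class from the commit: the field names and the `is_executor_specific` property are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TaskLocation:
    """A preferred location: always a host, optionally a specific executor."""

    host: str
    # None means "any executor on this host" (hypothetical convention).
    executor_id: Optional[str] = None

    @property
    def is_executor_specific(self) -> bool:
        # True only when the location pins the task to one executor.
        return self.executor_id is not None


host_only = TaskLocation("host1")
pinned = TaskLocation("host1", "exec-7")
```

Carrying both cases in one type means callers no longer have to guess whether a string is a host or a host:port pair, which is the static-typing concern the commit message raises.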
* | Merge pull request #849 from mateiz/web-fixes [Matei Zaharia, 2013-08-19, 2 files, -8/+9]
|\ \    Small fixes to web UI
| * | Allow some wiggle room in UISuite port test and in EC2 ports [Matei Zaharia, 2013-08-19, 1 file, -2/+3]
| | |
| * | Small fixes to web UI: [Matei Zaharia, 2013-08-19, 2 files, -6/+6]
| |/    - Use SPARK_PUBLIC_DNS environment variable if set (for EC2)
| |     - Use a non-ephemeral port (3030 instead of 33000) by default
| |     - Updated test to use non-ephemeral port too
* | Merge pull request #847 from rxin/rdd [Matei Zaharia, 2013-08-19, 21 files, -189/+349]
|\ \
| |/
|/|   Allow subclasses of Product2 in all key-value related classes
| * Code review feedback (added tests for cogroup and subtract; added more documentation on MutablePair) [Reynold Xin, 2013-08-19, 3 files, -11/+51]
| |
| * Added a test for sorting using MutablePairs. [Reynold Xin, 2013-08-19, 1 file, -2/+18]
| |
| * Made PairRDDFunctions take only Tuple2, but made the rest of the shuffle code path work with general Product2. [Reynold Xin, 2013-08-19, 19 files, -91/+132]
| |
| * Added the missing RDD files and cleaned up SparkContext. [Reynold Xin, 2013-08-18, 4 files, -12/+126]
| |
| * Allow subclasses of Product2 in all key-value related classes (ShuffleDependency, PairRDDFunctions, etc). [Reynold Xin, 2013-08-18, 10 files, -107/+56]
| |
* | Merge pull request #840 from AndreSchumacher/zipegg [Matei Zaharia, 2013-08-18, 1 file, -1/+8]
|\ \
| |/
|/|   Implementing SPARK-878 for PySpark: adding zip and egg files to context ...
| * Implementing SPARK-878 for PySpark: adding zip and egg files to context and passing it down to workers which add these to their sys.path [Andre Schumacher, 2013-08-16, 1 file, -1/+8]
| |
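The mechanism that SPARK-878 relies on is standard Python behaviour: an archive placed on `sys.path` becomes importable. The sketch below shows only that mechanism, not the PySpark change itself. The helper `add_archive_to_path`, the archive name `deps.zip`, and the module `shipped_module` are hypothetical names made up for the example.

```python
import os
import sys
import tempfile
import zipfile


def add_archive_to_path(archive_path: str) -> None:
    """Make modules inside a .zip (or .egg) archive importable."""
    if archive_path not in sys.path:
        sys.path.insert(0, archive_path)


# Build a tiny archive to stand in for a dependency shipped to a worker.
tmp_dir = tempfile.mkdtemp()
archive = os.path.join(tmp_dir, "deps.zip")
with zipfile.ZipFile(archive, "w") as zf:
    zf.writestr("shipped_module.py", "ANSWER = 42\n")

# What a worker would do after receiving the archive.
add_archive_to_path(archive)
import shipped_module  # resolved from inside deps.zip via zipimport
```

Eggs work the same way here because they are zip archives; how the archive reaches each worker is outside this sketch.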
* | Moved shuffle serializer setting from a constructor parameter to a setSerializer method in various RDDs that involve shuffle operations. [Reynold Xin, 2013-08-17, 5 files, -32/+51]
| |
* | Removed the mapSideCombine option in partitionBy. [Reynold Xin, 2013-08-17, 2 files, -28/+6]
| |
* | Removed the mapSideCombine option in CoGroupedRDD. [Reynold Xin, 2013-08-17, 1 file, -33/+5]
| |
* | Removed the unused shuffleId in ShuffleDependency's constructor. [Reynold Xin, 2013-08-16, 1 file, -1/+0]
| |
* | Merge pull request #839 from jegonzal/zip_partitions [Matei Zaharia, 2013-08-16, 4 files, -17/+14]
|\ \    Currying RDD.zipPartitions
| * | Reversing the argument order in zipPartitions to enable stronger type inference. [Joseph E. Gonzalez, 2013-08-16, 4 files, -17/+14]
| | |
* | | Use the JSON formatter from Scala library and removed dependency on lift-json. [Reynold Xin, 2013-08-15, 6 files, -70/+64]
| | |   This made the JSON creation slightly more complicated, but removes one external dependency. The Scala library also properly escapes "/" (which lift-json doesn't).
* | | Revert "Merge pull request #834 from Daemoen/master" [Reynold Xin, 2013-08-15, 1 file, -2/+1]
| | |   This reverts commit 230ab2722ebd399afcf64c1a131f4929f602177d, reversing changes made to 659553b21ddd7504889ce113a816c1db4a73f167.
* | | Merge pull request #834 from Daemoen/master [Reynold Xin, 2013-08-15, 1 file, -1/+2]
|\ \ \
| |_|/
|/| |   Updated json output to allow for display of worker state
| * | Updated json output to allow for display of worker state [Daemoen, 2013-08-15, 1 file, -1/+2]
| | |   Ops teams need to ensure that the cluster is functional and performant. Having to scrape the HTML source for worker state won't work reliably, and will be slow. By exposing the state in the JSON output, ops teams are able to ensure a fully functional environment by querying for the JSON output and parsing for dead nodes.
* | | Merge pull request #836 from pwendell/rename [Patrick Wendell, 2013-08-15, 19 files, -64/+64]
|\ \ \
| |_|/
|/| |   Rename `memoryBytesToString` and `memoryMegabytesToString`
| * | Rename `memoryBytesToString` and `memoryMegabytesToString` [Patrick Wendell, 2013-08-15, 19 files, -64/+64]
| | |   These are used all over the place now and they are not specific to memory at all.
| | |   - memoryBytesToString --> bytesToString
| | |   - memoryMegabytesToString --> megabytesToString
* | | More minor UI changes including code review feedback. [Reynold Xin, 2013-08-15, 6 files, -16/+39]
| | |
* | | Various UI improvements. [Reynold Xin, 2013-08-14, 12 files, -88/+83]
| |/
|/|
* | Renamed setCurrentJobDescription to setJobDescription. [Reynold Xin, 2013-08-14, 1 file, -1/+1]
| |
* | A few small scheduler / job description changes. [Reynold Xin, 2013-08-14, 4 files, -70/+74]
|/    1. Renamed SparkContext.addLocalProperty to setLocalProperty, and allowed this function to unset a property.
|     2. Renamed SparkContext.setDescription to setCurrentJobDescription.
|     3. Throw an exception if the fair scheduler allocation file is invalid.
* Merge pull request #822 from pwendell/ui-features [Matei Zaharia, 2013-08-14, 6 files, -27/+54]
|\    Adding GC Stats to TaskMetrics (and three small fixes)
| * Style cleanup based on Matei feedback [Patrick Wendell, 2013-08-14, 3 files, -5/+4]
| |
| * Small style clean-up [Patrick Wendell, 2013-08-13, 2 files, -2/+2]
| |
| * Correcting terminology in RDD page [Patrick Wendell, 2013-08-13, 1 file, -1/+1]
| |
| * Correct sorting order for stages [Patrick Wendell, 2013-08-13, 2 files, -10/+6]
| |
| * Capturing GC details in TaskMetrics [Patrick Wendell, 2013-08-13, 4 files, -10/+37]
| |
| * Bug fix for display of shuffle read/write metrics. [Patrick Wendell, 2013-08-13, 1 file, -6/+11]
| |   This fixes an error where empty cells are missing if a given task has no shuffle read/write.
* | Fixed 2 bugs in executor UI. [Kay Ousterhout, 2013-08-13, 1 file, -12/+10]
| |   1) UI crashed if the executor UI was loaded before any tasks started.
| |   2) The total tasks was incorrectly reported due to using string (rather than int) arithmetic.
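The second bug is a common one wherever counters arrive as strings: `+` concatenates instead of adding. The commit does not show the offending code, so the Python sketch below only reproduces the general failure mode with made-up counter names and values.

```python
# Hypothetical per-executor task counters that arrived as strings.
task_counts = {"active": "3", "failed": "1", "completed": "12"}

# String arithmetic: "+" concatenates, so the "total" is nonsense.
wrong_total = (
    task_counts["active"] + task_counts["failed"] + task_counts["completed"]
)

# Integer arithmetic: convert first, then add.
right_total = sum(int(value) for value in task_counts.values())

print(wrong_total)  # 3112
print(right_total)  # 16
```

Converting at the point where the values are read, rather than where they are summed, keeps later arithmetic from repeating the mistake.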
* | Merge pull request #821 from pwendell/print-launch-command [Matei Zaharia, 2013-08-13, 1 file, -1/+1]
|\ \    Print run command to stderr rather than stdout
| * | Print run command to stderr rather than stdout [Patrick Wendell, 2013-08-13, 1 file, -1/+1]
| | |
* | | Reuse the set of failed states rather than creating a new object each time [Kay Ousterhout, 2013-08-13, 1 file, -1/+3]
| | |
* | | Properly account for killed tasks. [Kay Ousterhout, 2013-08-13, 1 file, -1/+1]
| |/
|/|
| |   The TaskState class's isFinished() method didn't return true for KILLED tasks, which means some resources are never reclaimed for tasks that are killed. This also made it inconsistent with the isFinished() method used by CoarseMesosSchedulerBackend.
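The fix described above amounts to treating KILLED as a terminal state, and the preceding commit's idea of reusing one set of states fits the same pattern. The Python sketch below illustrates both under assumptions: the state names other than KILLED, the `FINISHED_STATES` constant, and the `is_finished` helper are hypothetical and are not taken from Spark's TaskState.

```python
from enum import Enum


class TaskState(Enum):
    # Only KILLED is named in the commit; the other members are assumed.
    LAUNCHING = "LAUNCHING"
    RUNNING = "RUNNING"
    FINISHED = "FINISHED"
    FAILED = "FAILED"
    KILLED = "KILLED"
    LOST = "LOST"


# Built once and reused, instead of allocating a new set on every call.
FINISHED_STATES = frozenset(
    {TaskState.FINISHED, TaskState.FAILED, TaskState.KILLED, TaskState.LOST}
)


def is_finished(state: TaskState) -> bool:
    """Return True for every terminal state, including KILLED."""
    return state in FINISHED_STATES


killed_is_finished = is_finished(TaskState.KILLED)
running_is_finished = is_finished(TaskState.RUNNING)
```

If KILLED were missing from the set, any cleanup keyed on `is_finished` would never run for killed tasks, which matches the resource leak the commit message reports.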
* | Slight change to pr-784 [Patrick Wendell, 2013-08-13, 5 files, -9/+10]
| |
* | Merge pull request #784 from jerryshao/dev-metrics-servlet [Patrick Wendell, 2013-08-13, 14 files, -35/+157]
|\ \    Add MetricsServlet for Spark metrics system
| * | MetricsServlet code refactor according to comments [jerryshao, 2013-08-12, 11 files, -43/+35]
| | |