path: root/core/src
Commit message (Author, Date, Files changed, Lines -/+)
* Refactored RDD checkpointing to minimize extra fields in the RDD class. (Tathagata Das, 2012-12-04, 11 files, -191/+140)
|
* Added metadata cleaner to HttpBroadcast to clean up old broadcast files. (Tathagata Das, 2012-12-03, 1 file, -0/+24)
|
* Made RDD checkpoint not create a new thread. Fixed bug in detecting when spark.cleaner.delay is insufficient. (Tathagata Das, 2012-12-02, 2 files, -22/+12)
|
* Minor modifications. (Tathagata Das, 2012-12-01, 1 file, -1/+6)
|
* Added TimeStampedHashSet and used that to clean up the list of registered RDD IDs in CacheTracker. (Tathagata Das, 2012-11-29, 3 files, -9/+81)
|
* Added metadata cleaner to BlockManager to remove old blocks completely. (Tathagata Das, 2012-11-28, 2 files, -12/+36)
|
* Renamed CleanupTask to MetadataCleaner. (Tathagata Das, 2012-11-28, 5 files, -14/+15)
|
* Modified StorageLevel and BlockManagerId to cache common objects and use the cached objects while deserializing. (Tathagata Das, 2012-11-28, 4 files, -29/+101)
|
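The deserialization caching described in this commit is typically done with Java serialization's readResolve hook, which swaps a freshly deserialized instance for a canonical cached one. A minimal sketch under that assumption; CachedLevel and getCached are illustrative stand-ins, not Spark's actual StorageLevel code:

```scala
import java.util.concurrent.ConcurrentHashMap

// A simplified value class standing in for StorageLevel; purely illustrative.
class CachedLevel(val useDisk: Boolean, val useMemory: Boolean) extends Serializable {
  override def hashCode: Int = (if (useDisk) 2 else 0) + (if (useMemory) 1 else 0)
  override def equals(other: Any): Boolean = other match {
    case o: CachedLevel => o.useDisk == useDisk && o.useMemory == useMemory
    case _              => false
  }
  // Invoked by Java serialization after an instance is deserialized:
  // return the canonical cached object so equal levels share one instance.
  private def readResolve(): AnyRef = CachedLevel.getCached(this)
}

object CachedLevel {
  private val cache = new ConcurrentHashMap[CachedLevel, CachedLevel]()
  // Register-or-fetch: the first caller's instance becomes the canonical one.
  def getCached(level: CachedLevel): CachedLevel = {
    val old = cache.putIfAbsent(level, level)
    if (old == null) level else old
  }
}
```

This keeps millions of deserialized task results from each carrying their own copy of an identical small object.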
* Bug fixes. (Tathagata Das, 2012-11-28, 2 files, -9/+19)
|
* Modified a bunch of HashMaps in Spark to use TimeStampedHashMap, and made various modules use CleanupTask to periodically clean up metadata. (Tathagata Das, 2012-11-27, 6 files, -14/+156)
|
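A TimeStampedHashMap of the kind this commit introduces pairs each value with its insertion time, so a periodic cleaner can drop entries older than a threshold. A minimal sketch; the method names are illustrative, and Spark's real class implements the full mutable.Map API on top of this idea:

```scala
import java.util.concurrent.ConcurrentHashMap

// Each put records a timestamp; clearOldValues drops entries inserted
// before the given threshold time (milliseconds since the epoch).
class TimeStampedHashMap[K, V] {
  private val internal = new ConcurrentHashMap[K, (V, Long)]()

  def put(key: K, value: V): Unit =
    internal.put(key, (value, System.currentTimeMillis))

  def get(key: K): Option[V] =
    Option(internal.get(key)).map(_._1)

  // Remove every entry whose insertion time is older than threshTime.
  def clearOldValues(threshTime: Long): Unit = {
    val it = internal.entrySet.iterator
    while (it.hasNext) {
      val entry = it.next()
      if (entry.getValue._2 < threshTime) it.remove()
    }
  }
}
```

A MetadataCleaner-style background task would call clearOldValues on a timer with a threshold derived from something like spark.cleaner.delay.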
* Merged branch mesos/master into branch dev. (Tathagata Das, 2012-11-26, 27 files, -76/+313)
|\
| * Merge pull request #304 from mbautin/configurable_local_ip (Matei Zaharia, 2012-11-19, 1 file, -1/+7)
| |\      SPARK-624: make the default local IP customizable
| | * Addressing Matei's comment: SPARK_LOCAL_IP environment variable (mbautin, 2012-11-19, 1 file, -1/+1)
| | |
| | * SPARK-624: make the default local IP customizable (mbautin, 2012-11-15, 1 file, -1/+7)
| | |
| * | Set default uncaught exception handler to exit. (Charles Reiss, 2012-11-16, 2 files, -1/+15)
| |/      Among other things, this should prevent OutOfMemoryErrors in some daemon threads (such as the network manager) from causing a Spark executor to enter a state where it cannot make progress but does not report an error.
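The fix above uses the JVM-wide default uncaught exception handler, which fires when an exception escapes any thread's run(). A minimal sketch; in the real fix the handler would log and call System.exit so the executor dies loudly, but this sketch only records the failure so it can be observed without ending the process:

```scala
import java.util.concurrent.CountDownLatch

// Installs a process-wide handler for exceptions that no thread caught.
// UncaughtDemo and its fields are illustrative names for this sketch.
object UncaughtDemo {
  @volatile var lastError: Option[Throwable] = None
  val seen = new CountDownLatch(1)

  def install(): Unit = {
    Thread.setDefaultUncaughtExceptionHandler(new Thread.UncaughtExceptionHandler {
      def uncaughtException(t: Thread, e: Throwable): Unit = {
        lastError = Some(e) // a real executor would log here, then System.exit(1)
        seen.countDown()
      }
    })
  }
}
```

Without such a handler, an OutOfMemoryError in a daemon thread silently kills only that thread, leaving the executor alive but unable to make progress.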
| * Use DNS names instead of IP addresses in standalone mode, to allow matching with data locality hints from storage systems. (Matei Zaharia, 2012-11-15, 2 files, -4/+4)
| |
| * Detect correctly when one has disconnected from a standalone cluster. (Matei Zaharia, 2012-11-11, 1 file, -1/+13)
| |     SPARK-617 #resolve
| * Fix K-means example a little (root, 2012-11-10, 1 file, -1/+2)
| |
| * Incorporated Matei's suggestions. Tested with 5 producer (consumer) threads each doing 50k puts (gets); took 15 minutes to run, no errors or deadlocks. (Tathagata Das, 2012-11-09, 2 files, -4/+18)
| |
| * Fixed deadlock in BlockManager. (Tathagata Das, 2012-11-09, 3 files, -87/+180)
| |     1. Changed the lock structure of BlockManager by replacing the 337 coarse-grained locks with BlockInfo objects used as per-block fine-grained locks.
| |     2. Changed the MemoryStore lock structure by making the block-putting threads lock on a different object (not the memory store), making sure putting threads minimally block the getting threads.
| |     3. Added spark.storage.ThreadingTest to stress test the BlockManager using 5 block producer and 5 block consumer threads.
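The per-block locking in point 1 can be sketched as a concurrent map from block id to a dedicated BlockInfo object that threads synchronize on, so operations on different blocks never contend for the same lock. The class and method names here are illustrative stand-ins; Spark's real BlockInfo also tracks the progress of an in-flight put:

```scala
import java.util.concurrent.ConcurrentHashMap

// One lock object per block: threads synchronize on the block's own
// BlockInfo instead of on a store-wide lock.
class BlockInfo {
  var ready: Boolean = false
}

class FineGrainedStore {
  private val infos = new ConcurrentHashMap[String, BlockInfo]()

  // Atomically get-or-create the lock object for a block id.
  private def infoFor(blockId: String): BlockInfo = {
    val fresh = new BlockInfo
    val old = infos.putIfAbsent(blockId, fresh)
    if (old == null) fresh else old
  }

  def putBlock(blockId: String): Unit = {
    val info = infoFor(blockId)
    info.synchronized {
      // ... write the block's bytes here ...
      info.ready = true
      info.notifyAll() // wake only the threads waiting on this block
    }
  }

  def isReady(blockId: String): Boolean = {
    val info = infoFor(blockId)
    info.synchronized { info.ready }
  }
}
```

A put on `rdd_0_0` blocks only readers of `rdd_0_0`; threads touching other blocks proceed untouched, which is what removes the deadlock pressure of one coarse lock.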
| * Added an option to spread out jobs in the standalone mode. (Matei Zaharia, 2012-11-08, 3 files, -18/+56)
| |
| * Fix for connections not being reused (from Josh Rosen) (Matei Zaharia, 2012-11-08, 1 file, -1/+2)
| |
| * fix bug in getting slave id out of mesos (Imran Rashid, 2012-11-08, 1 file, -1/+1)
| |
| * Various fixes to standalone mode and web UI: (Matei Zaharia, 2012-11-07, 13 files, -45/+110)
| |     - Don't report a job as finishing multiple times
| |     - Don't show state of workers as LOADING when they're running
| |     - Show start and finish times in web UI
| |     - Sort web UI tables by ID and time by default
| * Made Akka timeout and message frame size configurable, and upped the defaults (Matei Zaharia, 2012-11-06, 1 file, -2/+5)
| |
| * Remove unnecessary hash-map put in MemoryStore (Shivaram Venkataraman, 2012-11-01, 1 file, -3/+0)
| |
| * Don't throw an error in the block manager when a block is cached on the master due to a locally computed operation (root, 2012-10-26, 1 file, -0/+6)
| |     Conflicts:
| |         core/src/main/scala/spark/storage/BlockManagerMaster.scala
* | Fixed bug in the number of splits in RDD after checkpointing. Modified reduceByKeyAndWindow (naive) computation from window+reduceByKey to reduceByKey+window+reduceByKey. (Tathagata Das, 2012-11-19, 1 file, -1/+2)
| |
* | Fixed checkpointing bug in CoGroupedRDD. CoGroupSplits kept around the RDD splits of its parent RDDs, thus checkpointing its parents did not release the references to the parent splits. (Tathagata Das, 2012-11-17, 2 files, -4/+42)
| |
* | Optimized checkpoint writing by reusing the FileSystem object. Fixed bug in updating of checkpoint data in DStream where the checkpointed RDDs, upon recovery, were not recognized as checkpointed RDDs and therefore deleted from HDFS. Made InputStreamsSuite more robust to timing delays. (Tathagata Das, 2012-11-13, 1 file, -5/+1)
| |
* | Refactored BlockManagerMaster (not BlockManagerMasterActor) to simplify the code and fix a live lock problem in unlimited attempts to contact the master. Also added test cases in BlockManagerSuite for the BlockManagerMaster methods getPeers and getLocations. (Tathagata Das, 2012-11-11, 3 files, -198/+127)
| |
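The live lock fix amounts to replacing unlimited retries against an unreachable master with a bounded retry loop that eventually surfaces an error. A sketch of the pattern under that assumption; askMaster is a hypothetical request function, not Spark's API:

```scala
// Retry a request a bounded number of times, pausing between attempts.
// Returns the first successful result, or throws after maxAttempts,
// instead of spinning forever against a dead master.
def askWithRetries[T](maxAttempts: Int, waitMillis: Long)(askMaster: () => Option[T]): T = {
  var attempts = 0
  while (attempts < maxAttempts) {
    attempts += 1
    askMaster() match {
      case Some(result) => return result
      case None         => Thread.sleep(waitMillis) // brief pause before retrying
    }
  }
  throw new RuntimeException(s"Gave up after $maxAttempts attempts to contact the master")
}
```

Capping attempts turns an indefinite stall into a visible failure the caller can handle or report.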
* | Fixed deadlock in BlockManager. (Tathagata Das, 2012-11-09, 2 files, -89/+101)
| |
* | Fixed major bugs in checkpointing. (Tathagata Das, 2012-11-05, 1 file, -2/+4)
| |
* | Made checkpointing of the dstream graph work with checkpointing of RDDs. For streams requiring checkpointing of their RDDs, the default checkpoint interval is set to 10 seconds. (Tathagata Das, 2012-11-04, 2 files, -15/+30)
| |
* | Added 'synchronized' to RDD serialization to ensure checkpoint-related changes are reflected atomically in the task closure. Added tests to ensure that jobs running on an RDD on which checkpointing is in progress do not get wrong results. (Tathagata Das, 2012-10-31, 3 files, -4/+92)
| |
* | Added checkpointing support to all RDDs, along with CheckpointSuite to test checkpointing in them. (Tathagata Das, 2012-10-30, 22 files, -107/+352)
| |
* | Modified the RDD API to make dependencies a var (so they can be changed to a checkpointed Hadoop RDD), and made other references to parent RDDs go either through dependencies or through a weak reference (to allow finalizing when dependencies no longer refer to them). (Tathagata Das, 2012-10-29, 19 files, -107/+149)
| |
* | Merge remote-tracking branch 'public/master' into dev (Matei Zaharia, 2012-10-24, 193 files, -2097/+5043)
|\|     Conflicts:
| |         core/src/main/scala/spark/BlockStoreShuffleFetcher.scala
| |         core/src/main/scala/spark/KryoSerializer.scala
| |         core/src/main/scala/spark/MapOutputTracker.scala
| |         core/src/main/scala/spark/RDD.scala
| |         core/src/main/scala/spark/SparkContext.scala
| |         core/src/main/scala/spark/executor/Executor.scala
| |         core/src/main/scala/spark/network/Connection.scala
| |         core/src/main/scala/spark/network/ConnectionManagerTest.scala
| |         core/src/main/scala/spark/rdd/BlockRDD.scala
| |         core/src/main/scala/spark/rdd/NewHadoopRDD.scala
| |         core/src/main/scala/spark/scheduler/ShuffleMapTask.scala
| |         core/src/main/scala/spark/scheduler/cluster/StandaloneSchedulerBackend.scala
| |         core/src/main/scala/spark/storage/BlockManager.scala
| |         core/src/main/scala/spark/storage/BlockMessage.scala
| |         core/src/main/scala/spark/storage/BlockStore.scala
| |         core/src/main/scala/spark/storage/StorageLevel.scala
| |         core/src/main/scala/spark/util/AkkaUtils.scala
| |         project/SparkBuild.scala
| |         run
| * Strip leading mesos:// in URLs passed to Mesos (Matei Zaharia, 2012-10-24, 1 file, -2/+3)
| |
| * Merge pull request #281 from rxin/memreport (Matei Zaharia, 2012-10-23, 3 files, -71/+93)
| |\      Added a method to report slave memory status; force serialize accumulator update in local mode.
| | * Serialize accumulator updates in TaskResult for local mode. (Reynold Xin, 2012-10-15, 1 file, -4/+5)
| | |
| | * Added a method to report slave memory status. (Reynold Xin, 2012-10-14, 2 files, -67/+88)
| | |
| * | Merge remote-tracking branch 'JoshRosen/shuffle_refactoring' into dev (Matei Zaharia, 2012-10-23, 13 files, -250/+113)
| |\ \      Conflicts:
| | | |         core/src/main/scala/spark/Dependency.scala
| | | |         core/src/main/scala/spark/rdd/CoGroupedRDD.scala
| | | |         core/src/main/scala/spark/rdd/ShuffledRDD.scala
| | * | Remove map-side combining from ShuffleMapTask. (Josh Rosen, 2012-10-13, 8 files, -94/+37)
| | | |     This separation of concerns simplifies the ShuffleDependency and ShuffledRDD interfaces. Map-side combining can be performed in a mapPartitions() call prior to shuffling the RDD. I don't anticipate this having much of a performance impact: in both approaches, each tuple is hashed twice, once in the bucket partitioning and once in the combiner's hashtable. The same steps are being performed, but in a different order and through one extra Iterator.
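The mapPartitions()-based combining this commit message describes can be sketched over a plain Iterator: fold each partition's (key, value) pairs into a local hash table before shuffling, so each distinct key leaves the partition once instead of once per record. createCombiner and mergeValue mirror the shape of Spark's Aggregator, but the function itself is an illustrative sketch with no Spark dependency:

```scala
import scala.collection.mutable

// Per-partition pre-aggregation: the work a mapPartitions() call would do
// before the shuffle. Every tuple is hashed once here (the combiner's
// hashtable) and once later by the bucket partitioner, matching the
// commit's analysis of the cost.
def combineLocally[K, V, C](iter: Iterator[(K, V)],
                            createCombiner: V => C,
                            mergeValue: (C, V) => C): Iterator[(K, C)] = {
  val combiners = mutable.HashMap.empty[K, C]
  for ((k, v) <- iter) {
    combiners(k) = combiners.get(k) match {
      case Some(c) => mergeValue(c, v)
      case None    => createCombiner(v)
    }
  }
  combiners.iterator
}
```

With combining moved here, ShuffleMapTask can simply partition and write whatever tuples it receives, which is the simplification the commit is after.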
| | * | Remove mapSideCombine field from Aggregator. (Josh Rosen, 2012-10-13, 5 files, -22/+15)
| | | |     Instead, the presence or absence of a ShuffleDependency's aggregator will control whether map-side combining is performed.
| | * | Change ShuffleFetcher to return an Iterator. (Josh Rosen, 2012-10-13, 8 files, -167/+63)
| | | |
| | * | Add helper methods to Aggregator. (Josh Rosen, 2012-10-13, 1 file, -1/+32)
| | | |
| * | | Support for Hadoop 2 distributions such as cdh4 (Thomas Dudziak, 2012-10-18, 7 files, -20/+45)
| | |/
| |/|
| * | Made ShuffleDependency automatically find a shuffle ID for itself (Matei Zaharia, 2012-10-14, 3 files, -5/+6)
| | |
| * | Take executor environment vars as an argument to SparkContext (Matei Zaharia, 2012-10-13, 7 files, -79/+107)
| |/