Commit message  Author  Age  Files  Lines
* Merge branch 'mesos-streaming' into streaming  Tathagata Das  2013-02-19  0  -0/+0
|\
| * Merge pull request #455 from tdas/streaming  Tathagata Das  2013-02-07  183  -2473/+4256
| |\
| | |     Merging latest master branch changes to the streaming branch
* | \ Merge branch 'streaming' into ScrapCodes-streaming-actor  Tathagata Das  2013-02-19  209  -3356/+5876
|\ \ \
| | | |     Conflicts:
| | | |         docs/plugin-custom-receiver.md
| | | |         streaming/src/main/scala/spark/streaming/StreamingContext.scala
| | | |         streaming/src/main/scala/spark/streaming/dstream/KafkaInputDStream.scala
| | | |         streaming/src/main/scala/spark/streaming/dstream/PluggableInputDStream.scala
| | | |         streaming/src/main/scala/spark/streaming/receivers/ActorReceiver.scala
| | | |         streaming/src/test/scala/spark/streaming/InputStreamsSuite.scala
| * | | Changed networkStream to socketStream and pluggableNetworkStream to become ↵  Tathagata Das  2013-02-18  9  -35/+36
| | | |     networkStream as a way to create streams from arbitrary network receiver.
| * | | Merge branch 'streaming' into ScrapCode-streaming  Tathagata Das  2013-02-18  242  -3438/+9569
| |\ \ \
| | | | |     Conflicts:
| | | | |         streaming/src/main/scala/spark/streaming/dstream/KafkaInputDStream.scala
| | | | |         streaming/src/main/scala/spark/streaming/dstream/NetworkInputDStream.scala
| | * | | Added checkpointing and fault-tolerance semantics to the programming guide. ↵  Tathagata Das  2013-02-18  8  -59/+206
| | | | |     Fixed default checkpoint interval to being a multiple of slide duration. Fixed visibility of some classes and objects to clean up docs.
| | * | | Many changes to ensure better 2nd recovery if 2nd failure happens while  Tathagata Das  2013-02-17  18  -97/+208
| | | | |     recovering from 1st failure
| | | | |     - Made the scheduler checkpoint after clearing old metadata, which ensures that a new checkpoint is written as soon as at least one batch gets computed while recovering from a failure. This ensures that if there is a 2nd failure while recovering from 1st failure, the system starts 2nd recovery from a newer checkpoint.
| | | | |     - Modified Checkpoint writer to write checkpoint in a different thread.
| | | | |     - Added a check to make sure that compute for InputDStreams gets called only for strictly increasing times.
| | | | |     - Changed implementation of slice to call getOrCompute on parent DStream in time-increasing order.
| | | | |     - Added testcase to test slice.
| | | | |     - Fixed testGroupByKeyAndWindow testcase in JavaAPISuite to verify results with expected output in an order-independent manner.
| | * | | Made MasterFailureTest more robust.  Tathagata Das  2013-02-15  1  -4/+22
| | | | |
| | * | | Moved Java streaming examples to examples/src/main/java/spark/streaming/... ↵  Tathagata Das  2013-02-14  4  -1/+1
| | | | |     and fixed logging in NetworkInputTracker to highlight errors when receiver deregisters/shuts down.
| | * | | Added TwitterInputDStream from example to StreamingContext. Renamed example ↵  Tathagata Das  2013-02-14  4  -45/+53
| | | | |     TwitterBasic to TwitterPopularTags.
| | * | | Removed countByKeyAndWindow on paired DStreams, and added ↵  Tathagata Das  2013-02-14  9  -189/+231
| | | | |     countByValueAndWindow for all DStreams. Updated both scala and java API and testsuites.
| | * | | Changes functions comments to make them more consistent.  Tathagata Das  2013-02-13  2  -45/+45
| | | | |
| | * | | Added filter functionality to reduceByKeyAndWindow with inverse. ↵  Tathagata Das  2013-02-13  7  -81/+102
| | | | |     Consolidated reduceByKeyAndWindow's many functions into smaller number of functions with optional parameters.
| | * | | Changed scheduler and file input stream to fix bugs in the driver fault ↵  Tathagata Das  2013-02-13  18  -452/+693
| | | | |     tolerance. Added MasterFailureTest to rigorously test master fault tolerance with file input stream.
| | * | | Fixed bugs in FileInputDStream and Scheduler that occasionally failed to ↵  Tathagata Das  2013-02-10  6  -91/+221
| | | | |     reprocess old files after recovering from master failure. Completely modified spark.streaming.FailureTest to test multiple master failures using file input stream.
| | * | | Fixed bug in CheckpointRDD to prevent exception when the original RDD had ↵  Tathagata Das  2013-02-10  2  -2/+12
| | | | |     zero splits.
| | * | | Added an initial spark job to ensure worker nodes are initialized.  Tathagata Das  2013-02-09  2  -2/+7
| | | |/
| | |/|
| | * | Merge branch 'mesos-master' into streaming  Tathagata Das  2013-02-07  177  -2220/+3876
| | |\ \
| | | * \ Merge pull request #450 from stephenh/inlinemergepair  Matei Zaharia  2013-02-05  1  -6/+4
| | | |\ \
| | | | | |     Inline mergePair to look more like the narrow dep branch.
| | | | * | Inline mergePair to look more like the narrow dep branch.  Stephen Haberman  2013-02-05  1  -6/+4
| | | | | |     No functionality changes, I think this is just more consistent given mergePair isn't called multiple times/recursive.
| | | | | |     Also added a comment to explain the usual case of having two parent RDDs.
| | | * | | Merge pull request #451 from stephenh/fixdeathpactexception  Matei Zaharia  2013-02-05  2  -19/+13
| | | |\ \ \
| | | | | | |     Handle Terminated to avoid endless DeathPactExceptions.
| | | | * \ \ Merge branch 'master' into fixdeathpactexception  Stephen Haberman  2013-02-05  30  -258/+991
| | | | |\ \ \
| | | | |/ / /
| | | |/| | |
| | | | | | |     Conflicts:
| | | | | | |         core/src/main/scala/spark/deploy/worker/Worker.scala
| | | * | | | Merge pull request #449 from stephenh/longerdriversuite  Matei Zaharia  2013-02-05  1  -1/+2
| | | |\ \ \ \
| | | | | | | |     Increase DriverSuite timeout.
| | | | * | | | Increase DriverSuite timeout.  Stephen Haberman  2013-02-05  1  -1/+2
| | | | | |/ /
| | | | |/| |
| | | * | | | Merge pull request #447 from pwendell/streaming-constructor  Matei Zaharia  2013-02-05  1  -0/+8
| | | |\ \ \ \
| | | | | | | |     Streaming constructor which takes JavaSparkContext
| | | | * | | | Streaming constructor which takes JavaSparkContext  Patrick Wendell  2013-02-05  1  -0/+8
| | | |/ / / /
| | | | | | |     It's sometimes helpful to directly pass a JavaSparkContext, and take advantage of the various constructors available for that.
| | | * | | | Small fix to test for distinct  Matei Zaharia  2013-02-04  1  -1/+1
| | | | | | |
| | | * | | | Fix failing test  Matei Zaharia  2013-02-04  1  -2/+1
| | | | | | |
| | | * | | | Merge pull request #445 from JoshRosen/pyspark_fixes  Matei Zaharia  2013-02-03  5  -22/+19
| | | |\ \ \ \
| | | | | | | |     Fix exit status in PySpark unit tests; fix/optimize PySpark's RDD.take()
| | | | * | | | Remove unnecessary doctest __main__ methods.  Josh Rosen  2013-02-03  2  -18/+0
| | | | | | | |
| | | | * | | | Fetch fewer objects in PySpark's take() method.  Josh Rosen  2013-02-03  2  -2/+13
| | | | | | | |
| | | | * | | | Fix reporting of PySpark doctest failures.  Josh Rosen  2013-02-03  2  -2/+6
| | | | |/ / /
| | | * | | | Merge pull request #379 from stephenh/sparkmem  Matei Zaharia  2013-02-02  6  -35/+17
| | | |\ \ \ \
| | | | | | | |     Add spark.executor.memory to differentiate executor memory from spark-shell
| | | | * | | | Fix dangling old variable names.  Stephen Haberman  2013-02-02  1  -2/+2
| | | | | | | |
| | | | * | | | Move executorMemory up into SchedulerBackend.  Stephen Haberman  2013-02-02  4  -29/+12
| | | | | | | |
| | | | * | | | Merge branch 'master' into sparkmem  Stephen Haberman  2013-02-02  254  -2124/+13947
| | | | |\| | |
| | | | * | | | Fix SPARK_MEM in ExecutorRunner.  Stephen Haberman  2013-01-22  1  -1/+1
| | | | | | | |
| | | | * | | | Restore SPARK_MEM in executorEnvs.  Stephen Haberman  2013-01-22  1  -2/+3
| | | | | | | |
| | | | * | | | Add spark.executor.memory to differentiate executor memory from spark-shell ↵  Stephen Haberman  2013-01-15  3  -10/+8
| | | | | | | |     memory.
| | | * | | | | Merge pull request #422 from squito/blockmanager_info  Matei Zaharia  2013-02-02  6  -38/+59
| | | |\ \ \ \ \
| | | | | | | | |     RDDInfo available from SparkContext
| | | | * | | | | remove unneeded (and unused) filter on block info  Imran Rashid  2013-02-01  1  -2/+0
| | | | | | | | |
| | | | * | | | | track total partitions, in addition to cached partitions; use scala string ↵  Imran Rashid  2013-02-01  3  -9/+13
| | | | | | | | |     formatting
| | | | * | | | | fixup merge (master -> driver renaming)  Imran Rashid  2013-02-01  1  -1/+1
| | | | | | | | |
| | | | * | | | | Merge branch 'master' into blockmanager_info  Imran Rashid  2013-01-30  43  -246/+291
| | | | |\ \ \ \ \
| | | | | | | | | |     Conflicts:
| | | | | | | | | |         core/src/main/scala/spark/storage/BlockManagerMaster.scala
| | | | * | | | | | rename Slaves --> Executor  Imran Rashid  2013-01-30  2  -5/+5
| | | | | | | | | |
| | | | * | | | | | Merge branch 'master' into blockmanager_info  Imran Rashid  2013-01-29  23  -192/+207
| | | | |\ \ \ \ \ \
| | | | * | | | | | | better formatting for RDDInfo  Imran Rashid  2013-01-28  1  -3/+9
| | | | | | | | | | |
| | | | * | | | | | | expose RDD & storage info directly via SparkContext  Imran Rashid  2013-01-28  4  -28/+41
| | | | | | | | | | |
| | | * | | | | | | | Merge pull request #436 from stephenh/removeextraloop  Matei Zaharia  2013-02-02  1  -13/+10
| | | |\ \ \ \ \ \ \ \
| | | | | | | | | | | |     Once we find a split with no block, we don't have to look for more.
| | | | * | | | | | | | Further simplify checking for Nil.  Stephen Haberman  2013-02-02  1  -3/+1
| | | | | | | | | | | |