aboutsummaryrefslogtreecommitdiff
path: root/examples/src/main
Commit message (Collapse)AuthorAgeFilesLines
* Updated java API docs for streaming, along with very minor changes in the ↵Tathagata Das2014-01-162-3/+2
| | | | code examples.
* Add missing header filesPatrick Wendell2014-01-141-0/+17
|
* Merge remote-tracking branch 'apache/master' into filestream-fixTathagata Das2014-01-131-0/+49
|\ | | | | | | | | Conflicts: streaming/src/main/scala/org/apache/spark/streaming/dstream/DStream.scala
| * Merge branch 'master' into graphxReynold Xin2014-01-1328-61/+271
| |\
| * | Add LiveJournalPageRank exampleAnkur Dave2014-01-131-0/+49
| | |
| * | Revert changes to examples/.../PageRankUtils.scalaAnkur Dave2014-01-091-3/+3
| | | | | | | | | | | | Reverts to 04d83fc37f9eef89c20331c85291a0a169f75e6d:examples/src/main/scala/org/apache/spark/examples/bagel/PageRankUtils.scala.
| * | Merge remote-tracking branch 'spark-upstream/master' into HEADAnkur Dave2014-01-0847-186/+255
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: README.md core/src/main/scala/org/apache/spark/util/collection/OpenHashMap.scala core/src/main/scala/org/apache/spark/util/collection/OpenHashSet.scala core/src/main/scala/org/apache/spark/util/collection/PrimitiveKeyOpenHashMap.scala pom.xml project/SparkBuild.scala repl/src/main/scala/org/apache/spark/repl/SparkILoop.scala
| * \ \ Merge branch 'master' of github.com:apache/incubator-sparkReynold Xin2013-11-256-16/+19
| |\ \ \
| * \ \ \ Merge branch 'master' of github.com:apache/incubator-spark into mergemergeReynold Xin2013-11-041-1/+2
| |\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: README.md core/src/main/scala/org/apache/spark/util/collection/OpenHashMap.scala core/src/main/scala/org/apache/spark/util/collection/OpenHashSet.scala core/src/main/scala/org/apache/spark/util/collection/PrimitiveKeyOpenHashMap.scala
| * \ \ \ \ Merge remote-tracking branch 'spark-upstream/master'Ankur Dave2013-10-305-17/+233
| |\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: project/SparkBuild.scala
| * \ \ \ \ \ Merge branch 'master' of https://github.com/apache/incubator-spark into ↵Joseph E. Gonzalez2013-10-181-4/+9
| |\ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | indexedrdd_graphx
| * \ \ \ \ \ \ merged with upstream changesJoseph E. Gonzalez2013-10-141-2/+0
| |\ \ \ \ \ \ \
| * \ \ \ \ \ \ \ Merging latest changes from spark main branchJoseph E. Gonzalez2013-09-1752-179/+1054
| |\ \ \ \ \ \ \ \
* | | | | | | | | | Removed StreamingContext.registerInputStream and registerOutputStream - they ↵Tathagata Das2014-01-131-1/+2
| |_|_|_|_|_|_|_|/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | were useless as InputDStream has been made to register itself. Also made DStream.register() private[streaming] - not useful to expose the confusing function. Updated a lot of documentation.
* | | | | | | | | Merge pull request #400 from tdas/dstream-movePatrick Wendell2014-01-131-1/+1
|\ \ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Moved DStream and PairDSream to org.apache.spark.streaming.dstream Similar to the package location of `org.apache.spark.rdd.RDD`, `DStream` has been moved from `org.apache.spark.streaming.DStream` to `org.apache.spark.streaming.dstream.DStream`. I know that the package name is a little long, but I think its better to keep it consistent with Spark's structure. Also fixed persistence of windowed DStream. The RDDs generated generated by windowed DStream are essentially unions of underlying RDDs, and persistent these union RDDs would store numerous copies of the underlying data. Instead setting the persistence level on the windowed DStream is made to set the persistence level of the underlying DStream.
| * \ \ \ \ \ \ \ \ Merge remote-tracking branch 'apache/master' into dstream-moveTathagata Das2014-01-126-9/+9
| |\ \ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: streaming/src/main/scala/org/apache/spark/streaming/dstream/DStream.scala
* | \ \ \ \ \ \ \ \ \ Merge pull request #395 from hsaputra/remove_simpleredundantreturn_scalaPatrick Wendell2014-01-127-17/+17
|\ \ \ \ \ \ \ \ \ \ \ | |_|/ / / / / / / / / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Remove simple redundant return statements for Scala methods/functions Remove simple redundant return statements for Scala methods/functions: -) Only change simple return statements at the end of method -) Ignore the complex if-else check -) Ignore the ones inside synchronized -) Add small changes to making var to val if possible and remove () for simple get This hopefully makes the review simpler =) Pass compile and tests.
| * | | | | | | | | | Merge branch 'master' into remove_simpleredundantreturn_scalaHenry Saputra2014-01-1221-36/+246
| |\| | | | | | | | |
| * | | | | | | | | | Remove simple redundant return statement for Scala methods/functions:Henry Saputra2014-01-127-17/+17
| | |_|_|_|_|_|_|_|/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | -) Only change simple return statements at the end of method -) Ignore the complex if-else check -) Ignore the ones inside synchronized
* | | | | | | | | | Rename DStream.foreach to DStream.foreachRDDPatrick Wendell2014-01-125-8/+8
| |/ / / / / / / / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | `foreachRDD` makes it clear that the granularity of this operator is per-RDD. As it stands, `foreach` is inconsistent with with `map`, `filter`, and the other DStream operators which get pushed down to individual records within each RDD.
* | | | | | | | | Merge remote-tracking branch 'apache/master' into driver-testTathagata Das2014-01-1019-30/+76
|\ \ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: streaming/src/main/scala/org/apache/spark/streaming/DStreamGraph.scala
| * \ \ \ \ \ \ \ \ Merge pull request #363 from pwendell/streaming-logsPatrick Wendell2014-01-0919-30/+76
| |\ \ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Set default logging to WARN for Spark streaming examples. This programatically sets the log level to WARN by default for streaming tests. If the user has already specified a log4j.properties file, the user's file will take precedence over this default.
| | * | | | | | | | | Minor clean-upPatrick Wendell2014-01-091-1/+1
| | | | | | | | | | |
| | * | | | | | | | | Set default logging to WARN for Spark streaming examples.Patrick Wendell2014-01-0919-29/+75
| | |/ / / / / / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This programatically sets the log level to WARN by default for streaming tests. If the user has already specified a log4j.properties file, the user's file will take precedence over this default.
* | | | | | | | | | Updated docs based on Patrick's comments in PR 383.Tathagata Das2014-01-102-12/+40
| | | | | | | | | |
* | | | | | | | | | Merge branch 'standalone-driver' into driver-testTathagata Das2014-01-0947-175/+243
|\| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: core/src/main/scala/org/apache/spark/SparkContext.scala core/src/main/scala/org/apache/spark/deploy/worker/DriverRunner.scala examples/src/main/java/org/apache/spark/streaming/examples/JavaNetworkWordCount.java streaming/src/main/scala/org/apache/spark/streaming/Checkpoint.scala streaming/src/main/scala/org/apache/spark/streaming/StreamingContext.scala streaming/src/main/scala/org/apache/spark/streaming/api/java/JavaStreamingContext.scala streaming/src/main/scala/org/apache/spark/streaming/scheduler/JobGenerator.scala
| * | | | | | | | | Merge remote-tracking branch 'apache-github/master' into standalone-driverPatrick Wendell2014-01-0827-122/+179
| |\| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: core/src/test/scala/org/apache/spark/deploy/JsonProtocolSuite.scala pom.xml
| | * | | | | | | | Merge pull request #313 from tdas/project-refactorPatrick Wendell2014-01-079-25/+32
| | |\ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Refactored the streaming project to separate external libraries like Twitter, Kafka, Flume, etc. At a high level, these are the following changes. 1. All the external code was put in `SPARK_HOME/external/` as separate SBT projects and Maven modules. Their artifact names are `spark-streaming-twitter`, `spark-streaming-kafka`, etc. Both SparkBuild.scala and pom.xml files have been updated. References to external libraries and repositories have been removed from the settings of root and streaming projects/modules. 2. To avail the external functionality (say, creating a Twitter stream), the developer has to `import org.apache.spark.streaming.twitter._` . For Scala API, the developer has to call `TwitterUtils.createStream(streamingContext, ...)`. For the Java API, the developer has to call `TwitterUtils.createStream(javaStreamingContext, ...)`. 3. Each external project has its own scala and java unit tests. Note the unit tests of each external library use classes of the streaming unit tests (`TestSuiteBase`, `LocalJavaStreamingContext`, etc.). To enable this code sharing among test classes, `dependsOn(streaming % "compile->compile,test->test")` was used in the SparkBuild.scala . In the streaming/pom.xml, an additional `maven-jar-plugin` was necessary to capture this dependency (see comment inside the pom.xml for more information). 4. Jars of the external projects have been added to examples project but not to the assembly project. 5. In some files, imports have been rearrange to conform to the Spark coding guidelines.
| | | * | | | | | | | Removed XYZFunctions and added XYZUtils as a common Scala and Java interface ↵Tathagata Das2014-01-079-17/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | for creating XYZ streams.
| | | * | | | | | | | Merge remote-tracking branch 'apache/master' into project-refactorTathagata Das2014-01-0647-66/+77
| | | |\ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: examples/src/main/java/org/apache/spark/streaming/examples/JavaFlumeEventCount.java streaming/src/main/scala/org/apache/spark/streaming/StreamingContext.scala streaming/src/main/scala/org/apache/spark/streaming/api/java/JavaStreamingContext.scala streaming/src/test/java/org/apache/spark/streaming/JavaAPISuite.java streaming/src/test/scala/org/apache/spark/streaming/InputStreamsSuite.scala streaming/src/test/scala/org/apache/spark/streaming/TestSuiteBase.scala
| | | * | | | | | | | | Changed JavaStreamingContextWith*** to ***Function in streaming.api.java.*** ↵Tathagata Das2014-01-062-6/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | package. Also fixed packages of Flume and MQTT tests.
| | | * | | | | | | | | Refactored kafka, flume, zeromq, mqtt as separate external projects, with ↵Tathagata Das2013-12-306-13/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | their own self-contained scala API, java API, scala unit tests and java unit tests. Updated examples to use the external projects.
| | | * | | | | | | | | Refactored streaming project to separate out the twitter functionality.Tathagata Das2013-12-263-1/+4
| | | | | | | | | | | |
| | * | | | | | | | | | Merge pull request #318 from srowen/masterReynold Xin2014-01-0714-85/+135
| | |\ \ \ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Suggested small changes to Java code for slightly more standard style, encapsulation and in some cases performance Sorry if this is too abrupt or not a welcome set of changes, but thought I'd see if I could contribute a little. I'm a Java developer and just getting seriously into Spark. So I thought I'd suggest a number of small changes to the couple Java parts of the code to make it a little tighter, more standard and even a bit faster. Feel free to take all, some or none of this. Happy to explain any of it.
| | | * | | | | | | | | | Issue #318 : minor style updates per review from Reynold XinSean Owen2014-01-0710-33/+2
| | | | | | | | | | | | |
| | | * | | | | | | | | | Merge remote-tracking branch 'upstream/master'Sean Owen2014-01-0645-59/+63
| | | |\ \ \ \ \ \ \ \ \ \ | | | | | |/ / / / / / / / | | | | |/| | | | | | | |
| | | * | | | | | | | | | Suggested small changes to Java code for slightly more standard style, ↵Sean Owen2014-01-0214-83/+164
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | encapsulation and in some cases performance
| | * | | | | | | | | | | spark -> org.apache.sparkprabeesh2014-01-078-12/+12
| | | |/ / / / / / / / / | | |/| | | | | | | | |
| * | | | | | | | | | | Merge remote-tracking branch 'apache-github/master' into standalone-driverPatrick Wendell2014-01-0647-67/+78
| |\| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: core/src/main/scala/org/apache/spark/deploy/client/AppClient.scala core/src/main/scala/org/apache/spark/deploy/client/TestClient.scala core/src/main/scala/org/apache/spark/deploy/master/Master.scala core/src/main/scala/org/apache/spark/deploy/worker/Worker.scala core/src/main/scala/org/apache/spark/scheduler/cluster/SparkDeploySchedulerBackend.scala
| | * | | | | | | | | | Removing SPARK_EXAMPLES_JAR in the codePatrick Wendell2014-01-0544-44/+48
| | | | | | | | | | | |
| | * | | | | | | | | | run-example -> bin/run-examplePrashant Sharma2014-01-0210-15/+15
| | |/ / / / / / / / /
| | * | | | | | | | | Merge remote-tracking branch 'origin/master' into conf2Matei Zaharia2013-12-291-1/+1
| | |\| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: core/src/main/scala/org/apache/spark/SparkContext.scala core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala core/src/main/scala/org/apache/spark/scheduler/cluster/ClusterTaskSetManager.scala core/src/main/scala/org/apache/spark/scheduler/local/LocalScheduler.scala core/src/main/scala/org/apache/spark/util/MetadataCleaner.scala core/src/test/scala/org/apache/spark/scheduler/TaskResultGetterSuite.scala core/src/test/scala/org/apache/spark/scheduler/TaskSetManagerSuite.scala new-yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala streaming/src/main/scala/org/apache/spark/streaming/Checkpoint.scala streaming/src/main/scala/org/apache/spark/streaming/api/java/JavaStreamingContext.scala streaming/src/main/scala/org/apache/spark/streaming/scheduler/JobGenerator.scala streaming/src/test/scala/org/apache/spark/streaming/BasicOperationsSuite.scala streaming/src/test/scala/org/apache/spark/streaming/CheckpointSuite.scala streaming/src/test/scala/org/apache/spark/streaming/InputStreamsSuite.scala streaming/src/test/scala/org/apache/spark/streaming/TestSuiteBase.scala streaming/src/test/scala/org/apache/spark/streaming/WindowOperationsSuite.scala
| | | * | | | | | | | Fixed job name in the java streaming example.azuryyu2013-12-241-1/+1
| | | | | | | | | | |
| | * | | | | | | | | Various fixes to configuration codeMatei Zaharia2013-12-282-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | - Got rid of global SparkContext.globalConf - Pass SparkConf to serializers and compression codecs - Made SparkConf public instead of private[spark] - Improved API of SparkContext and SparkConf - Switched executor environment vars to be passed through SparkConf - Fixed some places that were still using system properties - Fixed some tests, though others are still failing This still fails several tests in core, repl and streaming, likely due to properties not being set or cleared correctly (some of the tests run fine in isolation).
| | * | | | | | | | | spark-544, introducing SparkConf and related configuration overhaul.Prashant Sharma2013-12-253-7/+14
| | |/ / / / / / / /
* | | | | | | | | | Changed the way StreamingContext finds and reads checkpoint files, and added ↵Tathagata Das2014-01-093-10/+42
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | JavaStreamingContext.getOrCreate.
* | | | | | | | | | Added StreamingContext.getOrCreate to for automatic recovery, and added ↵Tathagata Das2014-01-021-0/+58
|/ / / / / / / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | RecoverableNetworkWordCount example to use it.
* | | | | | | | | Minor style clean-upPatrick Wendell2013-12-251-0/+1
| | | | | | | | |
* | | | | | | | | Adding better option parsingPatrick Wendell2013-12-251-0/+45
|/ / / / / / / /
* | | | | | | | Merge branch 'master' of github.com:apache/incubator-spark into scala-2.10-tempPrashant Sharma2013-11-216-16/+19
|\ \ \ \ \ \ \ \ | | |_|_|_|_|_|/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: core/src/main/scala/org/apache/spark/util/collection/PrimitiveVector.scala streaming/src/main/scala/org/apache/spark/streaming/api/java/JavaStreamingContext.scala