path: root/examples/src/main/java
Commit message | Author | Age | Files | Lines
* Fixed streaming examples docs to use run-example instead of spark-submit (Tathagata Das, 2014-05-14; 4 files changed, -17/+21)
| | | | | | | | | | | | | Pretty self-explanatory Author: Tathagata Das <tathagata.das1565@gmail.com> Closes #722 from tdas/example-fix and squashes the following commits: 7839979 [Tathagata Das] Minor changes. 0673441 [Tathagata Das] Fixed java docs of java streaming example e687123 [Tathagata Das] Fixed scala style errors. 9b8d112 [Tathagata Das] Fixed streaming examples docs to use run-example instead of spark-submit.
* SPARK-1565, update examples to be used with spark-submit script. (Prashant Sharma, 2014-05-08; 14 files changed, -128/+127)
| | | | | | | | | | | | | | | | | | | Commit for initial feedback, basically I am curious if we should prompt user for providing args esp. when its mandatory. And can we skip if they are not ? Also few other things that did not work like `bin/spark-submit examples/target/scala-2.10/spark-examples-1.0.0-SNAPSHOT-hadoop1.0.4.jar --class org.apache.spark.examples.SparkALS --arg 100 500 10 5 2` Not all the args get passed properly, may be I have messed up something will try to sort it out hopefully. Author: Prashant Sharma <prashant.s@imaginea.com> Closes #552 from ScrapCodes/SPARK-1565/update-examples and squashes the following commits: 669dd23 [Prashant Sharma] Review comments 2727e70 [Prashant Sharma] SPARK-1565, update examples to be used with spark-submit script. (cherry picked from commit 44dd57fb66bb676d753ad8d9757f9f4c03364113) Signed-off-by: Patrick Wendell <pwendell@gmail.com>
* [HOTFIX] SPARK-1637: There are some Streaming examples added after the PR … (Sandeep, 2014-05-06; 1 file changed, -4/+2)
| | | | | | | | | | | | | | | | #571 was last updated. This resulted in Compilation Errors. cc @mateiz project not compiling currently. Author: Sandeep <sandeep@techaddict.me> Closes #673 from techaddict/SPARK-1637-HOTFIX and squashes the following commits: b512f4f [Sandeep] [SPARK-1637][HOTFIX] There are some Streaming examples added after the PR #571 was last updated. This resulted in Compilation Errors. (cherry picked from commit fdae095de2daa1fc3b343c05e515235756d856a4) Signed-off-by: Patrick Wendell <pwendell@gmail.com>
* SPARK-1637: Clean up examples for 1.0 (Sandeep, 2014-05-06; 7 files changed, -9/+13)
| | | | | | | | | | | | | | | | | | | | | - [x] Move all of them into subpackages of org.apache.spark.examples (right now some are in org.apache.spark.streaming.examples, for instance, and others are in org.apache.spark.examples.mllib) - [x] Move Python examples into examples/src/main/python - [x] Update docs to reflect these changes Author: Sandeep <sandeep@techaddict.me> This patch had conflicts when merged, resolved by Committer: Matei Zaharia <matei@databricks.com> Closes #571 from techaddict/SPARK-1637 and squashes the following commits: 47ef86c [Sandeep] Changes based on Discussions on PR, removing use of RawTextHelper from examples 8ed2d3f [Sandeep] Docs Updated for changes, Change for java examples 5f96121 [Sandeep] Move Python examples into examples/src/main/python 0a8dd77 [Sandeep] Move all Scala Examples to org.apache.spark.examples (some are in org.apache.spark.streaming.examples, for instance, and others are in org.apache.spark.examples.mllib) (cherry picked from commit a000b5c3b0438c17e9973df4832c320210c29c27) Signed-off-by: Matei Zaharia <matei@databricks.com>
* [SPARK-1504], [SPARK-1505], [SPARK-1558] Updated Spark Streaming guide (Tathagata Das, 2014-05-05; 4 files changed, -5/+9)
| | | | | | | | | | | | | | | | | | | | | | | | | - SPARK-1558: Updated custom receiver guide to match it with the new API - SPARK-1504: Added deployment and monitoring subsection to streaming - SPARK-1505: Added migration guide for migrating from 0.9.x and below to Spark 1.0 - Updated various Java streaming examples to use JavaReceiverInputDStream to highlight the API change. - Removed the requirement for cleaner ttl from streaming guide Author: Tathagata Das <tathagata.das1565@gmail.com> Closes #652 from tdas/doc-fix and squashes the following commits: cb4f4b7 [Tathagata Das] Possible fix for flaky graceful shutdown test. ab71f7f [Tathagata Das] Merge remote-tracking branch 'apache-github/master' into doc-fix 8d6ff9b [Tathagata Das] Addded migration guide to Spark Streaming. 7d171df [Tathagata Das] Added reference to JavaReceiverInputStream in examples and streaming guide. 49edd7c [Tathagata Das] Change java doc links to use Java docs. 11528d7 [Tathagata Das] Updated links on index page. ff80970 [Tathagata Das] More updates to streaming guide. 4dc42e9 [Tathagata Das] Added monitoring and other documentation in the streaming guide. 14c6564 [Tathagata Das] Updated custom receiver guide. (cherry picked from commit a975a19f21e71f448b3fdb2ed4461e28ef439900) Signed-off-by: Tathagata Das <tathagata.das1565@gmail.com>
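The API change highlighted above, receiver-based sources returning JavaReceiverInputDStream, looks roughly like this from Java. This is a minimal sketch, assuming the Spark 1.0-era streaming Java API and an invented local socket source; it is not the actual example code.

```java
import org.apache.spark.streaming.Duration;
import org.apache.spark.streaming.api.java.JavaDStream;
import org.apache.spark.streaming.api.java.JavaReceiverInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;

public class ReceiverInputSketch {
  public static void main(String[] args) throws Exception {
    JavaStreamingContext jssc =
        new JavaStreamingContext("local[2]", "ReceiverInputSketch", new Duration(1000));
    // Receiver-based sources now return JavaReceiverInputDStream rather than plain JavaDStream.
    JavaReceiverInputDStream<String> lines = jssc.socketTextStream("localhost", 9999);
    JavaDStream<Long> counts = lines.count();
    counts.print();
    jssc.start();
    jssc.awaitTermination();  // keep the example running until it is killed
  }
}
```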
* Handle the vals that never used (WangTao, 2014-04-29; 3 files changed, -3/+1)
| | | | | | | | | | | | | | | In XORShiftRandom.scala, use val "million" instead of constant "1e6.toInt". Delete vals that never used in other files. Author: WangTao <barneystinson@aliyun.com> Closes #565 from WangTaoTheTonic/master and squashes the following commits: 17cacfc [WangTao] Handle the unused assignment, method parameters and symbol inspected by Intellij IDEA 37b4090 [WangTao] Handle the vals that never used (cherry picked from commit 7025dda8fa84b57d6f12bc770df2fa10eef21d88) Signed-off-by: Reynold Xin <rxin@apache.org>
* [SPARK-1633][Streaming] Java API unit test and example for custom streaming … (Tathagata Das, 2014-04-28; 2 files changed, -4/+153)
| | | | | | | | | | | | | | | | | | | | | | | | | receiver in Java Author: Tathagata Das <tathagata.das1565@gmail.com> Closes #558 from tdas/more-fixes and squashes the following commits: c0c84e6 [Tathagata Das] Removing extra println() d8a8cf4 [Tathagata Das] More tweaks to make unit test work in Jenkins. b7caa98 [Tathagata Das] More tweaks. d337367 [Tathagata Das] More tweaks 22d6f2d [Tathagata Das] Merge remote-tracking branch 'apache/master' into more-fixes 40a961b [Tathagata Das] Modified java test to reduce flakiness. 9410ca6 [Tathagata Das] Merge remote-tracking branch 'apache/master' into more-fixes 86d9147 [Tathagata Das] scala style fix 2f3d7b1 [Tathagata Das] Added Scala custom receiver example. d677611 [Tathagata Das] Merge remote-tracking branch 'apache/master' into more-fixes bec3fc2 [Tathagata Das] Added license. 51d6514 [Tathagata Das] Fixed docs on receiver. 81aafa0 [Tathagata Das] Added Java test for Receiver API, and added JavaCustomReceiver example. (cherry picked from commit 1d84964bf80f4e69e54d62286c3861c2362342d0) Signed-off-by: Tathagata Das <tathagata.das1565@gmail.com>
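A rough shape of such a Java receiver, as a hedged sketch: the class, field names, and the fixed string pushed below are invented, not what the JavaCustomReceiver example actually does.

```java
import org.apache.spark.storage.StorageLevel;
import org.apache.spark.streaming.receiver.Receiver;

// Hypothetical receiver that repeatedly pushes one string; the real example reads a socket.
public class SketchReceiver extends Receiver<String> {
  public SketchReceiver() {
    super(StorageLevel.MEMORY_ONLY());
  }

  @Override
  public void onStart() {
    // Spark calls onStart() on the worker; spawn a thread and hand records to store().
    new Thread(new Runnable() {
      public void run() {
        while (!isStopped()) {
          store("hello from SketchReceiver");
        }
      }
    }).start();
  }

  @Override
  public void onStop() {
    // Nothing to clean up: the thread above exits once isStopped() returns true.
  }
}
```

It would be wired into a streaming job with something like jssc.receiverStream(new SketchReceiver()).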
* [SQL] SPARK-1571 Mistake in java example code (Michael Armbrust, 2014-04-22; 1 file changed, -4/+4)
| | | | | | | | | | | Author: Michael Armbrust <michael@databricks.com> Closes #496 from marmbrus/javaBeanBug and squashes the following commits: 644fedd [Michael Armbrust] Bean methods must be public. (cherry picked from commit 39f85e0322cfecefbc30e7d5a30356cfab1e9640) Signed-off-by: Reynold Xin <rxin@apache.org>
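The fix ("Bean methods must be public") matters because Spark SQL's bean-based schema inference only sees public accessors. A minimal sketch of a bean shape that works; the Person class here is illustrative, not the one in the example.

```java
import java.io.Serializable;

public class Person implements Serializable {
  private String name;
  private int age;

  // Getters and setters must be public, otherwise schema inference misses the fields.
  public String getName() { return name; }
  public void setName(String name) { this.name = name; }
  public int getAge() { return age; }
  public void setAge(int age) { this.age = age; }
}
```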
* SPARK-1446: Spark examples should not do a System.exit (Sandeep, 2014-04-10; 8 files changed, -10/+10)
| | | | | | | | | | | Spark examples should exit nice using SparkContext.stop() method, rather than System.exit System.exit can cause issues like in SPARK-1407 Author: Sandeep <sandeep@techaddict.me> Closes #370 from techaddict/1446 and squashes the following commits: e9234cf [Sandeep] SPARK-1446: Spark examples should not do a System.exit Spark examples should exit nice using SparkContext.stop() method, rather than System.exit System.exit can cause issues like in SPARK-1407
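A sketch of the pattern the commit moves to, with invented class and data names: the example finishes by stopping the context instead of calling System.exit.

```java
import java.util.Arrays;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.api.java.function.Function2;

public class StopSketch {
  public static void main(String[] args) {
    JavaSparkContext sc = new JavaSparkContext("local", "StopSketch");
    JavaRDD<Integer> nums = sc.parallelize(Arrays.asList(1, 2, 3, 4));
    Integer sum = nums.reduce(new Function2<Integer, Integer, Integer>() {
      public Integer call(Integer a, Integer b) { return a + b; }
    });
    System.out.println("sum = " + sum);
    // Shut down cleanly; System.exit() can cause issues like SPARK-1407.
    sc.stop();
  }
}
```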
* Spark 1271: Co-Group and Group-By should pass Iterable[X] (Holden Karau, 2014-04-08; 1 file changed, -8/+13)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Author: Holden Karau <holden@pigscanfly.ca> Closes #242 from holdenk/spark-1320-cogroupandgroupshouldpassiterator and squashes the following commits: f289536 [Holden Karau] Fix bad merge, should have been Iterable rather than Iterator 77048f8 [Holden Karau] Fix merge up to master d3fe909 [Holden Karau] use toSeq instead 7a092a3 [Holden Karau] switch resultitr to resultiterable eb06216 [Holden Karau] maybe I should have had a coffee first. use correct import for guava iterables c5075aa [Holden Karau] If guava 14 had iterables 2d06e10 [Holden Karau] Fix Java 8 cogroup tests for the new API 11e730c [Holden Karau] Fix streaming tests 66b583d [Holden Karau] Fix the core test suite to compile 4ed579b [Holden Karau] Refactor from iterator to iterable d052c07 [Holden Karau] Python tests now pass with iterator pandas 3bcd81d [Holden Karau] Revert "Try and make pickling list iterators work" cd1e81c [Holden Karau] Try and make pickling list iterators work c60233a [Holden Karau] Start investigating moving to iterators for python API like the Java/Scala one. tl;dr: We will have to write our own iterator since the default one doesn't pickle well 88a5cef [Holden Karau] Fix cogroup test in JavaAPISuite for streaming a5ee714 [Holden Karau] oops, was checking wrong iterator e687f21 [Holden Karau] Fix groupbykey test in JavaAPISuite of streaming ec8cc3e [Holden Karau] Fix test issues\! 4b0eeb9 [Holden Karau] Switch cast in PairDStreamFunctions fa395c9 [Holden Karau] Revert "Add a join based on the problem in SVD" ec99e32 [Holden Karau] Revert "Revert this but for now put things in list pandas" b692868 [Holden Karau] Revert 7e533f7 [Holden Karau] Fix the bug 8a5153a [Holden Karau] Revert me, but we have some stuff to debug b4e86a9 [Holden Karau] Add a join based on the problem in SVD c4510e2 [Holden Karau] Revert this but for now put things in list pandas b4e0b1d [Holden Karau] Fix style issues 71e8b9f [Holden Karau] I really need to stop calling size on iterators, it is the path of sadness. b1ae51a [Holden Karau] Fix some of the types in the streaming JavaAPI suite. Probably still needs more work 37888ec [Holden Karau] core/tests now pass 249abde [Holden Karau] org.apache.spark.rdd.PairRDDFunctionsSuite passes 6698186 [Holden Karau] Revert "I think this might be a bad rabbit hole. Started work to make CoGroupedRDD use iterator and then went crazy" fe992fe [Holden Karau] hmmm try and fix up basic operation suite 172705c [Holden Karau] Fix Java API suite caafa63 [Holden Karau] I think this might be a bad rabbit hole. Started work to make CoGroupedRDD use iterator and then went crazy 88b3329 [Holden Karau] Fix groupbykey to actually give back an iterator 4991af6 [Holden Karau] Fix some tests be50246 [Holden Karau] Calling size on an iterator is not so good if we want to use it after 687ffbc [Holden Karau] This is the it compiles point of replacing Seq with Iterator and JList with JIterator in the groupby and cogroup signatures
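After this change the Java signatures hand back Iterable values per key instead of List. A hedged sketch of what calling code looks like; the data and names are invented.

```java
import java.util.Arrays;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaSparkContext;
import scala.Tuple2;

public class GroupBySketch {
  public static void main(String[] args) {
    JavaSparkContext sc = new JavaSparkContext("local", "GroupBySketch");
    JavaPairRDD<String, Integer> pairs = sc.parallelizePairs(Arrays.asList(
        new Tuple2<String, Integer>("a", 1),
        new Tuple2<String, Integer>("a", 2),
        new Tuple2<String, Integer>("b", 3)));
    // groupByKey now yields Iterable<Integer> per key instead of a List.
    JavaPairRDD<String, Iterable<Integer>> grouped = pairs.groupByKey();
    for (Tuple2<String, Iterable<Integer>> kv : grouped.collect()) {
      int sum = 0;
      for (int v : kv._2()) {   // Iterable has no size(); just iterate
        sum += v;
      }
      System.out.println(kv._1() + " -> " + sum);
    }
    sc.stop();
  }
}
```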
* [SQL] SPARK-1333 First draft of java API (Michael Armbrust, 2014-04-03; 1 file changed, -0/+99)
| | | | | | | | | | | | | | | | | | | | WIP: Some work remains... * [x] Hive support * [x] Tests * [x] Update docs Feedback welcome! Author: Michael Armbrust <michael@databricks.com> Closes #248 from marmbrus/javaSchemaRDD and squashes the following commits: b393913 [Michael Armbrust] @srowen 's java style suggestions. f531eb1 [Michael Armbrust] Address matei's comments. 33a1b1a [Michael Armbrust] Ignore JavaHiveSuite. 822f626 [Michael Armbrust] improve docs. ab91750 [Michael Armbrust] Improve Java SQL API: * Change JavaRow => Row * Add support for querying RDDs of JavaBeans * Docs * Tests * Hive support 0b859c8 [Michael Armbrust] First draft of java API.
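A sketch of the bean-based querying this draft adds, assuming the 1.0-era JavaSQLContext API and reusing the illustrative Person bean shown under the SPARK-1571 note above; none of this is the actual example code.

```java
import java.util.Arrays;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.sql.api.java.JavaSQLContext;
import org.apache.spark.sql.api.java.JavaSchemaRDD;

public class JavaSqlSketch {
  public static void main(String[] args) {
    JavaSparkContext sc = new JavaSparkContext("local", "JavaSqlSketch");
    JavaSQLContext sqlCtx = new JavaSQLContext(sc);

    Person p = new Person();
    p.setName("Alice");
    p.setAge(30);
    JavaRDD<Person> people = sc.parallelize(Arrays.asList(p));

    // Infer the schema from the JavaBean's public getters and register it as a table.
    JavaSchemaRDD schemaPeople = sqlCtx.applySchema(people, Person.class);
    schemaPeople.registerAsTable("people");

    JavaSchemaRDD adults = sqlCtx.sql("SELECT name FROM people WHERE age >= 18");
    System.out.println(adults.collect());
    sc.stop();
  }
}
```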
* [SPARK-1212, Part II] Support sparse data in MLlib (Xiangrui Meng, 2014-04-02; 1 file changed, -10/+4)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In PR https://github.com/apache/spark/pull/117, we added dense/sparse vector data model and updated KMeans to support sparse input. This PR is to replace all other `Array[Double]` usage by `Vector` in generalized linear models (GLMs) and Naive Bayes. Major changes: 1. `LabeledPoint` becomes `LabeledPoint(Double, Vector)`. 2. Methods that accept `RDD[Array[Double]]` now accept `RDD[Vector]`. We cannot support both in an elegant way because of type erasure. 3. Mark 'createModel' and 'predictPoint' protected because they are not for end users. 4. Add libSVMFile to MLContext. 5. NaiveBayes can accept arbitrary labels (introducing a breaking change to Python's `NaiveBayesModel`). 6. Gradient computation no longer creates temp vectors. 7. Column normalization and centering are removed from Lasso and Ridge because the operation will densify the data. Simple feature transformation can be done before training. TODO: 1. ~~Use axpy when possible.~~ 2. ~~Optimize Naive Bayes.~~ Author: Xiangrui Meng <meng@databricks.com> Closes #245 from mengxr/vector and squashes the following commits: eb6e793 [Xiangrui Meng] move libSVMFile to MLUtils and rename to loadLibSVMData c26c4fc [Xiangrui Meng] update DecisionTree to use RDD[Vector] 11999c7 [Xiangrui Meng] Merge branch 'master' into vector f7da54b [Xiangrui Meng] add minSplits to libSVMFile da25e24 [Xiangrui Meng] revert the change to default addIntercept because it might change the behavior of existing code without warning 493f26f [Xiangrui Meng] Merge branch 'master' into vector 7c1bc01 [Xiangrui Meng] add a TODO to NB b9b7ef7 [Xiangrui Meng] change default value of addIntercept to false b01df54 [Xiangrui Meng] allow to change or clear threshold in LR and SVM 4addc50 [Xiangrui Meng] merge master 4ca5b1b [Xiangrui Meng] remove normalization from Lasso and update tests f04fe8a [Xiangrui Meng] remove normalization from RidgeRegression and update tests d088552 [Xiangrui Meng] use static constructor for MLContext 6f59eed [Xiangrui Meng] update libSVMFile to determine number of features automatically 3432e84 [Xiangrui Meng] update NaiveBayes to support sparse data 0f8759b [Xiangrui Meng] minor updates to NB b11659c [Xiangrui Meng] style update 78c4671 [Xiangrui Meng] add libSVMFile to MLContext f0fe616 [Xiangrui Meng] add a test for sparse linear regression 44733e1 [Xiangrui Meng] use in-place gradient computation e981396 [Xiangrui Meng] use axpy in Updater db808a1 [Xiangrui Meng] update JavaLR example befa592 [Xiangrui Meng] passed scala/java tests 75c83a4 [Xiangrui Meng] passed test compile 1859701 [Xiangrui Meng] passed compile 834ada2 [Xiangrui Meng] optimized MLUtils.computeStats update some ml algorithms to use Vector (cont.) 135ab72 [Xiangrui Meng] merge glm 0e57aa4 [Xiangrui Meng] update Lasso and RidgeRegression to parse the weights correctly from GLM mark createModel protected mark predictPoint protected d7f629f [Xiangrui Meng] fix a bug in GLM when intercept is not used 3f346ba [Xiangrui Meng] update some ml algorithms to use Vector
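The LabeledPoint(Double, Vector) shape described in point 1 looks like this from Java. A minimal sketch using the MLlib linalg factory methods; the label and feature values are invented.

```java
import org.apache.spark.mllib.linalg.Vector;
import org.apache.spark.mllib.linalg.Vectors;
import org.apache.spark.mllib.regression.LabeledPoint;

public class LabeledPointSketch {
  public static void main(String[] args) {
    // A labeled point with dense features ...
    LabeledPoint dense = new LabeledPoint(1.0, Vectors.dense(0.5, 0.0, 2.0));
    // ... and one with sparse features: size 3, non-zeros at indices 0 and 2.
    Vector features = Vectors.sparse(3, new int[]{0, 2}, new double[]{0.5, 2.0});
    LabeledPoint sparse = new LabeledPoint(0.0, features);
    System.out.println(dense + " / " + sparse);
  }
}
```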
* [SPARK-1212] Adding sparse data support and update KMeans (Xiangrui Meng, 2014-03-23; 2 files changed, -149/+12)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Continue our discussions from https://github.com/apache/incubator-spark/pull/575 This PR is WIP because it depends on a SNAPSHOT version of breeze. Per previous discussions and benchmarks, I switched to breeze for linear algebra operations. @dlwh and I made some improvements to breeze to keep its performance comparable to the bare-bone implementation, including norm computation and squared distance. This is why this PR needs to depend on a SNAPSHOT version of breeze. @fommil , please find the notice of using netlib-core in `NOTICE`. This is following Apache's instructions on appropriate labeling. I'm going to update this PR to include: 1. Fast distance computation: using `\|a\|_2^2 + \|b\|_2^2 - 2 a^T b` when it doesn't introduce too much numerical error. The squared norms are pre-computed. Otherwise, computing the distance between the center (dense) and a point (possibly sparse) always takes O(n) time. 2. Some numbers about the performance. 3. A released version of breeze. @dlwh, a minor release of breeze will help this PR get merged early. Do you mind sharing breeze's release plan? Thanks! Author: Xiangrui Meng <meng@databricks.com> Closes #117 from mengxr/sparse-kmeans and squashes the following commits: 67b368d [Xiangrui Meng] fix SparseVector.toArray 5eda0de [Xiangrui Meng] update NOTICE 67abe31 [Xiangrui Meng] move ArrayRDDs to mllib.rdd 1da1033 [Xiangrui Meng] remove dependency on commons-math3 and compute EPSILON directly 9bb1b31 [Xiangrui Meng] optimize SparseVector.toArray 226d2cd [Xiangrui Meng] update Java friendly methods in Vectors 238ba34 [Xiangrui Meng] add VectorRDDs with a converter from RDD[Array[Double]] b28ba2f [Xiangrui Meng] add toArray to Vector e69b10c [Xiangrui Meng] remove examples/JavaKMeans.java, which is replaced by mllib/examples/JavaKMeans.java 72bde33 [Xiangrui Meng] clean up code for distance computation 712cb88 [Xiangrui Meng] make Vectors.sparse Java friendly 27858e4 [Xiangrui Meng] update breeze version to 0.7 07c3cf2 [Xiangrui Meng] change Mahout to breeze in doc use a simple lower bound to avoid unnecessary distance computation 6f5cdde [Xiangrui Meng] fix a bug in filtering finished runs 42512f2 [Xiangrui Meng] Merge branch 'master' into sparse-kmeans d6e6c07 [Xiangrui Meng] add predict(RDD[Vector]) to KMeansModel 42b4e50 [Xiangrui Meng] line feed at the end a4ace73 [Xiangrui Meng] Merge branch 'fast-dist' into sparse-kmeans 3ed1a24 [Xiangrui Meng] add doc to BreezeVectorWithSquaredNorm 0107e19 [Xiangrui Meng] update NOTICE 87bc755 [Xiangrui Meng] tuned the KMeans code: changed some for loops to while, use view to avoid copying arrays 0ff8046 [Xiangrui Meng] update KMeans to use fastSquaredDistance f355411 [Xiangrui Meng] add BreezeVectorWithSquaredNorm case class ab74f67 [Xiangrui Meng] add fastSquaredDistance for KMeans 4e7d5ca [Xiangrui Meng] minor style update 07ffaf2 [Xiangrui Meng] add dense/sparse vector data models and conversions to/from breeze vectors use breeze to implement KMeans in order to support both dense and sparse data
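A hedged sketch of the Java-friendly Vectors factories and the Vector-based KMeans entry point this PR describes; the data, k, and iteration counts are invented, and the exact signatures are assumed from the post-merge API.

```java
import java.util.Arrays;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.mllib.clustering.KMeans;
import org.apache.spark.mllib.clustering.KMeansModel;
import org.apache.spark.mllib.linalg.Vector;
import org.apache.spark.mllib.linalg.Vectors;

public class SparseKMeansSketch {
  public static void main(String[] args) {
    JavaSparkContext sc = new JavaSparkContext("local", "SparseKMeansSketch");
    JavaRDD<Vector> points = sc.parallelize(Arrays.asList(
        Vectors.sparse(3, new int[]{0}, new double[]{1.0}),
        Vectors.sparse(3, new int[]{2}, new double[]{1.0}),
        Vectors.dense(0.9, 0.0, 0.1)));
    // KMeans.train now takes RDD<Vector>, so dense and sparse points mix freely.
    KMeansModel model = KMeans.train(points.rdd(), 2, 10);
    System.out.println("cost = " + model.computeCost(points.rdd()));
    sc.stop();
  }
}
```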
* Fix example bug: compile error (jianghan, 2014-03-12; 3 files changed, -5/+5)
| | | | | | | | Author: jianghan <jianghan@xiaomi.com> Closes #132 from pooorman/master and squashes the following commits: 54afbe0 [jianghan] Fix example bug: compile error
* [java8API] SPARK-964 Investigate the potential for using JDK 8 lambda … (Prashant Sharma, 2014-03-03; 12 files changed, -19/+19)
| | | | | | | | | | | | | | | | | | | expressions for the Java/Scala APIs Author: Prashant Sharma <prashant.s@imaginea.com> Author: Patrick Wendell <pwendell@gmail.com> Closes #17 from ScrapCodes/java8-lambdas and squashes the following commits: 95850e6 [Patrick Wendell] Some doc improvements and build changes to the Java 8 patch. 85a954e [Prashant Sharma] Nit. import orderings. 673f7ac [Prashant Sharma] Added support for -java-home as well 80a13e8 [Prashant Sharma] Used fake class tag syntax 26eb3f6 [Prashant Sharma] Patrick's comments on PR. 35d8d79 [Prashant Sharma] Specified java 8 building in the docs 31d4cd6 [Prashant Sharma] Maven build to support -Pjava8-tests flag. 4ab87d3 [Prashant Sharma] Review feedback on the pr c33dc2c [Prashant Sharma] SPARK-964, Java 8 API Support.
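The lambda-friendly API this change describes lets Java 8 callers pass lambdas where anonymous Function classes used to be required. A small sketch, assuming Java 8 and a Spark build from around this change; names and data are invented.

```java
import java.util.Arrays;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

public class Java8LambdaSketch {
  public static void main(String[] args) {
    JavaSparkContext sc = new JavaSparkContext("local", "Java8LambdaSketch");
    JavaRDD<String> words = sc.parallelize(Arrays.asList("spark", "java", "lambdas"));
    // Function and Function2 are single-abstract-method types, so lambdas work directly.
    JavaRDD<Integer> lengths = words.map(s -> s.length());
    int total = lengths.reduce((a, b) -> a + b);
    System.out.println("total characters = " + total);
    sc.stop();
  }
}
```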
* fixed job name and usage information for the JavaSparkPi example (Kevin Mader, 2014-01-22; 1 file changed, -2/+2)
|
* Added StreamingContext.awaitTermination to streaming examples. (Tathagata Das, 2014-01-20; 4 files changed, -0/+4)
|
* Updated java API docs for streaming, along with very minor changes in the … (Tathagata Das, 2014-01-16; 1 file changed, -2/+1)
| | | | code examples.
* Merge remote-tracking branch 'apache/master' into driver-test (Tathagata Das, 2014-01-10; 4 files changed, -1/+9)
|\ | | | | | | | | Conflicts: streaming/src/main/scala/org/apache/spark/streaming/DStreamGraph.scala
| * Minor clean-up (Patrick Wendell, 2014-01-09; 1 file changed, -1/+1)
| |
| * Set default logging to WARN for Spark streaming examples. (Patrick Wendell, 2014-01-09; 4 files changed, -0/+8)
| | | | | | | | | | | | This programatically sets the log level to WARN by default for streaming tests. If the user has already specified a log4j.properties file, the user's file will take precedence over this default.
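The level-setting helper works along these lines with log4j 1.x. This is a sketch, not the actual helper: the appender check is one assumed way to honor a user-supplied log4j.properties before overriding the level.

```java
import org.apache.log4j.Level;
import org.apache.log4j.Logger;

public final class LogLevelSketch {
  public static void setStreamingLogLevels() {
    // Only override the level when the user has not configured log4j themselves.
    boolean log4jConfigured = Logger.getRootLogger().getAllAppenders().hasMoreElements();
    if (!log4jConfigured) {
      System.out.println("Setting log level to [WARN] for streaming example.");
      Logger.getRootLogger().setLevel(Level.WARN);
    }
  }
}
```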
* | Merge branch 'standalone-driver' into driver-test (Tathagata Das, 2014-01-09; 14 files changed, -107/+162)
|\| | | | | | | | | | | | | | | | | | | | | Conflicts: core/src/main/scala/org/apache/spark/SparkContext.scala core/src/main/scala/org/apache/spark/deploy/worker/DriverRunner.scala examples/src/main/java/org/apache/spark/streaming/examples/JavaNetworkWordCount.java streaming/src/main/scala/org/apache/spark/streaming/Checkpoint.scala streaming/src/main/scala/org/apache/spark/streaming/StreamingContext.scala streaming/src/main/scala/org/apache/spark/streaming/api/java/JavaStreamingContext.scala streaming/src/main/scala/org/apache/spark/streaming/scheduler/JobGenerator.scala
| * Merge pull request #313 from tdas/project-refactor (Patrick Wendell, 2014-01-07; 2 files changed, -8/+9)
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Refactored the streaming project to separate external libraries like Twitter, Kafka, Flume, etc. At a high level, these are the following changes. 1. All the external code was put in `SPARK_HOME/external/` as separate SBT projects and Maven modules. Their artifact names are `spark-streaming-twitter`, `spark-streaming-kafka`, etc. Both SparkBuild.scala and pom.xml files have been updated. References to external libraries and repositories have been removed from the settings of root and streaming projects/modules. 2. To avail the external functionality (say, creating a Twitter stream), the developer has to `import org.apache.spark.streaming.twitter._` . For Scala API, the developer has to call `TwitterUtils.createStream(streamingContext, ...)`. For the Java API, the developer has to call `TwitterUtils.createStream(javaStreamingContext, ...)`. 3. Each external project has its own scala and java unit tests. Note the unit tests of each external library use classes of the streaming unit tests (`TestSuiteBase`, `LocalJavaStreamingContext`, etc.). To enable this code sharing among test classes, `dependsOn(streaming % "compile->compile,test->test")` was used in the SparkBuild.scala . In the streaming/pom.xml, an additional `maven-jar-plugin` was necessary to capture this dependency (see comment inside the pom.xml for more information). 4. Jars of the external projects have been added to examples project but not to the assembly project. 5. In some files, imports have been rearrange to conform to the Spark coding guidelines.
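For the Java call shape mentioned in point 2, a hedged sketch follows; it assumes the spark-streaming-twitter artifact is on the classpath and Twitter credentials are supplied as system properties, and the return type is written as the base JavaDStream to stay era-neutral.

```java
import org.apache.spark.streaming.Duration;
import org.apache.spark.streaming.api.java.JavaDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.twitter.TwitterUtils;
import twitter4j.Status;

public class TwitterSketch {
  public static void main(String[] args) throws Exception {
    JavaStreamingContext jssc =
        new JavaStreamingContext("local[2]", "TwitterSketch", new Duration(2000));
    // The external module is imported explicitly; the core streaming jar no longer bundles it.
    JavaDStream<Status> tweets = TwitterUtils.createStream(jssc);
    tweets.count().print();
    jssc.start();
    jssc.awaitTermination();
  }
}
```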
| | * Removed XYZFunctions and added XYZUtils as a common Scala and Java interface … (Tathagata Das, 2014-01-07; 2 files changed, -8/+6)
| | | | | | | | | | | | for creating XYZ streams.
| | * Merge remote-tracking branch 'apache/master' into project-refactor (Tathagata Das, 2014-01-06; 14 files changed, -15/+19)
| | |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: examples/src/main/java/org/apache/spark/streaming/examples/JavaFlumeEventCount.java streaming/src/main/scala/org/apache/spark/streaming/StreamingContext.scala streaming/src/main/scala/org/apache/spark/streaming/api/java/JavaStreamingContext.scala streaming/src/test/java/org/apache/spark/streaming/JavaAPISuite.java streaming/src/test/scala/org/apache/spark/streaming/InputStreamsSuite.scala streaming/src/test/scala/org/apache/spark/streaming/TestSuiteBase.scala
| | * | Changed JavaStreamingContextWith*** to ***Function in streaming.api.java.*** … (Tathagata Das, 2014-01-06; 2 files changed, -6/+6)
| | | | | | | | | | | | | | | | package. Also fixed packages of Flume and MQTT tests.
| | * | Refactored kafka, flume, zeromq, mqtt as separate external projects, with … (Tathagata Das, 2013-12-30; 2 files changed, -6/+9)
| | | | | | | | | | | | | | | | their own self-contained scala API, java API, scala unit tests and java unit tests. Updated examples to use the external projects.
| * | | Issue #318 : minor style updates per review from Reynold Xin (Sean Owen, 2014-01-07; 10 files changed, -33/+2)
| | | |
| * | | Merge remote-tracking branch 'upstream/master' (Sean Owen, 2014-01-06; 14 files changed, -15/+19)
| |\ \ \ | | | |/ | | |/|
| | * | Removing SPARK_EXAMPLES_JAR in the code (Patrick Wendell, 2014-01-05; 14 files changed, -14/+18)
| | | |
| | * | run-example -> bin/run-example (Prashant Sharma, 2014-01-02; 1 file changed, -1/+1)
| | |/
| * / Suggested small changes to Java code for slightly more standard style, … (Sean Owen, 2014-01-02; 14 files changed, -83/+164)
| |/ | | | | | | encapsulation and in some cases performance
| * Fixed job name in the java streaming example. (azuryyu, 2013-12-24; 1 file changed, -1/+1)
| |
* | Changed the way StreamingContext finds and reads checkpoint files, and added … (Tathagata Das, 2014-01-09; 1 file changed, -4/+3)
|/ | | | JavaStreamingContext.getOrCreate.
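A hedged sketch of the getOrCreate entry point added here: the factory is only invoked when no usable checkpoint exists under the given directory. The checkpoint path, class names, and batch interval below are invented, and the factory interface is assumed to be the one that shipped alongside this change.

```java
import org.apache.spark.streaming.Duration;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.api.java.JavaStreamingContextFactory;

public class GetOrCreateSketch {
  public static void main(String[] args) throws Exception {
    final String checkpointDir = "/tmp/sketch-checkpoint";
    JavaStreamingContextFactory factory = new JavaStreamingContextFactory() {
      public JavaStreamingContext create() {
        JavaStreamingContext jssc =
            new JavaStreamingContext("local[2]", "GetOrCreateSketch", new Duration(1000));
        jssc.checkpoint(checkpointDir);
        // ... define the DStream pipeline here ...
        return jssc;
      }
    };
    // Recover from the checkpoint if one exists, otherwise build a fresh context.
    JavaStreamingContext jssc = JavaStreamingContext.getOrCreate(checkpointDir, factory);
    jssc.start();
    jssc.awaitTermination();
  }
}
```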
* Merge branch 'master' into scala-2.10 (Raymond Liu, 2013-11-13; 1 file changed, -0/+98)
|\
| * Upgrade Kafka 0.7.2 to Kafka 0.8.0-beta1 for Spark Streaming (jerryshao, 2013-10-12; 1 file changed, -0/+98)
| |
* | fixed some warnings (Martin Weindel, 2013-10-05; 4 files changed, -5/+3)
|/
* Initial work to rename package to org.apache.spark (Matei Zaharia, 2013-09-01; 13 files changed, -84/+84)
|
* Merge pull request #762 from shivaram/sgd-cleanup (Evan Sparks, 2013-08-11; 1 file changed, -0/+85)
|\ | | | | Refactor SGD options into a new class.
| * Add setters for optimizer, gradient in SGD. (Shivaram Venkataraman, 2013-08-08; 1 file changed, -1/+1)
| | | | | | | | Also remove java-specific constructor for LabeledPoint.
| * Merge branch 'master' of git://github.com/mesos/spark into sgd-cleanup (Shivaram Venkataraman, 2013-08-06; 1 file changed, -0/+116)
| |\ | | | | | | | | | | | | Conflicts: mllib/src/main/scala/spark/mllib/util/MLUtils.scala
| * | Refactor GLM algorithms and add Java tests (Shivaram Venkataraman, 2013-08-06; 1 file changed, -0/+85)
| | | | | | | | | | | | | | | | | | | | | | | | | | | This change adds Java examples and unit tests for all GLM algorithms to make sure the MLLib interface works from Java. Changes include - Introduce LabeledPoint and avoid using Doubles in train arguments - Rename train to run in class methods - Make the optimizer a member variable of GLM to make sure the builder pattern works
* | | Fixed path to JavaALS.java and JavaKMeans.java, fixed hadoop2-yarn profile (Alexander Pivovarov, 2013-08-10; 2 files changed, -0/+0)
| | |
* | | Merge pull request #786 from shivaram/mllib-java (Matei Zaharia, 2013-08-09; 2 files changed, -0/+168)
|\ \ \ | | | | | | | | Java fixes, tests and examples for ALS, KMeans
| * | | Remove Java-specific constructor for Rating. (Shivaram Venkataraman, 2013-08-08; 1 file changed, -3/+3)
| | | | | | | | | | | | | | | | | | | | The scala constructor works for native type java types. Modify examples to match this.
| * | | Java examples, tests for KMeans and ALS (Shivaram Venkataraman, 2013-08-06; 2 files changed, -0/+168)
| | |/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | - Changes ALS to accept RDD[Rating] instead of (Int, Int, Double) making it easier to call from Java - Renames class methods from `train` to `run` to enable static methods to be called from Java. - Add unit tests which check if both static / class methods can be called. - Also add examples which port the main() function in ALS, KMeans to the examples project. Couple of minor changes to existing code: - Add a toJavaRDD method in RDD to convert scala RDD to java RDD easily - Workaround a bug where using double[] from Java leads to class cast exception in KMeans init
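The RDD[Rating]-based ALS entry point described in the first bullet looks roughly like this from Java. A hedged sketch with a tiny invented dataset and arbitrary rank, iteration, and lambda values, written against the later org.apache.spark package names rather than the pre-rename ones in use at the time.

```java
import java.util.Arrays;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.mllib.recommendation.ALS;
import org.apache.spark.mllib.recommendation.MatrixFactorizationModel;
import org.apache.spark.mllib.recommendation.Rating;

public class AlsSketch {
  public static void main(String[] args) {
    JavaSparkContext sc = new JavaSparkContext("local", "AlsSketch");
    JavaRDD<Rating> ratings = sc.parallelize(Arrays.asList(
        new Rating(1, 1, 5.0),
        new Rating(1, 2, 1.0),
        new Rating(2, 1, 4.0),
        new Rating(2, 2, 1.0)));
    // The static train method takes an RDD of Rating; .rdd() unwraps the JavaRDD.
    MatrixFactorizationModel model = ALS.train(ratings.rdd(), 10, 20, 0.01);
    System.out.println("predicted(2, 2) = " + model.predict(2, 2));
    sc.stop();
  }
}
```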
* / | Optimize JavaPageRank to use reduceByKey instead of groupByKey (Matei Zaharia, 2013-08-08; 1 file changed, -9/+8)
|/ /
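The point of that optimization: summing rank contributions with reduceByKey combines partial sums on the map side, whereas groupByKey shuffles every individual contribution before summing. A hedged fragment of what the summing step looks like; the variable and class names are invented, not the actual JavaPageRank code.

```java
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.function.Function;
import org.apache.spark.api.java.function.Function2;

public class PageRankStepSketch {
  // contribs maps url -> one rank contribution per incoming link.
  static JavaPairRDD<String, Double> nextRanks(JavaPairRDD<String, Double> contribs) {
    return contribs
        .reduceByKey(new Function2<Double, Double, Double>() {
          public Double call(Double a, Double b) { return a + b; }  // partial sums combine early
        })
        .mapValues(new Function<Double, Double>() {
          public Double call(Double sum) { return 0.15 + 0.85 * sum; }
        });
  }
}
```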
* | Got rid of unnecessary map functions (stayhf, 2013-08-06; 1 file changed, -6/+2)
| |
* | changes as reviewer requested (stayhf, 2013-08-06; 1 file changed, -10/+1)
| |
* | Updated code with reviewer's suggestions (stayhf, 2013-08-05; 1 file changed, -47/+47)
| |