spark - Mirror of Apache Spark

	Commit message	Author	Age	Files	Lines
*	Merge pull request #380 from mateiz/py-bayes	Patrick Wendell	2014-01-13	10	-13/+169
\|\	Add Naive Bayes to Python MLlib, and some API fixes
\| \|	- Added a Python wrapper for Naive Bayes
\| \|	- Updated the Scala Naive Bayes to match the style of our other algorithms better and in particular make it easier to call from Java (added builder pattern, removed default value in train method)
\| \|	- Updated Python MLlib functions to not require a SparkContext; we can get that from the RDD the user gives
\| \|	- Added a toString method in LabeledPoint
\| \|	- Made the Python MLlib tests run as part of run-tests as well (before they could only be run individually through each file)
\| *	Added Java unit test, data, and main method for Naive Bayes	Matei Zaharia	2014-01-11	8	-4/+111
\| \|	Also fixes mains of a few other algorithms to print the final model
\| *	Add Naive Bayes to Python MLlib, and some API fixes	Matei Zaharia	2014-01-11	3	-9/+58
\| \|	- Added a Python wrapper for Naive Bayes
\| \|	- Updated the Scala Naive Bayes to match the style of our other algorithms better and in particular make it easier to call from Java (added builder pattern, removed default value in train method)
\| \|	- Updated Python MLlib functions to not require a SparkContext; we can get that from the RDD the user gives
\| \|	- Added a toString method in LabeledPoint
\| \|	- Made the Python MLlib tests run as part of run-tests as well (before they could only be run individually through each file)
* \|	Merge branch 'master' into remove_simpleredundantreturn_scala	Henry Saputra	2014-01-12	1	-7/+8
\|\\|
\| *	Fix configure didn't work small problem in ALS	jerryshao	2014-01-11	1	-7/+8
\| \|
* \|	Remove simple redundant return statement for Scala methods/functions:	Henry Saputra	2014-01-12	1	-15/+14
\|/	-) Only change simple return statements at the end of method
\|	-) Ignore the complex if-else check
\|	-) Ignore the ones inside synchronized
*	Merge branch 'master' into MatrixFactorizationModel-fix	Hossein Falaki	2014-01-07	4	-2/+345
\|\
\| *	Added GradientDescentSuite	Xusen Yin	2014-01-06	1	-0/+116
\| \|
\| *	fix logistic loss bug	Xusen Yin	2014-01-06	1	-2/+2
\| \|
\| *	Merge pull request #292 from soulmachine/naive-bayes	Reynold Xin	2014-01-04	2	-0/+227
\| \|\	standard Naive Bayes classifier
\| \| \|	Has implemented the standard Naive Bayes classifier. This is an updated version of #288, which is closed because of misoperations.
\| \| *	Aggregated all sample points to driver without any shuffle	Lian, Cheng	2014-01-02	2	-53/+31
\| \| \|
\| \| *	Response to comments from Reynold, Ameet and Evan	Lian, Cheng	2013-12-30	2	-62/+90
\| \| \|	* Arguments renamed according to Ameet's suggestion
\| \| \|	* Using DoubleMatrix instead of Array[Double] in computation
\| \| \|	* Removed arguments C (kinds of label) and D (dimension of feature vector) from NaiveBayes.train()
\| \| \|	* Replaced reduceByKey with foldByKey to avoid modifying original input data
\| \| *	Response to Reynold's comments	Lian, Cheng	2013-12-29	1	-10/+16
\| \| \|
\| \| *	Added Apache license header to NaiveBayesSuite	Lian, Cheng	2013-12-27	1	-0/+17
\| \| \|
\| \| *	Reformatted some lines commented by Matei	Lian, Cheng	2013-12-27	1	-2/+3
\| \| \|
\| \| *	Let reduceByKey to take care of local combine	Lian, Cheng	2013-12-25	1	-27/+16
\| \| \|	Also refactored some heavy FP code to improve readability and reduce memory footprint.
\| \| *	Refactored NaiveBayes	Lian, Cheng	2013-12-25	2	-28/+41
\| \| \|	* Minimized shuffle output with mapPartitions.
\| \| \|	* Reduced RDD actions from 3 to 1.
\| \| *	standard Naive Bayes classifier	Frank Dai	2013-12-25	2	-0/+195
\| \| \|
* \| \|	Added Rating deserializer	Hossein Falaki	2014-01-06	1	-1/+8
\| \| \|
* \| \|	Added serializing method for Rating object	Hossein Falaki	2014-01-06	1	-4/+16
\| \| \|
* \| \|	Added python binding for bulk recommendation	Hossein Falaki	2014-01-04	2	-1/+27
\| \| \|
* \| \|	Removed unnecessary blank line	Hossein Falaki	2014-01-03	1	-1/+0
\| \| \|
* \| \|	Added unit tests for bulk prediction in MatrixFactorizationModel	Hossein Falaki	2014-01-03	1	-2/+31
\| \| \|
* \| \|	Added a method to enable bulk prediction	Hossein Falaki	2014-01-03	1	-1/+23
\|/ /
* \|	Merge remote-tracking branch 'origin/master' into conf2	Matei Zaharia	2013-12-29	1	-0/+232
\|\ \	Conflicts:
\| \| \|	core/src/main/scala/org/apache/spark/SparkContext.scala
\| \| \|	core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala
\| \| \|	core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala
\| \| \|	core/src/main/scala/org/apache/spark/scheduler/cluster/ClusterTaskSetManager.scala
\| \| \|	core/src/main/scala/org/apache/spark/scheduler/local/LocalScheduler.scala
\| \| \|	core/src/main/scala/org/apache/spark/util/MetadataCleaner.scala
\| \| \|	core/src/test/scala/org/apache/spark/scheduler/TaskResultGetterSuite.scala
\| \| \|	core/src/test/scala/org/apache/spark/scheduler/TaskSetManagerSuite.scala
\| \| \|	new-yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala
\| \| \|	streaming/src/main/scala/org/apache/spark/streaming/Checkpoint.scala
\| \| \|	streaming/src/main/scala/org/apache/spark/streaming/api/java/JavaStreamingContext.scala
\| \| \|	streaming/src/main/scala/org/apache/spark/streaming/scheduler/JobGenerator.scala
\| \| \|	streaming/src/test/scala/org/apache/spark/streaming/BasicOperationsSuite.scala
\| \| \|	streaming/src/test/scala/org/apache/spark/streaming/CheckpointSuite.scala
\| \| \|	streaming/src/test/scala/org/apache/spark/streaming/InputStreamsSuite.scala
\| \| \|	streaming/src/test/scala/org/apache/spark/streaming/TestSuiteBase.scala
\| \| \|	streaming/src/test/scala/org/apache/spark/streaming/WindowOperationsSuite.scala
\| * \|	Scala stubs for updated Python bindings.	Tor Myklebust	2013-12-25	1	-13/+13
\| \| \|
\| * \|	Move PythonMLLibAPI into its own package.	Tor Myklebust	2013-12-24	1	-0/+1
\| \| \|
\| * \|	Fix error message ugliness.	Tor Myklebust	2013-12-24	1	-2/+2
\| \| \|
\| * \|	Java stubs for ALSModel.	Tor Myklebust	2013-12-21	1	-0/+34
\| \| \|
\| * \|	Javadocs; also, declare some things private.	Tor Myklebust	2013-12-20	1	-5/+26
\| \| \|
\| * \|	Licence notice.	Tor Myklebust	2013-12-20	1	-0/+17
\| \| \|
\| * \|	Scala classification and clustering stubs; matrix serialization/deserialization.	Tor Myklebust	2013-12-20	1	-3/+79
\| \| \|
\| * \|	Bindings for linear, Lasso, and ridge regression.	Tor Myklebust	2013-12-19	1	-5/+37
\| \| \|
\| * \|	Un-semicolon PythonMLLibAPI.	Tor Myklebust	2013-12-19	1	-27/+27
\| \| \|
\| * \|	First cut at python mllib bindings. Only LinearRegression is supported.	Tor Myklebust	2013-12-19	1	-0/+51
\| \|/
* \|	Various fixes to configuration code	Matei Zaharia	2013-12-28	1	-5/+5
\| \|	- Got rid of global SparkContext.globalConf
\| \|	- Pass SparkConf to serializers and compression codecs
\| \|	- Made SparkConf public instead of private[spark]
\| \|	- Improved API of SparkContext and SparkConf
\| \|	- Switched executor environment vars to be passed through SparkConf
\| \|	- Fixed some places that were still using system properties
\| \|	- Fixed some tests, though others are still failing
\| \|	This still fails several tests in core, repl and streaming, likely due to properties not being set or cleared correctly (some of the tests run fine in isolation).
* \|	spark-544, introducing SparkConf and related configuration overhaul.	Prashant Sharma	2013-12-25	1	-7/+6
\|/
*	Use scala.binary.version in POMs	Mark Hamstra	2013-12-15	1	-5/+5
\|
*	Style fixes and addressed review comments at #221	Prashant Sharma	2013-12-10	1	-5/+5
\|
*	Incorporated Patrick's feedback comment on #211 and made maven build/dep-resolution atleast a bit faster.	Prashant Sharma	2013-12-07	1	-1/+1
\|
*	Merge branch 'master' into scala-2.10-wip	Prashant Sharma	2013-11-25	1	-5/+6
\|\	Conflicts:
\| \|	core/src/main/scala/org/apache/spark/rdd/RDD.scala
\| \|	project/SparkBuild.scala
\| *	Make XORShiftRandom explicit in KMeans and roll it back for RDD	Marek Kolodziej	2013-11-20	1	-4/+4
\| \|
\| *	Updates to reflect pull request code review	Marek Kolodziej	2013-11-18	1	-2/+3
\| \|
\| *	XORShift RNG with unit tests and benchmark	Marek Kolodziej	2013-11-18	1	-1/+1
\| \|	To run unit test, start SBT console and type:
\| \|	compile
\| \|	test-only org.apache.spark.util.XORShiftRandomSuite
\| \|	To run benchmark, type:
\| \|	project core
\| \|	console
\| \|	Once the Scala console starts, type:
\| \|	org.apache.spark.util.XORShiftRandom.benchmark(100000000)
* \|	Merge branch 'master' of github.com:apache/incubator-spark into scala-2.10	Prashant Sharma	2013-10-10	3	-59/+300
\|\\|
\| *	Merge branch 'master' into implicit-als	Nick Pentreath	2013-10-07	1	-4/+4
\| \|\
\| * \|	Bumping up test matrix size to eliminate random failures	Nick Pentreath	2013-10-07	2	-12/+12
\| \| \|
\| * \|	Style fix using 'if' rather than 'match' on boolean	Nick Pentreath	2013-10-04	1	-14/+13
\| \| \|
\| * \|	Fixing closing brace indentation	Nick Pentreath	2013-10-04	1	-1/+1
\| \| \|
\| * \|	Reverting to using comma-delimited split	Nick Pentreath	2013-10-04	1	-1/+1
\| \| \|