path: root/mllib
Commit message [Author, Age, Files, Lines]
* Merge pull request #414 from soulmachine/code-style [Reynold Xin, 2014-01-15, 16 files, -48/+28]
|\
| |     Code clean up for mllib
| |
| |     * Removed unnecessary parentheses
| |     * Removed unused imports
| |     * Simplified `filter...size()` to `count ...`
| |     * Removed obsoleted parameters' comments
| |
| * Added parentheses for that getDouble() also has side effect [Frank Dai, 2014-01-14, 1 file, -1/+1]
| |
| * Merge remote-tracking branch 'upstream/master' into code-style [Frank Dai, 2014-01-14, 10 files, -12/+168]
| |\
| * | Indent two spaces [Frank Dai, 2014-01-14, 4 files, -6/+6]
| | |
| * | Since getLong() and getInt() have side effect, get back parentheses, and remove an empty line [Frank Dai, 2014-01-14, 2 files, -10/+9]
| | |
| * | Code clean up for mllib [Frank Dai, 2014-01-14, 16 files, -63/+44]
| | |
* | | Add missing header files [Patrick Wendell, 2014-01-14, 1 file, -0/+17]
| |/
|/|
* | Merge pull request #380 from mateiz/py-bayes [Patrick Wendell, 2014-01-13, 10 files, -13/+169]
|\ \
| |/
|/|
| |     Add Naive Bayes to Python MLlib, and some API fixes
| |
| |     - Added a Python wrapper for Naive Bayes
| |     - Updated the Scala Naive Bayes to match the style of our other algorithms better and in particular make it easier to call from Java (added builder pattern, removed default value in train method)
| |     - Updated Python MLlib functions to not require a SparkContext; we can get that from the RDD the user gives
| |     - Added a toString method in LabeledPoint
| |     - Made the Python MLlib tests run as part of run-tests as well (before they could only be run individually through each file)
| |
| * Added Java unit test, data, and main method for Naive Bayes [Matei Zaharia, 2014-01-11, 8 files, -4/+111]
| |
| |     Also fixes mains of a few other algorithms to print the final model
| |
| * Add Naive Bayes to Python MLlib, and some API fixes [Matei Zaharia, 2014-01-11, 3 files, -9/+58]
| |
| |     - Added a Python wrapper for Naive Bayes
| |     - Updated the Scala Naive Bayes to match the style of our other algorithms better and in particular make it easier to call from Java (added builder pattern, removed default value in train method)
| |     - Updated Python MLlib functions to not require a SparkContext; we can get that from the RDD the user gives
| |     - Added a toString method in LabeledPoint
| |     - Made the Python MLlib tests run as part of run-tests as well (before they could only be run individually through each file)
| |
* | Merge branch 'master' into remove_simpleredundantreturn_scala [Henry Saputra, 2014-01-12, 1 file, -7/+8]
|\|
| * Fix configure didn't work small problem in ALS [jerryshao, 2014-01-11, 1 file, -7/+8]
| |
* | Remove simple redundant return statement for Scala methods/functions: [Henry Saputra, 2014-01-12, 1 file, -15/+14]
|/
|
|     -) Only change simple return statements at the end of method
|     -) Ignore the complex if-else check
|     -) Ignore the ones inside synchronized
|
* Merge branch 'master' into MatrixFactorizationModel-fix [Hossein Falaki, 2014-01-07, 4 files, -2/+345]
|\
| * Added GradientDescentSuite [Xusen Yin, 2014-01-06, 1 file, -0/+116]
| |
| * fix logistic loss bug [Xusen Yin, 2014-01-06, 1 file, -2/+2]
| |
| * Merge pull request #292 from soulmachine/naive-bayes [Reynold Xin, 2014-01-04, 2 files, -0/+227]
| |\
| | |     standard Naive Bayes classifier
| | |
| | |     Has implemented the standard Naive Bayes classifier. This is an updated version of #288, which is closed because of misoperations.
| | |
| | * Aggregated all sample points to driver without any shuffle [Lian, Cheng, 2014-01-02, 2 files, -53/+31]
| | |
| | * Response to comments from Reynold, Ameet and Evan [Lian, Cheng, 2013-12-30, 2 files, -62/+90]
| | |
| | |     * Arguments renamed according to Ameet's suggestion
| | |     * Using DoubleMatrix instead of Array[Double] in computation
| | |     * Removed arguments C (kinds of label) and D (dimension of feature vector) from NaiveBayes.train()
| | |     * Replaced reduceByKey with foldByKey to avoid modifying original input data
| | |
| | * Response to Reynold's comments [Lian, Cheng, 2013-12-29, 1 file, -10/+16]
| | |
| | * Added Apache license header to NaiveBayesSuite [Lian, Cheng, 2013-12-27, 1 file, -0/+17]
| | |
| | * Reformatted some lines commented by Matei [Lian, Cheng, 2013-12-27, 1 file, -2/+3]
| | |
| | * Let reduceByKey to take care of local combine [Lian, Cheng, 2013-12-25, 1 file, -27/+16]
| | |
| | |     Also refactored some heavy FP code to improve readability and reduce memory footprint.
| | |
| | * Refactored NaiveBayes [Lian, Cheng, 2013-12-25, 2 files, -28/+41]
| | |
| | |     * Minimized shuffle output with mapPartitions.
| | |     * Reduced RDD actions from 3 to 1.
| | |
| | * standard Naive Bayes classifier [Frank Dai, 2013-12-25, 2 files, -0/+195]
| | |
* | | Added Rating deserializer [Hossein Falaki, 2014-01-06, 1 file, -1/+8]
| | |
* | | Added serializing method for Rating object [Hossein Falaki, 2014-01-06, 1 file, -4/+16]
| | |
* | | Added python binding for bulk recommendation [Hossein Falaki, 2014-01-04, 2 files, -1/+27]
| | |
* | | Removed unnecessary blank line [Hossein Falaki, 2014-01-03, 1 file, -1/+0]
| | |
* | | Added unit tests for bulk prediction in MatrixFactorizationModel [Hossein Falaki, 2014-01-03, 1 file, -2/+31]
| | |
* | | Added a method to enable bulk prediction [Hossein Falaki, 2014-01-03, 1 file, -1/+23]
|/ /
* | Merge remote-tracking branch 'origin/master' into conf2 [Matei Zaharia, 2013-12-29, 1 file, -0/+232]
|\ \
| | |     Conflicts:
| | |         core/src/main/scala/org/apache/spark/SparkContext.scala
| | |         core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala
| | |         core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala
| | |         core/src/main/scala/org/apache/spark/scheduler/cluster/ClusterTaskSetManager.scala
| | |         core/src/main/scala/org/apache/spark/scheduler/local/LocalScheduler.scala
| | |         core/src/main/scala/org/apache/spark/util/MetadataCleaner.scala
| | |         core/src/test/scala/org/apache/spark/scheduler/TaskResultGetterSuite.scala
| | |         core/src/test/scala/org/apache/spark/scheduler/TaskSetManagerSuite.scala
| | |         new-yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala
| | |         streaming/src/main/scala/org/apache/spark/streaming/Checkpoint.scala
| | |         streaming/src/main/scala/org/apache/spark/streaming/api/java/JavaStreamingContext.scala
| | |         streaming/src/main/scala/org/apache/spark/streaming/scheduler/JobGenerator.scala
| | |         streaming/src/test/scala/org/apache/spark/streaming/BasicOperationsSuite.scala
| | |         streaming/src/test/scala/org/apache/spark/streaming/CheckpointSuite.scala
| | |         streaming/src/test/scala/org/apache/spark/streaming/InputStreamsSuite.scala
| | |         streaming/src/test/scala/org/apache/spark/streaming/TestSuiteBase.scala
| | |         streaming/src/test/scala/org/apache/spark/streaming/WindowOperationsSuite.scala
| | |
| * | Scala stubs for updated Python bindings. [Tor Myklebust, 2013-12-25, 1 file, -13/+13]
| | |
| * | Move PythonMLLibAPI into its own package. [Tor Myklebust, 2013-12-24, 1 file, -0/+1]
| | |
| * | Fix error message ugliness. [Tor Myklebust, 2013-12-24, 1 file, -2/+2]
| | |
| * | Java stubs for ALSModel. [Tor Myklebust, 2013-12-21, 1 file, -0/+34]
| | |
| * | Javadocs; also, declare some things private. [Tor Myklebust, 2013-12-20, 1 file, -5/+26]
| | |
| * | Licence notice. [Tor Myklebust, 2013-12-20, 1 file, -0/+17]
| | |
| * | Scala classification and clustering stubs; matrix serialization/deserialization. [Tor Myklebust, 2013-12-20, 1 file, -3/+79]
| | |
| * | Bindings for linear, Lasso, and ridge regression. [Tor Myklebust, 2013-12-19, 1 file, -5/+37]
| | |
| * | Un-semicolon PythonMLLibAPI. [Tor Myklebust, 2013-12-19, 1 file, -27/+27]
| | |
| * | First cut at python mllib bindings. Only LinearRegression is supported. [Tor Myklebust, 2013-12-19, 1 file, -0/+51]
| |/
* | Various fixes to configuration code [Matei Zaharia, 2013-12-28, 1 file, -5/+5]
| |
| |     - Got rid of global SparkContext.globalConf
| |     - Pass SparkConf to serializers and compression codecs
| |     - Made SparkConf public instead of private[spark]
| |     - Improved API of SparkContext and SparkConf
| |     - Switched executor environment vars to be passed through SparkConf
| |     - Fixed some places that were still using system properties
| |     - Fixed some tests, though others are still failing
| |
| |     This still fails several tests in core, repl and streaming, likely due to properties not being set or cleared correctly (some of the tests run fine in isolation).
| |
* | spark-544, introducing SparkConf and related configuration overhaul. [Prashant Sharma, 2013-12-25, 1 file, -7/+6]
|/
* Use scala.binary.version in POMs [Mark Hamstra, 2013-12-15, 1 file, -5/+5]
|
* Style fixes and addressed review comments at #221 [Prashant Sharma, 2013-12-10, 1 file, -5/+5]
|
* Incorporated Patrick's feedback comment on #211 and made maven build/dep-resolution atleast a bit faster. [Prashant Sharma, 2013-12-07, 1 file, -1/+1]
|
* Merge branch 'master' into scala-2.10-wip [Prashant Sharma, 2013-11-25, 1 file, -5/+6]
|\
| |     Conflicts:
| |         core/src/main/scala/org/apache/spark/rdd/RDD.scala
| |         project/SparkBuild.scala
| |
| * Make XORShiftRandom explicit in KMeans and roll it back for RDD [Marek Kolodziej, 2013-11-20, 1 file, -4/+4]
| |
| * Updates to reflect pull request code review [Marek Kolodziej, 2013-11-18, 1 file, -2/+3]
| |