path: root/mllib
Commit message (Author, Age, Files, Lines)
* Merge pull request #786 from shivaram/mllib-java (Matei Zaharia, 2013-08-09, 6 files, -30/+285)
|\
| |     Java fixes, tests and examples for ALS, KMeans
| * Remove Java-specific constructor for Rating. (Shivaram Venkataraman, 2013-08-08, 3 files, -12/+3)
| |     The scala constructor works for native type java types. Modify examples to match this.
| * Add a test case for random initialization. (Shivaram Venkataraman, 2013-08-06, 2 files, -2/+13)
| |     Also workaround a bug where double[][] class cast fails
| * Java examples, tests for KMeans and ALS (Shivaram Venkataraman, 2013-08-06, 6 files, -27/+280)
| |     - Changes ALS to accept RDD[Rating] instead of (Int, Int, Double) making it easier to call from Java
| |     - Renames class methods from `train` to `run` to enable static methods to be called from Java.
| |     - Add unit tests which check if both static / class methods can be called.
| |     - Also add examples which port the main() function in ALS, KMeans to the examples project.
| |
| |     Couple of minor changes to existing code:
| |     - Add a toJavaRDD method in RDD to convert scala RDD to java RDD easily
| |     - Workaround a bug where using double[] from Java leads to class cast exception in KMeans init
* | Fixed a typo in mllib inline documentation. (Reynold Xin, 2013-08-08, 1 file, -1/+1)
|/
* fixing formatting, style, and input (Ginger Smith, 2013-08-05, 1 file, -36/+37)
|
* fixing formatting (Ginger Smith, 2013-08-05, 1 file, -16/+23)
|
* adding matrix factorization data generator (Ginger Smith, 2013-08-02, 1 file, -0/+105)
|
* Increase Kryo buffer size in ALS since some arrays become big (Matei Zaharia, 2013-08-02, 1 file, -0/+1)
|
* Merge pull request #761 from mateiz/kmeans-generator (shivaram, 2013-07-31, 2 files, -4/+85)
|\
| |     Add data generator for K-means
| * Turn on caching in KMeans.main (Matei Zaharia, 2013-07-31, 1 file, -1/+1)
| |
| * Added data generator for K-means (Matei Zaharia, 2013-07-31, 2 files, -3/+84)
| |     Also made it possible to specify the number of runs in KMeans.main().
* | Merge pull request #753 from shivaram/glm-refactor (Matei Zaharia, 2013-07-31, 1 file, -0/+165)
|\ \
| | |     Build changes for ML lib
| * | Add bagel, mllib to SBT assembly. (Shivaram Venkataraman, 2013-07-30, 1 file, -0/+165)
| | |     Also add jblas dependency to mllib pom.xml
* | | Use the Char version of split() instead of the String one for efficiency (Matei Zaharia, 2013-07-31, 1 file, -2/+2)
| |/
|/|
* | Minor style cleanup of mllib. (Reynold Xin, 2013-07-30, 5 files, -35/+39)
| |
* | Use a tigher bound in logistic regression unit test's prediction validation. (Reynold Xin, 2013-07-30, 1 file, -3/+4)
| |
* | Renamed Classification.scala to ClassificationModel.scala and Regression.scala to RegressionModel.scala (Reynold Xin, 2013-07-30, 2 files, -0/+0)
|/
* made SimpleUpdater consistent with other updaters (Ameet Talwalkar, 2013-07-29, 1 file, -1/+2)
|
* Clarify how regVal is computed in Updater docs (Shivaram Venkataraman, 2013-07-29, 1 file, -8/+9)
|
* Remove duplicate loss history and clarify why. (Shivaram Venkataraman, 2013-07-29, 3 files, -13/+9)
|     Also some minor style fixes.
* Style fix (Xinghao, 2013-07-29, 2 files, -2/+4)
|     Lines shortened to < 100 characters
* Fix validatePrediction functions for Classification models (Xinghao, 2013-07-29, 2 files, -4/+2)
|     Classifiers return categorical (Int) values that should be compared directly
* Deleting extra LogisticRegressionGenerator and RidgeRegressionGenerator (Xinghao, 2013-07-29, 2 files, -96/+0)
|
* Fix rounding error in LogisticRegression.scala (Xinghao, 2013-07-29, 1 file, -2/+4)
|
* Replace map-reduce with dot operator using DoubleMatrix (Xinghao, 2013-07-28, 4 files, -8/+18)
|
* Fixed SVM and LR train functions to take Int instead of Double for Classification (Xinghao, 2013-07-28, 3 files, -22/+21)
|
* Changed Classification to return Int instead of Double (Xinghao, 2013-07-28, 7 files, -30/+28)
|     Also minor changes to formatting and comments
* SVMSuite and LassoSuite rewritten to follow closely with LogisticRegressionSuite (Xinghao, 2013-07-28, 2 files, -35/+161)
|
* Move data generators to util (Xinghao, 2013-07-28, 2 files, -0/+0)
|
* Change *_LocalRandomSGD to *LocalRandomSGD (Xinghao, 2013-07-28, 6 files, -41/+24)
|
* Resolve conflicts with master, removed regParam for LogisticRegression (Xinghao, 2013-07-26, 6 files, -64/+412)
|
* New files from merge with master (Xinghao, 2013-07-26, 15 files, -7/+399)
|\
| * Use a different validation dataset for Logistic Regression prediction testing. (Reynold Xin, 2013-07-23, 1 file, -12/+17)
| |
| * Made RegressionModel serializable and added unit tests to make sure predict methods would work. (Reynold Xin, 2013-07-23, 6 files, -16/+42)
| |
| * Merge pull request #711 from shivaram/ml-generators (Matei Zaharia, 2013-07-19, 2 files, -39/+75)
| |\
| | |     Move ML lib data generator files to util/
| | * Rename classes to be called DataGenerator (Shivaram Venkataraman, 2013-07-18, 2 files, -3/+2)
| | |
| | * Refactor data generators to have a function that can be used in tests. (Shivaram Venkataraman, 2013-07-18, 2 files, -34/+71)
| | |
| | * Move ML lib data generator files to util/ (Shivaram Venkataraman, 2013-07-17, 2 files, -2/+2)
| | |
| * | Return Array[Double] from SGD instead of DoubleMatrix (Shivaram Venkataraman, 2013-07-17, 2 files, -6/+4)
| | |
| * | Change weights to be Array[Double] in LR model. (Shivaram Venkataraman, 2013-07-17, 3 files, -11/+15)
| | |     Also ensure weights are initialized to a column vector.
| * | Rename loss -> stochasticLoss and add a note to explain why we have multiple train methods. (Shivaram Venkataraman, 2013-07-17, 3 files, -8/+13)
| | |
| * | Allow initial weight vectors in LogisticRegression. (Shivaram Venkataraman, 2013-07-17, 5 files, -32/+106)
| |/
| |     Also move LogisticGradient to the LogisticRegression file and fix the unit tests log path.
| * Add Apache license headers and LICENSE and NOTICE files (Matei Zaharia, 2013-07-16, 19 files, -1/+324)
| |
* | Making ClassificationModel serializable (Xinghao, 2013-07-26, 1 file, -1/+1)
| |
* | Rename LogisticRegression, SVM and Lasso to *_LocalRandomSGD (Xinghao, 2013-07-26, 6 files, -18/+18)
| |
* | Multiple changes (Xinghao, 2013-07-26, 4 files, -8/+9)
| |     - Changed LogisticRegression regularization parameter to 0
| |     - Removed println from SVM predict function
| |     - Fixed "Lasso" -> "SVM" in SVMGenerator
| |     - Added comment in Updater.scala to indicate L1 regularization leads to soft thresholding proximal function
* | Adding SVM and Lasso, moving LogisticRegression to classification from regression (Xinghao, 2013-07-24, 13 files, -18/+642)
|/
|     Also, add regularization parameter to SGD
* Shuffle ratings in a more efficient way at start of ALS (Matei Zaharia, 2013-07-15, 1 file, -4/+14)
|
* Make number of blocks in ALS configurable and lower the default (Matei Zaharia, 2013-07-15, 1 file, -4/+5)
|