path: root/mllib
Commit message (Author, Age, Files, Lines)
* Fix broken build by removing addIntercept (Shivaram Venkataraman, 2013-08-30, 2 files, -11/+8)
|
* Merge pull request #863 from shivaram/etrain-ridge (Evan Sparks, 2013-08-29, 12 files, -297/+755)
|\
| |     Adding linear regression and refactoring Ridge regression to use SGD
| * Center & scale variables in Ridge, Lasso. (Shivaram Venkataraman, 2013-08-25, 10 files, -228/+347)
| |     Also add a unit test that checks if ridge regression lowers cross-validation error.
| * Fixing typos in Java tests, and addressing alignment issues. (Evan Sparks, 2013-08-18, 4 files, -16/+16)
| |
| * Centralizing linear data generator and mllib regression tests to use it. (Evan Sparks, 2013-08-18, 9 files, -282/+84)
| |
| * Adding Linear Regression, and refactoring Ridge Regression. (Evan Sparks, 2013-08-18, 8 files, -176/+713)
| |
* | Merge pull request #819 from shivaram/sgd-cleanup (Evan Sparks, 2013-08-29, 8 files, -48/+160)
|\ \
| | |     Change SVM to use {0,1} labels
| * | Add an option to turn off data validation, test it. (Shivaram Venkataraman, 2013-08-25, 5 files, -18/+28)
| | |     Also moves addIntercept to have default true to make it similar to validateData option
| * | Specify label format in LogisticRegression. (Shivaram Venkataraman, 2013-08-13, 1 file, -0/+6)
| | |
| * | Fix SVM model and unit test to work with {0,1}. (Shivaram Venkataraman, 2013-08-13, 5 files, -12/+18)
| | |     Also rename validateFuncs to validators.
| * | Change SVM to use {0,1} labels. (Shivaram Venkataraman, 2013-08-13, 7 files, -26/+116)
| |/
| |     Also add a data validation check to make sure classification labels are always 0 or 1 and add an appropriate test case.
* | Fix code style and a nondeterministic RDD issue in ALS (Matei Zaharia, 2013-08-22, 1 file, -11/+20)
| |
* | Merge pull request #814 from holdenk/master (Matei Zaharia, 2013-08-22, 1 file, -7/+14)
|\ \
| | |     Create less instances of the random class during ALS initialization.
| * | Fix (Holden Karau, 2013-08-15, 1 file, -2/+2)
| | |
| * | Code review feedback :) (Holden Karau, 2013-08-12, 1 file, -7/+7)
| | |
| * | Use less instances of the random class during ALS setup (Holden Karau, 2013-08-12, 1 file, -7/+14)
| |/
* | Remove redundant dependencies from POMs (Jey Kottalam, 2013-08-18, 1 file, -4/+0)
| |
* | Maven build now also works with YARN (Jey Kottalam, 2013-08-16, 1 file, -40/+0)
| |
* | Don't mark hadoop-client as 'provided' (Jey Kottalam, 2013-08-16, 1 file, -1/+0)
| |
* | Maven build now works with CDH hadoop-2.0.0-mr1 (Jey Kottalam, 2013-08-16, 1 file, -27/+0)
| |
* | Initial changes to make Maven build agnostic of hadoop version (Jey Kottalam, 2013-08-16, 1 file, -33/+10)
|/
* Merge pull request #812 from shivaram/maven-mllib-tests (Matei Zaharia, 2013-08-12, 6 files, -7/+32)
|\
| |     Create SparkContext in beforeAll for MLLib tests
| * Create SparkContext in beforeAll for MLLib tests (Shivaram Venkataraman, 2013-08-11, 6 files, -7/+32)
| |     This overcomes test failures that occur using Maven
* | Clean up scaladoc in ML Lib. (Shivaram Venkataraman, 2013-08-11, 17 files, -60/+171)
|/
|     Also build and copy ML Lib scaladoc in Spark docs build.
|     Some more minor cleanup with respect to naming, test locations etc.
* Merge pull request #762 from shivaram/sgd-cleanup (Evan Sparks, 2013-08-11, 21 files, -417/+816)
|\
| |     Refactor SGD options into a new class.
| * Fix GLM code review comments and move java tests (Shivaram Venkataraman, 2013-08-10, 4 files, -6/+2)
| |
| * Add setters for optimizer, gradient in SGD. (Shivaram Venkataraman, 2013-08-08, 2 files, -8/+19)
| |     Also remove java-specific constructor for LabeledPoint.
| * Merge branch 'master' of git://github.com/mesos/spark into sgd-cleanup (Shivaram Venkataraman, 2013-08-06, 6 files, -6/+366)
| |\
| | |     Conflicts:
| | |         mllib/src/main/scala/spark/mllib/util/MLUtils.scala
| * | Refactor GLM algorithms and add Java tests (Shivaram Venkataraman, 2013-08-06, 20 files, -169/+540)
| | |     This change adds Java examples and unit tests for all GLM algorithms
| | |     to make sure the MLLib interface works from Java. Changes include
| | |     - Introduce LabeledPoint and avoid using Doubles in train arguments
| | |     - Rename train to run in class methods
| | |     - Make the optimizer a member variable of GLM to make sure the builder pattern works
| * | Move implicit arg to constructor for Java access. (Shivaram Venkataraman, 2013-08-03, 1 file, -4/+7)
| | |
| * | Refactor optimizers and create GLMs (Shivaram Venkataraman, 2013-08-02, 10 files, -286/+320)
| | |     This change refactors the structure of GLMs to use mixins which maintain
| | |     a similar interface to other ML lib algorithms. This change also creates
| | |     an Optimizer trait which allows GLMs to be extended to use other
| | |     optimization techniques.
| * | Refactor SGD options into a new class. (Shivaram Venkataraman, 2013-07-31, 8 files, -159/+143)
| | |     This refactoring pulls out code shared between SVM, Lasso, LR into
| | |     a common GradientDescentOpts class. Some style cleanup as well
* | | Merge pull request #786 from shivaram/mllib-java (Matei Zaharia, 2013-08-09, 6 files, -30/+285)
|\ \ \
| | | |     Java fixes, tests and examples for ALS, KMeans
| * | | Remove Java-specific constructor for Rating. (Shivaram Venkataraman, 2013-08-08, 3 files, -12/+3)
| | | |     The scala constructor works for native type java types. Modify examples
| | | |     to match this.
| * | | Add a test case for random initialization. (Shivaram Venkataraman, 2013-08-06, 2 files, -2/+13)
| | | |     Also workaround a bug where double[][] class cast fails
| * | | Java examples, tests for KMeans and ALS (Shivaram Venkataraman, 2013-08-06, 6 files, -27/+280)
| | |/
| |/|
| | |     - Changes ALS to accept RDD[Rating] instead of (Int, Int, Double) making it
| | |       easier to call from Java
| | |     - Renames class methods from `train` to `run` to enable static methods to be
| | |       called from Java.
| | |     - Add unit tests which check if both static / class methods can be called.
| | |     - Also add examples which port the main() function in ALS, KMeans to the
| | |       examples project.
| | |
| | |     Couple of minor changes to existing code:
| | |     - Add a toJavaRDD method in RDD to convert scala RDD to java RDD easily
| | |     - Workaround a bug where using double[] from Java leads to class cast exception in
| | |       KMeans init
* / | Fixed a typo in mllib inline documentation. (Reynold Xin, 2013-08-08, 1 file, -1/+1)
|/ /
* | fixing formatting, style, and input (Ginger Smith, 2013-08-05, 1 file, -36/+37)
| |
* | fixing formatting (Ginger Smith, 2013-08-05, 1 file, -16/+23)
| |
* | adding matrix factorization data generator (Ginger Smith, 2013-08-02, 1 file, -0/+105)
| |
* | Increase Kryo buffer size in ALS since some arrays become big (Matei Zaharia, 2013-08-02, 1 file, -0/+1)
| |
* | Merge pull request #761 from mateiz/kmeans-generator (shivaram, 2013-07-31, 2 files, -4/+85)
|\ \
| | |     Add data generator for K-means
| * | Turn on caching in KMeans.main (Matei Zaharia, 2013-07-31, 1 file, -1/+1)
| | |
| * | Added data generator for K-means (Matei Zaharia, 2013-07-31, 2 files, -3/+84)
| |/
| |     Also made it possible to specify the number of runs in KMeans.main().
* | Merge pull request #753 from shivaram/glm-refactor (Matei Zaharia, 2013-07-31, 1 file, -0/+165)
|\ \
| | |     Build changes for ML lib
| * | Add bagel, mllib to SBT assembly. (Shivaram Venkataraman, 2013-07-30, 1 file, -0/+165)
| | |     Also add jblas dependency to mllib pom.xml
* | | Use the Char version of split() instead of the String one for efficiency (Matei Zaharia, 2013-07-31, 1 file, -2/+2)
| |/
|/|
* | Minor style cleanup of mllib. (Reynold Xin, 2013-07-30, 5 files, -35/+39)
| |
* | Use a tigher bound in logistic regression unit test's prediction validation. (Reynold Xin, 2013-07-30, 1 file, -3/+4)
| |
* | Renamed Classification.scala to ClassificationModel.scala and Regression.scala to RegressionModel.scala (Reynold Xin, 2013-07-30, 2 files, -0/+0)
|/