path: root/mllib
Commit message  Author  Age  Files  Lines
* Merge branch 'master' of git://github.com/mesos/spark into scala-2.10  Prashant Sharma  2013-09-15  3  -8/+331
|\
| |   Conflicts:
| |     core/src/main/scala/org/apache/spark/SparkContext.scala
| |     project/SparkBuild.scala
| * Small tweaks to MLlib docs  Matei Zaharia  2013-09-08  2  -8/+9
| |
| * respose to PR comments  Ameet Talwalkar  2013-09-08  1  -0/+322
| |
* | Merged with master  Prashant Sharma  2013-09-06  54  -909/+3988
|\|
| * Add missing license headers found with RAT  Matei Zaharia  2013-09-02  2  -0/+34
| |
| * Move some classes to more appropriate packages:  Matei Zaharia  2013-09-01  21  -26/+43
| |
| |   * RDD, *RDDFunctions -> org.apache.spark.rdd
| |   * Utils, ClosureCleaner, SizeEstimator -> org.apache.spark.util
| |   * JavaSerializer, KryoSerializer -> org.apache.spark.serializer
| * Fix some URLs  Matei Zaharia  2013-09-01  1  -1/+1
| |
| * Initial work to rename package to org.apache.spark  Matei Zaharia  2013-09-01  40  -136/+136
| |
| * Fix broken build by removing addIntercept  Shivaram Venkataraman  2013-08-30  2  -11/+8
| |
| * Merge pull request #863 from shivaram/etrain-ridge  Evan Sparks  2013-08-29  12  -297/+755
| |\
| | |   Adding linear regression and refactoring Ridge regression to use SGD
| | * Center & scale variables in Ridge, Lasso.  Shivaram Venkataraman  2013-08-25  10  -228/+347
| | |
| | |   Also add a unit test that checks if ridge regression lowers cross-validation error.
| | * Fixing typos in Java tests, and addressing alignment issues.  Evan Sparks  2013-08-18  4  -16/+16
| | |
| | * Centralizing linear data generator and mllib regression tests to use it.  Evan Sparks  2013-08-18  9  -282/+84
| | |
| | * Adding Linear Regression, and refactoring Ridge Regression.  Evan Sparks  2013-08-18  8  -176/+713
| | |
| * | Merge pull request #819 from shivaram/sgd-cleanup  Evan Sparks  2013-08-29  8  -48/+160
| |\ \
| | | |   Change SVM to use {0,1} labels
| | * | Add an option to turn off data validation, test it.  Shivaram Venkataraman  2013-08-25  5  -18/+28
| | | |
| | | |   Also moves addIntercept to have default true to make it similar to validateData option
| | * | Specify label format in LogisticRegression.  Shivaram Venkataraman  2013-08-13  1  -0/+6
| | | |
| | * | Fix SVM model and unit test to work with {0,1}.  Shivaram Venkataraman  2013-08-13  5  -12/+18
| | | |
| | | |   Also rename validateFuncs to validators.
| | * | Change SVM to use {0,1} labels.  Shivaram Venkataraman  2013-08-13  7  -26/+116
| | |/
| | |   Also add a data validation check to make sure classification labels are always 0 or 1 and add an appropriate test case.
| * | Fix code style and a nondeterministic RDD issue in ALS  Matei Zaharia  2013-08-22  1  -11/+20
| | |
| * | Merge pull request #814 from holdenk/master  Matei Zaharia  2013-08-22  1  -7/+14
| |\ \
| | | |   Create less instances of the random class during ALS initialization.
| | * | Fix  Holden Karau  2013-08-15  1  -2/+2
| | | |
| | * | Code review feedback :)  Holden Karau  2013-08-12  1  -7/+7
| | | |
| | * | Use less instances of the random class during ALS setup  Holden Karau  2013-08-12  1  -7/+14
| | |/
| * | Remove redundant dependencies from POMs  Jey Kottalam  2013-08-18  1  -4/+0
| | |
| * | Maven build now also works with YARN  Jey Kottalam  2013-08-16  1  -40/+0
| | |
| * | Don't mark hadoop-client as 'provided'  Jey Kottalam  2013-08-16  1  -1/+0
| | |
| * | Maven build now works with CDH hadoop-2.0.0-mr1  Jey Kottalam  2013-08-16  1  -27/+0
| | |
| * | Initial changes to make Maven build agnostic of hadoop version  Jey Kottalam  2013-08-16  1  -33/+10
| |/
| * Merge pull request #812 from shivaram/maven-mllib-tests  Matei Zaharia  2013-08-12  6  -7/+32
| |\
| | |   Create SparkContext in beforeAll for MLLib tests
| | * Create SparkContext in beforeAll for MLLib tests  Shivaram Venkataraman  2013-08-11  6  -7/+32
| | |
| | |   This overcomes test failures that occur using Maven
| * | Clean up scaladoc in ML Lib.  Shivaram Venkataraman  2013-08-11  17  -60/+171
| |/
| |   Also build and copy ML Lib scaladoc in Spark docs build.
| |   Some more minor cleanup with respect to naming, test locations etc.
| * Merge pull request #762 from shivaram/sgd-cleanup  Evan Sparks  2013-08-11  21  -417/+816
| |\
| | |   Refactor SGD options into a new class.
| | * Fix GLM code review comments and move java tests  Shivaram Venkataraman  2013-08-10  4  -6/+2
| | |
| | * Add setters for optimizer, gradient in SGD.  Shivaram Venkataraman  2013-08-08  2  -8/+19
| | |
| | |   Also remove java-specific constructor for LabeledPoint.
| | * Merge branch 'master' of git://github.com/mesos/spark into sgd-cleanup  Shivaram Venkataraman  2013-08-06  6  -6/+366
| | |\
| | | |   Conflicts:
| | | |     mllib/src/main/scala/spark/mllib/util/MLUtils.scala
| | * | Refactor GLM algorithms and add Java tests  Shivaram Venkataraman  2013-08-06  20  -169/+540
| | | |
| | | |   This change adds Java examples and unit tests for all GLM algorithms
| | | |   to make sure the MLLib interface works from Java. Changes include
| | | |   - Introduce LabeledPoint and avoid using Doubles in train arguments
| | | |   - Rename train to run in class methods
| | | |   - Make the optimizer a member variable of GLM to make sure the builder pattern works
| | * | Move implicit arg to constructor for Java access.  Shivaram Venkataraman  2013-08-03  1  -4/+7
| | | |
| | * | Refactor optimizers and create GLMs  Shivaram Venkataraman  2013-08-02  10  -286/+320
| | | |
| | | |   This change refactors the structure of GLMs to use mixins which maintain
| | | |   a similar interface to other ML lib algorithms. This change also creates
| | | |   an Optimizer trait which allows GLMs to be extended to use other
| | | |   optimization techniques.
| | * | Refactor SGD options into a new class.  Shivaram Venkataraman  2013-07-31  8  -159/+143
| | | |
| | | |   This refactoring pulls out code shared between SVM, Lasso, LR into
| | | |   a common GradientDescentOpts class. Some style cleanup as well
| * | | Merge pull request #786 from shivaram/mllib-java  Matei Zaharia  2013-08-09  6  -30/+285
| |\ \ \
| | | | |   Java fixes, tests and examples for ALS, KMeans
| | * | | Remove Java-specific constructor for Rating.  Shivaram Venkataraman  2013-08-08  3  -12/+3
| | | | |
| | | | |   The scala constructor works for native type java types. Modify examples to match this.
| | * | | Add a test case for random initialization.  Shivaram Venkataraman  2013-08-06  2  -2/+13
| | | | |
| | | | |   Also workaround a bug where double[][] class cast fails
| | * | | Java examples, tests for KMeans and ALS  Shivaram Venkataraman  2013-08-06  6  -27/+280
| | | |/
| | |/|
| | | |   - Changes ALS to accept RDD[Rating] instead of (Int, Int, Double) making it easier to call from Java
| | | |   - Renames class methods from `train` to `run` to enable static methods to be called from Java.
| | | |   - Add unit tests which check if both static / class methods can be called.
| | | |   - Also add examples which port the main() function in ALS, KMeans to the examples project.
| | | |
| | | |   Couple of minor changes to existing code:
| | | |   - Add a toJavaRDD method in RDD to convert scala RDD to java RDD easily
| | | |   - Workaround a bug where using double[] from Java leads to class cast exception in KMeans init
| * / | Fixed a typo in mllib inline documentation.  Reynold Xin  2013-08-08  1  -1/+1
| |/ /
| * | fixing formatting, style, and input  Ginger Smith  2013-08-05  1  -36/+37
| | |
| * | fixing formatting  Ginger Smith  2013-08-05  1  -16/+23
| | |
| * | adding matrix factorization data generator  Ginger Smith  2013-08-02  1  -0/+105
| | |
| * | Increase Kryo buffer size in ALS since some arrays become big  Matei Zaharia  2013-08-02  1  -0/+1
| | |
| * | Merge pull request #761 from mateiz/kmeans-generator  shivaram  2013-07-31  2  -4/+85
| |\ \
| | | |   Add data generator for K-means