Commit message · Author · Age · Files · Lines
* New minor edits · Andrew Or · 2013-12-26 · 3 · -54/+49
|
* Minor cleanup for Scala style · Aaron Davidson · 2013-12-26 · 3 · -55/+55
|
* Add toggle for ExternalAppendOnlyMap in Aggregator and CoGroupedRDD · Andrew Or · 2013-12-26 · 3 · -24/+65
|
* Provide for cases when mergeCombiners is not specified in ExternalAppendOnlyMap · Andrew Or · 2013-12-26 · 2 · -68/+121
|
* Refactor ExternalAppendOnlyMap to take in KVC instead of just KV · Andrew Or · 2013-12-26 · 3 · -76/+78
|
* Working ExternalAppendOnlyMap for both CoGroupedRDDs and Aggregator · Andrew Or · 2013-12-26 · 4 · -66/+61
|
* Working ExternalAppendOnlyMap for Aggregator, but not for CoGroupedRDD · Andrew Or · 2013-12-26 · 4 · -21/+182
|
* Merge pull request #295 from markhamstra/JobProgressListenerNPE · Matei Zaharia · 2013-12-26 · 1 · -6/+3
|\
| |     Avoid a lump of coal (NPE) in JobProgressListener's stocking.
| * Avoid a lump of coal (NPE) in JobProgressListener's stocking. · Mark Hamstra · 2013-12-25 · 1 · -6/+3
| |
* | Merge pull request #296 from witgo/master · Matei Zaharia · 2013-12-26 · 8 · -17/+22
|\ \
| | |     Renamed ClusterScheduler to TaskSchedulerImpl for yarn and new-yarn package
| * | fix this import order · liguoqiang · 2013-12-26 · 2 · -2/+2
| | |
| * | Renamed ClusterScheduler to TaskSchedulerImpl for yarn and new-yarn · liguoqiang · 2013-12-26 · 2 · -4/+2
| | |
| * | Renamed ClusterScheduler to TaskSchedulerImpl for yarn and new-yarn · liguoqiang · 2013-12-26 · 8 · -15/+22
| |/
* | Merge pull request #283 from tmyklebu/master · Matei Zaharia · 2013-12-26 · 9 · -1/+830
|\ \
| |/
|/|
| |     Python bindings for mllib
| |
| |     This pull request contains Python bindings for the regression, clustering, classification, and recommendation tools in mllib.
| |
| |     For each 'train' frontend exposed, there is a Scala stub in PythonMLLibAPI.scala and a Python stub in mllib.py. The Python stub serialises the input RDD and any vector/matrix arguments into a mutually-understood format and calls the Scala stub. The Scala stub deserialises the RDD and the vector/matrix arguments, calls the appropriate 'train' function, serialises the resulting model, and returns the serialised model.
| |
| |     ALSModel is slightly different since a MatrixFactorizationModel has RDDs inside. The Scala stub returns a handle to a Scala MatrixFactorizationModel; prediction is done by calling the Scala predict method.
| |
| |     I have tested these bindings on an x86_64 machine running Linux. There is a risk that these bindings may fail on some choose-your-own-endian platform if Python's endian differs from java.nio.ByteBuffer's idea of the native byte order.
| * Remove commented code in __init__.py. · Tor Myklebust · 2013-12-25 · 1 · -8/+0
| |
| * Fix copypasta in __init__.py. Don't import anything directly into pyspark.mllib. · Tor Myklebust · 2013-12-25 · 1 · -26/+8
| |
| * Initial weights in Scala are ones; do that too. Also fix some errors. · Tor Myklebust · 2013-12-25 · 1 · -6/+6
| |
| * Scala stubs for updated Python bindings. · Tor Myklebust · 2013-12-25 · 1 · -13/+13
| |
| * Split the mllib bindings into a whole bunch of modules and rename some things. · Tor Myklebust · 2013-12-25 · 7 · -183/+409
| |
| * Remove useless line from test stub. · Tor Myklebust · 2013-12-24 · 1 · -1/+0
| |
| * Python change for move of PythonMLLibAPI. · Tor Myklebust · 2013-12-24 · 1 · -1/+1
| |
| * Move PythonMLLibAPI into its own package. · Tor Myklebust · 2013-12-24 · 1 · -0/+1
| |
| * Fix error message ugliness. · Tor Myklebust · 2013-12-24 · 1 · -2/+2
| |
| * Release JVM reference to the ALSModel when done. · Tor Myklebust · 2013-12-22 · 1 · -2/+2
| |
| * Java stubs for ALSModel. · Tor Myklebust · 2013-12-21 · 1 · -0/+34
| |
| * Python stubs for ALSModel. · Tor Myklebust · 2013-12-21 · 2 · -8/+56
| |
| * Javadocs; also, declare some things private. · Tor Myklebust · 2013-12-20 · 1 · -5/+26
| |
| * Un-semicolon mllib.py. · Tor Myklebust · 2013-12-20 · 1 · -11/+11
| |
| * Change some docstrings and add some others. · Tor Myklebust · 2013-12-20 · 1 · -1/+3
| |
| * Licence notice. · Tor Myklebust · 2013-12-20 · 2 · -0/+34
| |
| * Whitespace. · Tor Myklebust · 2013-12-20 · 1 · -1/+1
| |
| * Remove gigantic endian-specific test and exception tests. · Tor Myklebust · 2013-12-20 · 1 · -38/+3
| |
| * Tests for the Python side of the mllib bindings. · Tor Myklebust · 2013-12-20 · 1 · -52/+172
| |
| * Python stubs for classification and clustering. · Tor Myklebust · 2013-12-20 · 2 · -16/+96
| |
| * Scala classification and clustering stubs; matrix serialization/deserialization. · Tor Myklebust · 2013-12-20 · 1 · -3/+79
| |
| * Python side of python bindings for linear, Lasso, and ridge regression · Tor Myklebust · 2013-12-19 · 2 · -15/+72
| |
| * Bindings for linear, Lasso, and ridge regression. · Tor Myklebust · 2013-12-19 · 1 · -5/+37
| |
| * Un-semicolon PythonMLLibAPI. · Tor Myklebust · 2013-12-19 · 1 · -27/+27
| |
| * Incorporate most of Josh's style suggestions. I don't want to deal with the type and length checking errors until we've got at least one working stub that we're all happy with. · Tor Myklebust · 2013-12-19 · 2 · -98/+91
| |
| * The rest of the Python side of those bindings. · Tor Myklebust · 2013-12-19 · 3 · -2/+4
| |
| * First cut at python mllib bindings. Only LinearRegression is supported. · Tor Myklebust · 2013-12-19 · 2 · -0/+165
| |
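The endianness caveat in the #283 merge message above can be illustrated with a minimal Python sketch. The wire format used here (a 4-byte length followed by 8-byte floats) and the names `serialize_vector`, `deserialize_vector`, and `misread_length` are hypothetical, chosen only for illustration; they are not the format or functions used by PythonMLLibAPI. The sketch pins the byte order explicitly instead of relying on the platform's native order, which is one possible way to avoid the mismatch the message warns about.

```python
import struct

# Hypothetical wire format (an assumption for illustration only):
#   int32 element count, followed by that many float64 values.
# "<" forces little-endian with no padding, independent of the host CPU.


def serialize_vector(values, byte_order="<"):
    """Pack a sequence of floats using an explicit byte order."""
    return struct.pack(f"{byte_order}i{len(values)}d", len(values), *values)


def deserialize_vector(data, byte_order="<"):
    """Unpack a vector; the byte order must match the one used to pack it."""
    (count,) = struct.unpack_from(f"{byte_order}i", data, 0)
    return list(struct.unpack_from(f"{byte_order}{count}d", data, 4))


def misread_length(count):
    """Show what a big-endian reader sees for a little-endian int32 count."""
    return struct.unpack(">i", struct.pack("<i", count))[0]


if __name__ == "__main__":
    payload = serialize_vector([1.0, -2.5, 3.25])
    print(deserialize_vector(payload))
    print(misread_length(3))
```

If both sides agree on the byte order, the round trip is exact. If they disagree, even the length prefix is misread, so the reader fails before it reaches the vector entries.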
* | Merge pull request #290 from ash211/patch-3 · Matei Zaharia · 2013-12-25 · 1 · -1/+1
|\ \
| | |     Typo: avaiable -> available
| * | Typo: avaiable -> available · Andrew Ash · 2013-12-24 · 1 · -1/+1
| | |
* | | Merge pull request #287 from azuryyu/master · Reynold Xin · 2013-12-25 · 2 · -2/+4
|\ \ \
| | | |     Fixed job name in the java streaming example.
| * | | Make App report interval configurable during 'run on Yarn' · azuryyu · 2013-12-24 · 1 · -1/+3
| | | |
| * | | Fixed job name in the java streaming example. · azuryyu · 2013-12-24 · 1 · -1/+1
| | | |
* | | | Merge pull request #127 from kayousterhout/consolidate_schedulers · Patrick Wendell · 2013-12-24 · 26 · -1518/+947
|\ \ \ \
| |_|/ /
|/| | |
| | | |     Deduplicate Local and Cluster schedulers.
| | | |
| | | |     The code in LocalScheduler/LocalTaskSetManager was nearly identical to the code in ClusterScheduler/ClusterTaskSetManager. The redundancy made updating the schedulers unnecessarily painful and error-prone. This commit combines the two into a single TaskScheduler/TaskSetManager.
| | | |
| | | |     Unfortunately the diff makes this change look much more invasive than it is -- TaskScheduler.scala is only superficially changed (names updated, overrides removed) from the old ClusterScheduler.scala, and the same with TaskSetManager.scala.
| | | |
| | | |     Thanks @rxin for suggesting this change!
| * | | Responded to Reynold's style comments · Kay Ousterhout · 2013-12-24 · 3 · -6/+7
| | | |
| * | | Correctly merged in maxTaskFailures fix · Kay Ousterhout · 2013-12-22 · 4 · -5/+5
| | | |
| * | | Fix build error in test · Kay Ousterhout · 2013-12-21 · 1 · -1/+1
| | | |