author | Burak <brkyvz@gmail.com> | 2014-07-21 17:03:40 -0700 |
---|---|---|
committer | Xiangrui Meng <meng@databricks.com> | 2014-07-21 17:03:40 -0700 |
commit | a4d60208ec7995146541451849c51670cdc56451 (patch) | |
tree | 97bb3b039136994ca210ade8f0436f1923a294b9 /examples/src/main/python | |
parent | abeacffb7bcdfa3eeb1e969aa546029a7b464eaa (diff) | |
[SPARK-2434][MLlib]: Warning messages that point users to original MLlib implementations added to Examples
[SPARK-2434][MLlib]: Added warning messages, in both the comments and the code, that refer users to the original MLlib implementations of some popular example machine learning algorithms. The following examples have been modified:
Scala:
* LocalALS
* LocalFileLR
* LocalKMeans
* LocalLP
* SparkALS
* SparkHdfsLR
* SparkKMeans
* SparkLR
Python:
* kmeans.py
* als.py
* logistic_regression.py
Author: Burak <brkyvz@gmail.com>
Closes #1515 from brkyvz/SPARK-2434 and squashes the following commits:
7505da9 [Burak] [SPARK-2434][MLlib]: Warning messages added, scalastyle errors fixed, and added missing punctuation
b96b522 [Burak] [SPARK-2434][MLlib]: Warning messages added and scalastyle errors fixed
4762f39 [Burak] [SPARK-2434]: Warning messages added
17d3d83 [Burak] SPARK-2434: Added warning messages to the naive implementations of the example algorithms
2cb5301 [Burak] SPARK-2434: Warning messages redirecting to original implementations added.
Diffstat (limited to 'examples/src/main/python')
-rwxr-xr-x | examples/src/main/python/als.py | 9 |
-rwxr-xr-x | examples/src/main/python/kmeans.py | 6 |
-rwxr-xr-x | examples/src/main/python/logistic_regression.py | 6 |
3 files changed, 21 insertions, 0 deletions
diff --git a/examples/src/main/python/als.py b/examples/src/main/python/als.py
index 1a7c4c51f4..c862650b0a 100755
--- a/examples/src/main/python/als.py
+++ b/examples/src/main/python/als.py
@@ -16,6 +16,9 @@
 #
 
 """
+This is an example implementation of ALS for learning how to use Spark. Please refer to
+ALS in pyspark.mllib.recommendation for more conventional use.
+
 This example requires numpy (http://www.numpy.org/)
 """
 from os.path import realpath
@@ -49,9 +52,15 @@ def update(i, vec, mat, ratings):
 
 
 if __name__ == "__main__":
+
     """
     Usage: als [M] [U] [F] [iterations] [slices]"
     """
+
+    print >> sys.stderr, """WARN: This is a naive implementation of ALS and is given as an
+      example. Please use the ALS method found in pyspark.mllib.recommendation for more
+      conventional use."""
+
     sc = SparkContext(appName="PythonALS")
     M = int(sys.argv[1]) if len(sys.argv) > 1 else 100
     U = int(sys.argv[2]) if len(sys.argv) > 2 else 500
diff --git a/examples/src/main/python/kmeans.py b/examples/src/main/python/kmeans.py
index 988fc45baf..036bdf4c4f 100755
--- a/examples/src/main/python/kmeans.py
+++ b/examples/src/main/python/kmeans.py
@@ -45,9 +45,15 @@ def closestPoint(p, centers):
 
 
 if __name__ == "__main__":
+
     if len(sys.argv) != 4:
         print >> sys.stderr, "Usage: kmeans <file> <k> <convergeDist>"
         exit(-1)
+
+    print >> sys.stderr, """WARN: This is a naive implementation of KMeans Clustering and is given
+      as an example! Please refer to examples/src/main/python/mllib/kmeans.py for an example on
+      how to use MLlib's KMeans implementation."""
+
     sc = SparkContext(appName="PythonKMeans")
     lines = sc.textFile(sys.argv[1])
     data = lines.map(parseVector).cache()
diff --git a/examples/src/main/python/logistic_regression.py b/examples/src/main/python/logistic_regression.py
index 6c33deabfd..8456b272f9 100755
--- a/examples/src/main/python/logistic_regression.py
+++ b/examples/src/main/python/logistic_regression.py
@@ -47,9 +47,15 @@ def readPointBatch(iterator):
     return [matrix]
 
 if __name__ == "__main__":
+
     if len(sys.argv) != 3:
         print >> sys.stderr, "Usage: logistic_regression <file> <iterations>"
         exit(-1)
+
+    print >> sys.stderr, """WARN: This is a naive implementation of Logistic Regression and is
+      given as an example! Please refer to examples/src/main/python/mllib/logistic_regression.py
+      to see how MLlib's implementation is used."""
+
     sc = SparkContext(appName="PythonLR")
     points = sc.textFile(sys.argv[1]).mapPartitions(readPointBatch).cache()
     iterations = int(sys.argv[2])
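
The three Python examples repeat the same pattern: write a WARN message to stderr before creating the SparkContext. The sketch below shows one way that pattern could be factored into a shared helper. It is not part of this commit: `warn_naive` and its parameters are hypothetical names, the message wording is a paraphrase of the warnings in the diff, and it uses the Python 3 `print(..., file=...)` form rather than the Python 2 `print >> sys.stderr` statement the examples use.

```python
import sys


def warn_naive(algorithm, pointer, stream=None):
    """Write a warning that this script is only a teaching example.

    `algorithm` is the display name (e.g. "ALS"); `pointer` names where the
    MLlib-backed alternative lives. Both are supplied by the caller.
    """
    # Resolve stderr at call time so a redirected sys.stderr is respected.
    stream = sys.stderr if stream is None else stream
    message = (
        "WARN: This is a naive implementation of %s and is given as an example! "
        "Please refer to %s for more conventional use." % (algorithm, pointer)
    )
    print(message, file=stream)
    return message
```

An example script would call it once at the top of its `__main__` block, for instance `warn_naive("ALS", "ALS in pyspark.mllib.recommendation")`. Writing to stderr keeps the warning out of any results the script prints to stdout.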