author: Sean Owen <sowen@cloudera.com> 2014-07-13 19:27:43 -0700
committer: Xiangrui Meng <meng@databricks.com> 2014-07-13 19:27:43 -0700
commit: 635888cbed0e3f4127252fb84db449f0cc9ed659 (patch)
tree: 43433e3393c889f25a8ef4898099664a1a5ce0a7 /docs/mllib-collaborative-filtering.md
parent: 4c8be64e768fe71643b37f1e82f619c8aeac6eff (diff)
SPARK-2363. Clean MLlib's sample data files
(Just made a PR for this; mengxr was the reporter of the following:)

MLlib has sample data under several folders:

1) data/mllib
2) data/
3) mllib/data/*

Per previous discussion with Matei Zaharia, we want to put them under `data/mllib` and clean outdated files.

Author: Sean Owen <sowen@cloudera.com>

Closes #1394 from srowen/SPARK-2363 and squashes the following commits:

54313dd [Sean Owen] Move ML example data from /mllib/data/ and /data/ into /data/mllib/
Diffstat (limited to 'docs/mllib-collaborative-filtering.md')
-rw-r--r-- docs/mllib-collaborative-filtering.md | 4
1 files changed, 2 insertions, 2 deletions
diff --git a/docs/mllib-collaborative-filtering.md b/docs/mllib-collaborative-filtering.md
index d51002f015..5cd7173872 100644
--- a/docs/mllib-collaborative-filtering.md
+++ b/docs/mllib-collaborative-filtering.md
@@ -58,7 +58,7 @@ import org.apache.spark.mllib.recommendation.ALS
import org.apache.spark.mllib.recommendation.Rating
// Load and parse the data
-val data = sc.textFile("mllib/data/als/test.data")
+val data = sc.textFile("data/mllib/als/test.data")
val ratings = data.map(_.split(',') match { case Array(user, item, rate) =>
Rating(user.toInt, item.toInt, rate.toDouble)
})
@@ -112,7 +112,7 @@ from pyspark.mllib.recommendation import ALS
from numpy import array
# Load and parse the data
-data = sc.textFile("mllib/data/als/test.data")
+data = sc.textFile("data/mllib/als/test.data")
ratings = data.map(lambda line: array([float(x) for x in line.split(',')]))
# Build the recommendation model using Alternating Least Squares
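
For readers checking that the relocated file still parses, here is a minimal sketch in plain Python (no Spark required) of the per-line parsing that both snippets in the diff perform. It assumes each line has the form `user,item,rating`, which is inferred from the Scala pattern match above. The `Rating` tuple, the `parse_rating` helper, and the sample lines are hypothetical illustrations, not part of MLlib or of `data/mllib/als/test.data`.

```python
from typing import NamedTuple


class Rating(NamedTuple):
    """Hypothetical stand-in for MLlib's Rating; not the real class."""
    user: int
    item: int
    rating: float


def parse_rating(line: str) -> Rating:
    """Parse one 'user,item,rating' line (format assumed from the diff)."""
    user, item, rate = line.strip().split(",")
    return Rating(int(user), int(item), float(rate))


# Made-up sample lines in the assumed format, not the contents of test.data.
sample_lines = ["1,1,5.0", "1,2,1.0", "2,1,5.0"]
ratings = [parse_rating(line) for line in sample_lines]
```

In the documented examples the same per-line function would be passed to `data.map(...)` after loading the file from its new location with `sc.textFile("data/mllib/als/test.data")`.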