[SPARK-7212] [MLLIB] Add sequence learning flag

Support mining of ordered frequent item sequences.

Author: Feynman Liang <fliang@databricks.com>

Closes #6997 from feynmanliang/fp-sequence and squashes the following commits:

7c14e15 [Feynman Liang] Improve scalatests with R code and Seq
0d3e4b6 [Feynman Liang] Fix python test
ce987cb [Feynman Liang] Backwards compatibility aux constructor
34ef8f2 [Feynman Liang] Fix failing test due to reverse ordering
f04bd50 [Feynman Liang] Naming, add ordered to FreqItemsets, test ordering using Seq
648d4d4 [Feynman Liang] Test case for frequent item sequences
252a36a [Feynman Liang] Add sequence learning flag
author: Feynman Liang <fliang@databricks.com> 2015-06-28 22:26:07 -0700
committer: Xiangrui Meng <meng@databricks.com> 2015-06-28 22:26:07 -0700
commit: 25f574eb9a3cb9b93b7d9194a8ec16e00ce2c036 (patch)
tree: 61d923175d27bee429b50288d0b48c3d800294d4 /python/pyspark
parent: 00a9d22bd6ef42c1e7d8dd936798b449bb3a9f67 (diff)
download: spark-25f574eb9a3cb9b93b7d9194a8ec16e00ce2c036.tar.gz
spark-25f574eb9a3cb9b93b7d9194a8ec16e00ce2c036.tar.bz2
spark-25f574eb9a3cb9b93b7d9194a8ec16e00ce2c036.zip
1 files changed, 2 insertions, 2 deletions
diff --git a/python/pyspark/mllib/fpm.py b/python/pyspark/mllib/fpm.py
index bdc4a132b1..b7f00d6006 100644
--- a/python/pyspark/mllib/fpm.py
+++ b/python/pyspark/mllib/fpm.py
@@ -39,8 +39,8 @@ class FPGrowthModel(JavaModelWrapper):
     >>> data = [["a", "b", "c"], ["a", "b", "d", "e"], ["a", "c", "e"], ["a", "c", "f"]]
     >>> rdd = sc.parallelize(data, 2)
     >>> model = FPGrowth.train(rdd, 0.6, 2)
-    >>> sorted(model.freqItemsets().collect())
-    [FreqItemset(items=[u'a'], freq=4), FreqItemset(items=[u'c'], freq=3), ...
+    >>> sorted(model.freqItemsets().collect(), key=lambda x: x.items)
+    [FreqItemset(items=[u'a'], freq=4), FreqItemset(items=[u'a', u'c'], freq=3), ...
     """
 
     def freqItemsets(self):
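
The doctest change above sorts the collected itemsets on their `items` field instead of relying on the default ordering. The effect can be reproduced without a Spark cluster. The sketch below is a minimal illustration, not the PySpark implementation: the `FreqItemset` namedtuple is a hypothetical stand-in that mirrors the field names shown in the doctest, and the input order of `collected` is made up for the example.

```python
from collections import namedtuple

# Hypothetical stand-in for the FreqItemset records in the doctest;
# only the field names (items, freq) are taken from the diff.
FreqItemset = namedtuple("FreqItemset", ["items", "freq"])

# Itemsets in an arbitrary order chosen for illustration.
collected = [
    FreqItemset(items=["c"], freq=3),
    FreqItemset(items=["a", "c"], freq=3),
    FreqItemset(items=["a"], freq=4),
]

# Sort on the item list alone, so freq never takes part in the comparison.
by_items = sorted(collected, key=lambda x: x.items)

for itemset in by_items:
    print(itemset)
```

Lists compare lexicographically in Python, so `["a"]` sorts before `["a", "c"]`, which sorts before `["c"]`. That matches the first two entries of the updated expected output in the doctest.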