author: Zheng RuiFeng <ruifengz@foxmail.com> 2016-05-11 12:49:41 +0200
committer: Nick Pentreath <nickp@za.ibm.com> 2016-05-11 12:49:41 +0200
commit: d88afabdfa83be47f36d833105aadd6b818ceeee
tree: 224f8a6fc5268e30bf200ab30f4e20f5c0164b34 /data
parent: fafc95af79fa34f82964a86407c2ee046eda3814
[SPARK-15150][EXAMPLE][DOC] Update LDA examples
## What changes were proposed in this pull request?
1. Create a libsvm-type dataset for LDA: `data/mllib/sample_lda_libsvm_data.txt`
2. Add a Python example
3. Read the data file directly in the examples
4. Also switch to `SparkSession` in `aft_survival_regression.py`

## How was this patch tested?
Manual tests: `./bin/spark-submit examples/src/main/python/ml/lda_example.py`

Author: Zheng RuiFeng <ruifengz@foxmail.com>

Closes #12927 from zhengruifeng/lda_pe.
Diffstat (limited to 'data')
-rw-r--r-- data/mllib/sample_lda_libsvm_data.txt | 12
1 file changed, 12 insertions, 0 deletions
diff --git a/data/mllib/sample_lda_libsvm_data.txt b/data/mllib/sample_lda_libsvm_data.txt
new file mode 100644
index 0000000000..bf118d7d5b
--- /dev/null
+++ b/data/mllib/sample_lda_libsvm_data.txt
@@ -0,0 +1,12 @@
+0 1:1 2:2 3:6 4:0 5:2 6:3 7:1 8:1 9:0 10:0 11:3
+1 1:1 2:3 3:0 4:1 5:3 6:0 7:0 8:2 9:0 10:0 11:1
+2 1:1 2:4 3:1 4:0 5:0 6:4 7:9 8:0 9:1 10:2 11:0
+3 1:2 2:1 3:0 4:3 5:0 6:0 7:5 8:0 9:2 10:3 11:9
+4 1:3 2:1 3:1 4:9 5:3 6:0 7:2 8:0 9:0 10:1 11:3
+5 1:4 2:2 3:0 4:3 5:4 6:5 7:1 8:1 9:1 10:4 11:0
+6 1:2 2:1 3:0 4:3 5:0 6:0 7:5 8:0 9:2 10:2 11:9
+7 1:1 2:1 3:1 4:9 5:2 6:1 7:2 8:0 9:0 10:1 11:3
+8 1:4 2:4 3:0 4:3 5:4 6:2 7:1 8:3 9:0 10:0 11:0
+9 1:2 2:8 3:2 4:0 5:3 6:0 7:2 8:0 9:2 10:7 11:2
+10 1:1 2:1 3:1 4:9 5:0 6:2 7:2 8:0 9:0 10:3 11:3
+11 1:4 2:1 3:0 4:0 5:4 6:5 7:1 8:3 9:0 10:1 11:0
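
Each line of the new file follows the LIBSVM text layout: a label followed by `index:value` pairs with 1-based indices. The following is a minimal sketch of how one such line maps to a dense vector, written in plain Python without Spark. The helper name `parse_libsvm_line` is hypothetical, and reading the label as a document id and the values as per-term counts over an 11-term vocabulary is an assumption about how the LDA examples use this data, not something the commit states.

```python
from typing import List, Tuple


def parse_libsvm_line(line: str, num_features: int) -> Tuple[float, List[float]]:
    """Parse one 'label idx:val idx:val ...' line into (label, dense vector).

    Hypothetical helper for illustration only; it is not part of Spark.
    """
    parts = line.split()
    label = float(parts[0])
    dense = [0.0] * num_features
    for item in parts[1:]:
        idx, val = item.split(":")
        # LIBSVM feature indices are 1-based, so shift to 0-based positions.
        dense[int(idx) - 1] = float(val)
    return label, dense


# First row of data/mllib/sample_lda_libsvm_data.txt, without the diff '+' prefix.
sample = "0 1:1 2:2 3:6 4:0 5:2 6:3 7:1 8:1 9:0 10:0 11:3"
label, counts = parse_libsvm_line(sample, num_features=11)
print(label, counts)
```

Explicit zero entries such as `4:0` are kept in the file even though the format would allow them to be omitted; the parser above treats both cases the same way.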