author | Shuai Lin <linshuai2012@gmail.com> | 2016-12-07 06:09:27 +0800 |
---|---|---|
committer | Sean Owen <sowen@cloudera.com> | 2016-12-07 06:09:27 +0800 |
commit | bd9a4a5ac3abcc48131d1249df55e7d68266343a (patch) | |
tree | f0e912b499d92c696b7eb829209fb56da35d6059 /pom.xml | |
parent | eeed38eaf8c6912f3c51ba83903b67835a699f86 (diff) | |
[SPARK-18652][PYTHON] Include the example data and third-party licenses in pyspark package.
## What changes were proposed in this pull request?
Since we already include the Python examples in the pyspark package, we should include the example data they read as well.
We should also include the third-party licenses, since we distribute the corresponding jars with the pyspark package.
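As a rough illustration of how extra files like these can be shipped with a setuptools-based package, here is a minimal sketch. It is not taken from the patch: the `deps/data` and `deps/licenses` staging directories, the glob patterns, and the helper name `extra_setup_kwargs` are all assumptions made for the example.

```python
# Sketch only: the "deps/..." paths and the glob patterns are assumptions,
# not values confirmed by this commit message.
def extra_setup_kwargs():
    """Return setuptools keyword arguments that ship data and license files."""
    return {
        # Treat the two directories as sub-packages of pyspark.
        "packages": ["pyspark.data", "pyspark.licenses"],
        # Map each sub-package to the directory that holds its files.
        "package_dir": {
            "pyspark.data": "deps/data",
            "pyspark.licenses": "deps/licenses",
        },
        # Non-Python files to install with each sub-package. These globs
        # match only files directly inside the package directory; nested
        # subdirectories would need their own patterns or MANIFEST.in rules.
        "package_data": {
            "pyspark.data": ["*.txt", "*.data"],
            "pyspark.licenses": ["*.txt"],
        },
    }
```

In a real `setup.py` these entries would be merged into the existing `setup(...)` call rather than passed on their own.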
## How was this patch tested?
Manually tested with Python 2.7 and Python 3.4:
```sh
$ ./build/mvn -DskipTests -Phive -Phive-thriftserver -Pyarn -Pmesos clean package
$ cd python
$ python setup.py sdist
$ pip install dist/pyspark-2.1.0.dev0.tar.gz
$ ls -1 /usr/local/lib/python2.7/dist-packages/pyspark/data/
graphx
mllib
streaming
$ du -sh /usr/local/lib/python2.7/dist-packages/pyspark/data/
600K /usr/local/lib/python2.7/dist-packages/pyspark/data/
$ ls -1 /usr/local/lib/python2.7/dist-packages/pyspark/licenses/|head -5
LICENSE-AnchorJS.txt
LICENSE-DPark.txt
LICENSE-Mockito.txt
LICENSE-SnapTree.txt
LICENSE-antlr.txt
```
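The manual `ls` checks above could also be scripted. The following is a minimal sketch, assuming Python 3.4 or later (it relies on `importlib.util.find_spec`); the helper name `missing_package_dirs` is hypothetical and not part of the patch.

```python
import os


def missing_package_dirs(package_root, expected=("data", "licenses")):
    """Return the expected subdirectories that are absent or empty."""
    missing = []
    for name in expected:
        path = os.path.join(package_root, name)
        # A directory that exists but contains nothing counts as missing.
        if not os.path.isdir(path) or not os.listdir(path):
            missing.append(name)
    return missing


if __name__ == "__main__":
    import importlib.util

    # Locate the installed package without importing it.
    spec = importlib.util.find_spec("pyspark")
    if spec is None or not spec.submodule_search_locations:
        print("pyspark is not installed in this environment")
    else:
        root = list(spec.submodule_search_locations)[0]
        print(root, "missing:", missing_package_dirs(root))
```

An empty "missing" list means both the `data` and `licenses` directories were installed and are non-empty.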
Author: Shuai Lin <linshuai2012@gmail.com>
Closes #16082 from lins05/include-data-in-pyspark-dist.