author | Lianhui Wang <lianhuiwang09@gmail.com> | 2015-05-08 08:44:46 -0500
---|---|---
committer | Thomas Graves <tgraves@apache.org> | 2015-05-08 08:44:46 -0500
commit | ebff7327af5efa9f57c605284de4fae6b050ae0f (patch)
tree | ed814cf7c903da2404a55b4c5bdcc8eb0648f518 /assembly/pom.xml
parent | c2f0821aad3b82dcd327e914c9b297e92526649d (diff)
[SPARK-6869] [PYSPARK] Add pyspark archives path to PYTHONPATH
Based on https://github.com/apache/spark/pull/5478, which provides a PYSPARK_ARCHIVES_PATH env variable. With this PR, we only need to export PYSPARK_ARCHIVES_PATH=/user/spark/pyspark.zip,/user/spark/python/lib/py4j-0.8.2.1-src.zip in conf/spark-env.sh when PySpark is not installed on each node of YARN. I ran a Python application successfully on both yarn-client and yarn-cluster with this PR.
andrewor14 sryza Sephiroth-Lin Can you take a look at this? Thanks.
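As the message above describes, the archive locations can be exported from conf/spark-env.sh when PySpark is not installed on every YARN node. A minimal sketch of that configuration, using the HDFS paths from the commit message (actual paths will differ per cluster):

```shell
# conf/spark-env.sh -- archive paths taken from the commit message above;
# adjust them to wherever pyspark.zip and the py4j zip live on your cluster.
export PYSPARK_ARCHIVES_PATH=/user/spark/pyspark.zip,/user/spark/python/lib/py4j-0.8.2.1-src.zip
```

This is a config fragment, not a full setup: the zips still have to exist at those paths (the pom.xml change in this commit is what produces pyspark.zip at build time).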
Author: Lianhui Wang <lianhuiwang09@gmail.com>
Closes #5580 from lianhuiwang/SPARK-6869 and squashes the following commits:
66ffa43 [Lianhui Wang] Update Client.scala
c2ad0f9 [Lianhui Wang] Update Client.scala
1c8f664 [Lianhui Wang] Merge remote-tracking branch 'remotes/apache/master' into SPARK-6869
008850a [Lianhui Wang] Merge remote-tracking branch 'remotes/apache/master' into SPARK-6869
f0b4ed8 [Lianhui Wang] Merge remote-tracking branch 'remotes/apache/master' into SPARK-6869
150907b [Lianhui Wang] Merge remote-tracking branch 'remotes/apache/master' into SPARK-6869
20402cd [Lianhui Wang] use ZipEntry
9d87c3f [Lianhui Wang] update scala style
e7bd971 [Lianhui Wang] address vanzin's comments
4b8a3ed [Lianhui Wang] use pyArchivesEnvOpt
e6b573b [Lianhui Wang] address vanzin's comments
f11f84a [Lianhui Wang] zip pyspark archives
5192cca [Lianhui Wang] update import path
3b1e4c8 [Lianhui Wang] address tgravescs's comments
9396346 [Lianhui Wang] put zip to make-distribution.sh
0d2baf7 [Lianhui Wang] update import paths
e0179be [Lianhui Wang] add zip pyspark archives in build or sparksubmit
31e8e06 [Lianhui Wang] update code style
9f31dac [Lianhui Wang] update code and add comments
f72987c [Lianhui Wang] add archives path to PYTHONPATH
Diffstat (limited to 'assembly/pom.xml')
-rw-r--r-- | assembly/pom.xml | 21
1 file changed, 21 insertions(+), 0 deletions(-)
diff --git a/assembly/pom.xml b/assembly/pom.xml
index 2b4d0a990b..626c8577e3 100644
--- a/assembly/pom.xml
+++ b/assembly/pom.xml
@@ -92,6 +92,27 @@
         <skip>true</skip>
       </configuration>
     </plugin>
+    <!-- zip pyspark archives to run python application on yarn mode -->
+    <plugin>
+      <groupId>org.apache.maven.plugins</groupId>
+      <artifactId>maven-antrun-plugin</artifactId>
+      <executions>
+        <execution>
+          <phase>package</phase>
+          <goals>
+            <goal>run</goal>
+          </goals>
+        </execution>
+      </executions>
+      <configuration>
+        <target>
+          <delete dir="${basedir}/../python/lib/pyspark.zip"/>
+          <zip destfile="${basedir}/../python/lib/pyspark.zip">
+            <fileset dir="${basedir}/../python/" includes="pyspark/**/*"/>
+          </zip>
+        </target>
+      </configuration>
+    </plugin>
     <!-- Use the shade plugin to create a big JAR with all the dependencies -->
     <plugin>
       <groupId>org.apache.maven.plugins</groupId>