| author | Jeff Zhang <zjffdu@apache.org> | 2016-10-11 14:56:26 -0700 |
|---|---|---|
| committer | Marcelo Vanzin <vanzin@cloudera.com> | 2016-10-11 14:56:26 -0700 |
| commit | 5b77e66dd6a128c5992ab3bde418613f84be7009 | |
| tree | 2cf1ff007ab933869ba29462b1fbf76f731114f2 /sql/hive | |
| parent | 23405f324a8089f86ebcbede9bb32944137508e8 | |
[SPARK-17387][PYSPARK] Creating SparkContext() from python without spark-submit ignores user conf
## What changes were proposed in this pull request?
The root cause of SparkConf being ignored when launching the JVM is that SparkConf requires the JVM to be created first: https://github.com/apache/spark/blob/master/python/pyspark/conf.py#L106
In this PR, I defer launching the JVM until the SparkContext is created, so that the SparkConf can be passed to the JVM correctly.
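The deferred-launch idea can be sketched as follows. This is a hedged illustration, not the actual PySpark gateway code: `launch_gateway`, `get_or_create_gateway`, and the module-level `_gateway` handle are hypothetical stand-ins for the real machinery. The point is that the JVM is only started on first use, so any conf entries set beforehand can be forwarded to the launcher.

```python
# Hypothetical sketch of lazily launching the gateway (JVM) so that
# SparkConf entries set before SparkContext creation are not lost.

_gateway = None  # hypothetical module-level handle; None until first use


def launch_gateway(conf=None):
    """Pretend launcher: records the --conf args it would pass on."""
    args = []
    if conf:
        for key, value in conf.items():
            args += ["--conf", "%s=%s" % (key, value)]
    return {"args": args}


def get_or_create_gateway(conf=None):
    """Launch the gateway only when it is first required."""
    global _gateway
    if _gateway is None:
        _gateway = launch_gateway(conf)  # conf reaches the JVM launch
    return _gateway
```

With this shape, setting a conf value before creating the context means the value is present when the launcher finally runs, instead of being set after the JVM has already started with defaults.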
## How was this patch tested?
Using the example code from the description of SPARK-17387:
```
$ SPARK_HOME=$PWD PYTHONPATH=python:python/lib/py4j-0.10.3-src.zip python
Python 2.7.12 (default, Jul 1 2016, 15:12:24)
[GCC 5.4.0 20160609] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from pyspark import SparkContext
>>> from pyspark import SparkConf
>>> conf = SparkConf().set("spark.driver.memory", "4g")
>>> sc = SparkContext(conf=conf)
```
And verify that spark.driver.memory is correctly picked up in the JVM launch command:
```
...op/ -Xmx4g org.apache.spark.deploy.SparkSubmit --conf spark.driver.memory=4g pyspark-shell
```
Author: Jeff Zhang <zjffdu@apache.org>
Closes #14959 from zjffdu/SPARK-17387.