| author | Jeff Zhang <zjffdu@apache.org> | 2016-10-11 14:56:26 -0700 |
|---|---|---|
| committer | Marcelo Vanzin <vanzin@cloudera.com> | 2016-10-11 14:56:26 -0700 |
| commit | 5b77e66dd6a128c5992ab3bde418613f84be7009 | |
| tree | 2cf1ff007ab933869ba29462b1fbf76f731114f2 /sql/hive | |
| parent | 23405f324a8089f86ebcbede9bb32944137508e8 | |
[SPARK-17387][PYSPARK] Creating SparkContext() from python without spark-submit ignores user conf
## What changes were proposed in this pull request?
The root cause of SparkConf being ignored when launching the JVM is that SparkConf requires the JVM to be created first: https://github.com/apache/spark/blob/master/python/pyspark/conf.py#L106
In this PR, I defer launching the JVM until the SparkContext is created, so that the SparkConf can be passed to the JVM correctly.
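The deferred-launch idea can be sketched as follows. This is a hedged illustration, not the actual PySpark gateway code: `launch_gateway`, `get_or_create_gateway`, and the module-level `_gateway` handle are hypothetical stand-ins for the real machinery. The point is that the JVM is only started on first use, so any conf entries set beforehand can be forwarded to the launcher.

```python
# Hypothetical sketch of lazily launching the gateway (JVM) so that
# SparkConf entries set before SparkContext creation are not lost.

_gateway = None  # hypothetical module-level handle; None until first use


def launch_gateway(conf=None):
    """Pretend launcher: records the --conf args it would pass on."""
    args = []
    if conf:
        for key, value in conf.items():
            args += ["--conf", "%s=%s" % (key, value)]
    return {"args": args}


def get_or_create_gateway(conf=None):
    """Launch the gateway only when it is first required."""
    global _gateway
    if _gateway is None:
        _gateway = launch_gateway(conf)  # conf reaches the JVM launch
    return _gateway
```

With this shape, setting a conf value before creating the context means the value is present when the launcher finally runs, instead of being set after the JVM has already started with defaults.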
## How was this patch tested?
Using the example code from the description of SPARK-17387:
```
$ SPARK_HOME=$PWD PYTHONPATH=python:python/lib/py4j-0.10.3-src.zip python
Python 2.7.12 (default, Jul 1 2016, 15:12:24)
[GCC 5.4.0 20160609] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from pyspark import SparkContext
>>> from pyspark import SparkConf
>>> conf = SparkConf().set("spark.driver.memory", "4g")
>>> sc = SparkContext(conf=conf)
```
And verify that spark.driver.memory is correctly picked up in the JVM launch command:
```
...op/ -Xmx4g org.apache.spark.deploy.SparkSubmit --conf spark.driver.memory=4g pyspark-shell
```
Author: Jeff Zhang <zjffdu@apache.org>
Closes #14959 from zjffdu/SPARK-17387.