author    Reynold Xin <rxin@databricks.com>  2016-05-19 21:53:26 -0700
committer Reynold Xin <rxin@databricks.com>  2016-05-19 21:53:26 -0700
commit    f2ee0ed4b7ecb2855cc4928a9613a07d45446f4e (patch)
tree      3c923b935bcf35219f158ed5a8ca34edfb7c9322 /python/pyspark/sql/context.py
parent    17591d90e6873f30a042112f56a1686726ccbd60 (diff)
[SPARK-15075][SPARK-15345][SQL] Clean up SparkSession builder and propagate config options to existing sessions if specified
## What changes were proposed in this pull request?

Currently SparkSession.Builder uses SQLContext.getOrCreate. It should probably be the other way around, i.e. all the core logic goes in SparkSession, and SQLContext just calls that. This patch does that. This patch also makes sure config options specified in the builder are propagated to the existing (and of course the new) SparkSession.

## How was this patch tested?

Updated tests to reflect the change, and also introduced a new SparkSessionBuilderSuite that should cover all the branches.

Author: Reynold Xin <rxin@databricks.com>

Closes #13200 from rxin/SPARK-15075.
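To make the config-propagation point concrete, here is a minimal sketch (not part of this patch) of the builder behavior it describes, using the public PySpark `SparkSession.builder` API; the option names `spark.some.option` and `spark.another.option` are placeholders.

```python
from pyspark.sql import SparkSession

# First builder call: no active session exists yet, so a new one is
# created with the given option.
spark1 = SparkSession.builder.config("spark.some.option", "a").getOrCreate()

# Second builder call: an active session already exists, so getOrCreate()
# returns it. After this patch, options specified here are propagated to
# that existing session instead of being silently dropped.
spark2 = SparkSession.builder.config("spark.another.option", "b").getOrCreate()

assert spark1 is spark2
assert spark2.conf.get("spark.another.option") == "b"
```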
Diffstat (limited to 'python/pyspark/sql/context.py')
-rw-r--r--  python/pyspark/sql/context.py | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/python/pyspark/sql/context.py b/python/pyspark/sql/context.py
index e8e60c6412..486733a390 100644
--- a/python/pyspark/sql/context.py
+++ b/python/pyspark/sql/context.py
@@ -34,7 +34,10 @@ __all__ = ["SQLContext", "HiveContext", "UDFRegistration"]
 class SQLContext(object):
-    """Wrapper around :class:`SparkSession`, the main entry point to Spark SQL functionality.
+    """The entry point for working with structured data (rows and columns) in Spark, in Spark 1.x.
+
+    As of Spark 2.0, this is replaced by :class:`SparkSession`. However, we are keeping the class
+    here for backward compatibility.

     A SQLContext can be used to create :class:`DataFrame`, register :class:`DataFrame` as
     tables, execute SQL over tables, cache tables, and read parquet files.
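For readers following the deprecation path described in the new docstring, the sketch below (assuming a local Spark 2.x installation) contrasts the legacy SQLContext entry point with the SparkSession one; both are standard PySpark API calls, and the sample data is illustrative only.

```python
from pyspark import SparkContext
from pyspark.sql import SQLContext, SparkSession

sc = SparkContext.getOrCreate()

# Spark 1.x entry point: still works, kept for backward compatibility.
sqlContext = SQLContext(sc)
df_old = sqlContext.createDataFrame([(1, "a")], ["id", "value"])

# Spark 2.x entry point: SparkSession subsumes SQLContext.
spark = SparkSession.builder.getOrCreate()
df_new = spark.createDataFrame([(1, "a")], ["id", "value"])

df_new.show()
```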