[SPARK-14945][PYTHON] SparkSession Python API

## What changes were proposed in this pull request?

```
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /__ / .__/\_,_/_/ /_/\_\   version 2.0.0-SNAPSHOT
      /_/

Using Python version 2.7.5 (default, Mar 9 2014 22:15:05)
SparkSession available as 'spark'.
>>> spark
<pyspark.sql.session.SparkSession object at 0x101f3bfd0>
>>> spark.sql("SHOW TABLES").show()
...
+---------+-----------+
|tableName|isTemporary|
+---------+-----------+
|      src|      false|
+---------+-----------+

>>> spark.range(1, 10, 2).show()
+---+
| id|
+---+
|  1|
|  3|
|  5|
|  7|
|  9|
+---+
```

**Note**: This API is NOT complete in its current state. In particular, for now I left out the `conf` and `catalog` APIs, which were added later in Scala. These will be added later before 2.0.

## How was this patch tested?

Python tests.

Author: Andrew Or <andrew@databricks.com>

Closes #12746 from andrewor14/python-spark-session.
author: Andrew Or <andrew@databricks.com> 2016-04-28 10:55:48 -0700
committer: Reynold Xin <rxin@databricks.com> 2016-04-28 10:55:48 -0700
commit: 89addd40abdacd65cc03ac8aa5f9cf3dd4a4c19b (patch)
tree: 5ecd3d9a736333c7951de6159eefef86129e3744 /python/pyspark/shell.py
parent: 5743352a28fffbfbaca2201208ce7a1d7893f813 (diff)
1 files changed, 6 insertions, 5 deletions
diff --git a/python/pyspark/shell.py b/python/pyspark/shell.py
index 7c37f75193..c6b0eda996 100644
--- a/python/pyspark/shell.py
+++ b/python/pyspark/shell.py
@@ -29,7 +29,7 @@ import py4j
 
 import pyspark
 from pyspark.context import SparkContext
-from pyspark.sql import SQLContext, HiveContext
+from pyspark.sql import SparkSession, SQLContext
 from pyspark.storagelevel import StorageLevel
 
 if os.environ.get("SPARK_EXECUTOR_URI"):
@@ -41,13 +41,14 @@ atexit.register(lambda: sc.stop())
 try:
     # Try to access HiveConf, it will raise exception if Hive is not added
     sc._jvm.org.apache.hadoop.hive.conf.HiveConf()
-    sqlContext = HiveContext(sc)
+    spark = SparkSession.withHiveSupport(sc)
 except py4j.protocol.Py4JError:
-    sqlContext = SQLContext(sc)
+    spark = SparkSession(sc)
 except TypeError:
-    sqlContext = SQLContext(sc)
+    spark = SparkSession(sc)
 
 # for compatibility
+sqlContext = spark._wrapped
 sqlCtx = sqlContext
 
 print("""Welcome to
@@ -61,7 +62,7 @@ print("Using Python version %s (%s, %s)" % (
     platform.python_version(),
     platform.python_build()[0],
     platform.python_build()[1]))
-print("SparkContext available as sc, %s available as sqlContext." % sqlContext.__class__.__name__)
+print("SparkSession available as 'spark'.")
 
 # The ./bin/pyspark script stores the old PYTHONSTARTUP value in OLD_PYTHONSTARTUP,
 # which allows us to execute the user's PYTHONSTARTUP file:
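
The fallback and compatibility-alias pattern in the hunk above can be illustrated without a Spark installation. The following is a minimal sketch, not the PySpark implementation: `Session`, `LegacyContext`, `probe_hive`, and `start_shell_session` are hypothetical stand-ins, and a plain dict stands in for the SparkContext. It only shows the control flow: try the Hive-enabled session first, fall back to a plain session on failure, then expose the wrapped legacy context under the old name.

```python
class LegacyContext:
    """Hypothetical stand-in for the old SQLContext entry point."""

    def __init__(self, sc, hive=False):
        self.sc = sc
        self.hive = hive


class Session:
    """Hypothetical stand-in for SparkSession; keeps a legacy context in _wrapped."""

    def __init__(self, sc, hive=False):
        self.sc = sc
        self._wrapped = LegacyContext(sc, hive=hive)

    @classmethod
    def with_hive_support(cls, sc):
        return cls(sc, hive=True)


def probe_hive(sc):
    # Stand-in probe (assumption): the real shell instantiates HiveConf
    # through the JVM gateway and lets the resulting exception propagate.
    if not sc.get("hive_on_classpath"):
        raise RuntimeError("Hive classes not found")


def start_shell_session(sc):
    try:
        probe_hive(sc)
        spark = Session.with_hive_support(sc)
    except (RuntimeError, TypeError):
        # Fall back to a session without Hive support.
        spark = Session(sc)
    # For compatibility: old code can keep using the wrapped legacy context.
    sql_context = spark._wrapped
    return spark, sql_context
```

With this shape, callers written against the old name keep working because `sql_context` is the same object the session holds internally, not a second, independently configured context.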