author: Patrick Wendell <patrick@databricks.com> 2015-04-29 00:35:08 -0700
committer: Reynold Xin <rxin@databricks.com> 2015-04-29 00:35:08 -0700
commit: 1fd6ed9a56ac4671f4a3d25a42823ba3bf01f60f
tree: 22c720bcc9e24ffbdc6724bd5489225c4e9c3643 /python/pyspark
parent: fe917f5ec9be8c8424416f7b5423ddb4318e03a0
[SPARK-7204] [SQL] Fix callSite for Dataframe and SQL operations
This patch adds SQL to the set of excluded libraries when generating a callSite. This makes the callSite mechanism work properly for the data frame API. I also added a small improvement for JDBC queries where we just use the string "Spark JDBC Server Query" instead of trying to give a callsite that doesn't make any sense to the user.

Before (DF):
![screen shot 2015-04-28 at 1 29 26 pm](https://cloud.githubusercontent.com/assets/320616/7380170/ef63bfb0-edae-11e4-989c-f88a5ba6bbee.png)

After (DF):
![screen shot 2015-04-28 at 1 34 58 pm](https://cloud.githubusercontent.com/assets/320616/7380181/fa7f6d90-edae-11e4-9559-26f163ed63b8.png)

After (JDBC):
![screen shot 2015-04-28 at 2 00 10 pm](https://cloud.githubusercontent.com/assets/320616/7380185/02f5b2a4-edaf-11e4-8e5b-99bdc3df66dd.png)

Author: Patrick Wendell <patrick@databricks.com>

Closes #5757 from pwendell/dataframes and squashes the following commits:

0d931a4 [Patrick Wendell] Attempting to fix PySpark tests
85bf740 [Patrick Wendell] [SPARK-7204] Fix callsite for dataframe operations.
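
To illustrate why excluding SQL classes changes the reported call site, here is a minimal Python sketch of the general idea: walk the stack from the innermost frame outward, skip frames that belong to excluded libraries, and report the last skipped method together with the first non-excluded frame. This is not Spark's actual implementation. The `Frame` type, the `call_site` function, the package prefixes, and the line numbers are all hypothetical and chosen only to mirror the shape of the `explain()` output in the `dataframe.py` docstring.

```python
from typing import NamedTuple, Sequence, Tuple


class Frame(NamedTuple):
    """One stack frame (hypothetical, simplified representation)."""
    class_name: str
    method: str
    file: str
    line: int


def call_site(stack: Sequence[Frame], excluded: Tuple[str, ...]) -> str:
    """Describe a call site as '<last excluded method> at <first other frame>'.

    `stack` is ordered innermost frame first. Frames whose class name starts
    with any prefix in `excluded` are treated as library internals.
    """
    last_internal = "<unknown>"
    for frame in stack:
        if frame.class_name.startswith(excluded):
            # Still inside an excluded library: remember the method and go on.
            last_internal = frame.method
        else:
            # First frame outside the excluded libraries.
            return f"{last_internal} at {frame.file}:{frame.line}"
    return last_internal


# Hypothetical stack for a DataFrame created from PySpark (innermost first).
stack = [
    Frame("org.apache.spark.rdd.RDD", "mapPartitions", "RDD.scala", 100),
    Frame("org.apache.spark.sql.SQLContext", "applySchemaToPythonRDD",
          "SQLContext.scala", 200),
    Frame("sun.reflect.NativeMethodAccessorImpl", "invoke0",
          "NativeMethodAccessorImpl.java", -2),
]

# SQL not excluded: the SQL class is reported as if it were the caller.
before = call_site(stack, ("org.apache.spark.rdd.",))
# SQL excluded as well: the SQL method becomes the reported operation.
after = call_site(stack, ("org.apache.spark.rdd.", "org.apache.spark.sql."))

print(before)  # mapPartitions at SQLContext.scala:200
print(after)   # applySchemaToPythonRDD at NativeMethodAccessorImpl.java:-2
```

Under these assumptions, when the call comes from Python the first frame outside the excluded set is a JVM reflection frame, which is consistent with the updated doctest output in the diff.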
Diffstat (limited to 'python/pyspark')
-rw-r--r-- python/pyspark/sql/dataframe.py 3
1 files changed, 2 insertions, 1 deletions
diff --git a/python/pyspark/sql/dataframe.py b/python/pyspark/sql/dataframe.py
index 4759f5fe78..6879fe0805 100644
--- a/python/pyspark/sql/dataframe.py
+++ b/python/pyspark/sql/dataframe.py
@@ -237,7 +237,8 @@ class DataFrame(object):
:param extended: boolean, default ``False``. If ``False``, prints only the physical plan.
>>> df.explain()
- PhysicalRDD [age#0,name#1], MapPartitionsRDD[...] at mapPartitions at SQLContext.scala:...
+ PhysicalRDD [age#0,name#1], MapPartitionsRDD[...] at applySchemaToPythonRDD at\
+ NativeMethodAccessorImpl.java:...
>>> df.explain(True)
== Parsed Logical Plan ==