| author | Patrick Wendell <patrick@databricks.com> | 2015-04-29 00:35:08 -0700 |
|---|---|---|
| committer | Reynold Xin <rxin@databricks.com> | 2015-04-29 00:35:08 -0700 |
| commit | 1fd6ed9a56ac4671f4a3d25a42823ba3bf01f60f (patch) | |
| tree | 22c720bcc9e24ffbdc6724bd5489225c4e9c3643 /python/pyspark | |
| parent | fe917f5ec9be8c8424416f7b5423ddb4318e03a0 (diff) | |
| download | spark-1fd6ed9a56ac4671f4a3d25a42823ba3bf01f60f.tar.gz spark-1fd6ed9a56ac4671f4a3d25a42823ba3bf01f60f.tar.bz2 spark-1fd6ed9a56ac4671f4a3d25a42823ba3bf01f60f.zip |
[SPARK-7204] [SQL] Fix callSite for Dataframe and SQL operations
This patch adds SQL to the set of excluded libraries when
generating a callSite. This makes the callSite mechanism work
properly for the DataFrame API. I also added a small improvement for
JDBC queries, where we simply use the string "Spark JDBC Server Query"
instead of trying to report a call site that makes no sense
to the user.
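The idea behind the fix can be sketched in plain Python: when computing a call site, walk up the stack and skip frames belonging to "internal" packages, so the reported location points at user code rather than framework plumbing. This is an illustrative sketch, not Spark's actual implementation; the prefix list, `first_user_frame` helper, and fallback label are all hypothetical, with the JDBC fallback string taken from the description above.

```python
# Illustrative sketch of call-site filtering (not Spark's real code).
# Frames from excluded package prefixes are skipped; if every frame is
# internal (as for JDBC server queries), a fixed label is used instead.

EXCLUDED_PREFIXES = (
    "org.apache.spark.rdd.",
    "org.apache.spark.sql.",  # what this patch effectively adds
)

def first_user_frame(frames, excluded=EXCLUDED_PREFIXES):
    """Return the first frame not matching an excluded prefix."""
    for name in frames:
        if not name.startswith(excluded):
            return name
    return "Spark JDBC Server Query"  # fallback when no user frame exists

stack = [
    "org.apache.spark.sql.DataFrame.count",
    "org.apache.spark.rdd.RDD.collect",
    "com.example.app.Main.run",
]
print(first_user_frame(stack))  # -> com.example.app.Main.run
```

Without the SQL prefix in the excluded set, the first frame (`DataFrame.count`) would be reported as the call site, which is exactly the confusing behavior shown in the "Before" screenshot below.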
Before (DF):
![screen shot 2015-04-28 at 1 29 26 pm](https://cloud.githubusercontent.com/assets/320616/7380170/ef63bfb0-edae-11e4-989c-f88a5ba6bbee.png)
After (DF):
![screen shot 2015-04-28 at 1 34 58 pm](https://cloud.githubusercontent.com/assets/320616/7380181/fa7f6d90-edae-11e4-9559-26f163ed63b8.png)
After (JDBC):
![screen shot 2015-04-28 at 2 00 10 pm](https://cloud.githubusercontent.com/assets/320616/7380185/02f5b2a4-edaf-11e4-8e5b-99bdc3df66dd.png)
Author: Patrick Wendell <patrick@databricks.com>
Closes #5757 from pwendell/dataframes and squashes the following commits:
0d931a4 [Patrick Wendell] Attempting to fix PySpark tests
85bf740 [Patrick Wendell] [SPARK-7204] Fix callsite for dataframe operations.
Diffstat (limited to 'python/pyspark')
-rw-r--r-- | python/pyspark/sql/dataframe.py | 3 |
1 file changed, 2 insertions(+), 1 deletion(-)
```diff
diff --git a/python/pyspark/sql/dataframe.py b/python/pyspark/sql/dataframe.py
index 4759f5fe78..6879fe0805 100644
--- a/python/pyspark/sql/dataframe.py
+++ b/python/pyspark/sql/dataframe.py
@@ -237,7 +237,8 @@ class DataFrame(object):
         :param extended: boolean, default ``False``. If ``False``, prints only the physical plan.

         >>> df.explain()
-        PhysicalRDD [age#0,name#1], MapPartitionsRDD[...] at mapPartitions at SQLContext.scala:...
+        PhysicalRDD [age#0,name#1], MapPartitionsRDD[...] at applySchemaToPythonRDD at\
+        NativeMethodAccessorImpl.java:...

         >>> df.explain(True)
         == Parsed Logical Plan ==
```