author | Wenchen Fan <cloud0fan@outlook.com> | 2015-08-14 14:09:46 -0700 |
---|---|---|
committer | Michael Armbrust <michael@databricks.com> | 2015-08-14 14:09:46 -0700 |
commit | 1150a19b188a075166899fdb1e107b2ba1e505d8 (patch) | |
tree | b5b45d3285002e3b276d47ac5d5b40c0b11f4ff8 /sql | |
parent | 2a6590e510aba3bfc6603d280023128b3f5ac702 (diff) | |
[SPARK-8670] [SQL] Nested columns can't be referenced in pyspark
This bug is caused by an incorrect column-existence check in `__getitem__` of the pyspark DataFrame. `DataFrame.apply` accepts not only top-level column names but also nested column names like `a.b`, so we should remove that check from `__getitem__`.
Author: Wenchen Fan <cloud0fan@outlook.com>
Closes #8202 from cloud-fan/nested.
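The effect of such a check can be illustrated with a small, self-contained sketch. This is not pyspark's code: `ToyFrame`, `resolve`, and `getitem_with_check` are hypothetical names invented for illustration, and the schema is modelled as a plain nested dict. It only shows why testing a dotted name against the top-level column list rejects a valid nested reference, and why delegating to the resolver accepts it.

```python
# Toy stand-in for a DataFrame. NOT pyspark's real implementation;
# every name here is hypothetical and exists only for illustration.
class ToyFrame:
    def __init__(self, schema):
        # schema maps a column name to a type name (str),
        # or to a nested dict for a struct column.
        self.schema = schema

    @property
    def columns(self):
        # Top-level column names only.
        return list(self.schema)

    def resolve(self, name):
        """Resolve a possibly dotted name such as 'a.b' against the schema."""
        node = self.schema
        for part in name.split("."):
            if not isinstance(node, dict) or part not in node:
                raise KeyError(name)
            node = node[part]
        return name

    def getitem_with_check(self, name):
        # Models the problematic behaviour: anything that is not a
        # top-level column is rejected before resolution is attempted.
        if name not in self.columns:
            raise IndexError("no such column: %s" % name)
        return self.resolve(name)

    def __getitem__(self, name):
        # Models the behaviour without the check: let the resolver decide.
        return self.resolve(name)


df = ToyFrame({"a": {"b": "int"}, "c": "string"})

try:
    df.getitem_with_check("a.b")
    rejected_by_check = False
except IndexError:
    rejected_by_check = True

resolved = df["a.b"]
```

In this sketch the pre-check rejects `a.b` because only `a` and `c` are top-level names, while plain `df["a.b"]` resolves. A genuinely missing column such as `df["x"]` is still rejected, just by the resolver rather than by the early check.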
Diffstat (limited to 'sql')
-rw-r--r-- | sql/core/src/main/scala/org/apache/spark/sql/DataFrame.scala | 2 |
1 files changed, 2 insertions, 0 deletions
diff --git a/sql/core/src/main/scala/org/apache/spark/sql/DataFrame.scala b/sql/core/src/main/scala/org/apache/spark/sql/DataFrame.scala
index cf75e64e88..fd0ead4401 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/DataFrame.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/DataFrame.scala
@@ -634,6 +634,7 @@ class DataFrame private[sql](
   /**
    * Selects column based on the column name and return it as a [[Column]].
+   * Note that the column name can also reference to a nested column like `a.b`.
    * @group dfops
    * @since 1.3.0
    */
@@ -641,6 +642,7 @@ class DataFrame private[sql](
   /**
    * Selects column based on the column name and return it as a [[Column]].
+   * Note that the column name can also reference to a nested column like `a.b`.
    * @group dfops
    * @since 1.3.0
    */