author Cheng Lian <lian.cs.zju@gmail.com> 2014-09-03 18:57:20 -0700
committer Michael Armbrust <michael@databricks.com> 2014-09-03 18:57:20 -0700
commit f48420fde58d554480cc8830d2f8c4d17618f283
tree bd793ba5bc1e9917f34a7f40daf697346c447393 /sql/catalyst/src/main/scala/org/apache
parent 4bba10c41acaf84a1c4a8e2db467c22f5ab7cbb9
[SPARK-2973][SQL] Lightweight SQL commands without distributed jobs when calling .collect()
By overriding `executeCollect()` in the physical plan classes of all commands, we can avoid kicking off a distributed job when collecting the result of a SQL command, e.g. `sql("SET").collect()`.

Previously, `Command.sideEffectResult` returned a `Seq[Any]`, and the `execute()` method in subclasses of `Command` typically converted that to a `Seq[Row]` and then parallelized it into an RDD. With this PR, `sideEffectResult` is required to return a `Seq[Row]` directly, so that `executeCollect()` can use it as-is and be factored into the `Command` parent class.

Author: Cheng Lian <lian.cs.zju@gmail.com>

Closes #2215 from liancheng/lightweight-commands and squashes the following commits:

3fbef60 [Cheng Lian] Factored execute() method of physical commands to parent class Command
5a0e16c [Cheng Lian] Passes test suites
e0e12e9 [Cheng Lian] Refactored Command.sideEffectResult and Command.executeCollect
995bdd8 [Cheng Lian] Cleaned up DescribeHiveTableCommand
542977c [Cheng Lian] Avoids confusion between logical and physical plan by adding package prefixes
55b2aa5 [Cheng Lian] Avoids distributed jobs when execution SQL commands
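The pattern described in the commit message can be sketched roughly as follows. This is a minimal, self-contained illustration and not the actual Spark source: `Row`, `FakeRDD`, `FakeContext`, `PhysicalPlan` and `SetCommand` are simplified stand-ins invented here so the example runs without Spark, and only the names `Command`, `sideEffectResult`, `execute()` and `executeCollect()` come from the message itself.

```scala
// Simplified stand-ins for Spark types (assumptions, not the real classes).
final case class Row(values: Any*)

// Hypothetical RDD substitute: calling collect() counts as one distributed job.
final class FakeRDD(rows: Seq[Row], onJob: () => Unit) {
  def collect(): Array[Row] = { onJob(); rows.toArray }
}

// Hypothetical context that records how many jobs were launched.
final class FakeContext {
  var jobsLaunched: Int = 0
  def parallelize(rows: Seq[Row]): FakeRDD =
    new FakeRDD(rows, () => jobsLaunched += 1)
}

trait PhysicalPlan {
  def execute(): FakeRDD
  // Default path: build the RDD and collect it, which launches a job.
  def executeCollect(): Array[Row] = execute().collect()
}

trait Command extends PhysicalPlan {
  def context: FakeContext

  // Lazy, so the side effect runs at most once; it yields rows directly.
  protected lazy val sideEffectResult: Seq[Row] = Seq.empty[Row]

  // Still available for callers that need an RDD.
  override def execute(): FakeRDD = context.parallelize(sideEffectResult)

  // Shortcut: return the already-local rows without launching a job.
  override def executeCollect(): Array[Row] = sideEffectResult.toArray
}

// Hypothetical command that lists settings as "key=value" rows.
final class SetCommand(val context: FakeContext, settings: Map[String, String])
    extends Command {
  override protected lazy val sideEffectResult: Seq[Row] =
    settings.toSeq.sorted.map { case (k, v) => Row(s"$k=$v") }
}
```

In this sketch, calling `executeCollect()` on a `SetCommand` leaves `jobsLaunched` at zero, while `execute().collect()` increments it. Because both methods live in the `Command` parent, each concrete command only has to supply `sideEffectResult`.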
Diffstat (limited to 'sql/catalyst/src/main/scala/org/apache')
0 files changed, 0 insertions, 0 deletions