author    Shixiong Zhu <shixiong@databricks.com>  2017-02-15 20:51:33 -0800
committer Shixiong Zhu <shixiong@databricks.com>  2017-02-15 20:51:33 -0800
commit    fc02ef95cdfc226603b52dc579b7133631f7143d
tree      6f8fd69fc139d2b0795fd7aeef365bacff7d2f50
parent    08c1972a0661d42f300520cc6e5fb31023de093b
[SPARK-19603][SS] Fix StreamingQuery explain command
## What changes were proposed in this pull request?
`StreamingQuery.explain` currently doesn't show the correct streaming physical plan, because `ExplainCommand` receives the runtime batch plan, whose `logicalPlan.isStreaming` is always false.
This PR adds a `streaming` parameter to `ExplainCommand` so that `StreamExecution` can mark the plan as a streaming plan.
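The idea can be sketched with a minimal Python model. This is not the actual Spark Scala code: `ExplainCommand` and `IncrementalExecution` are real Spark names, but the classes below are simplified stand-ins that only illustrate why an explicit flag is needed when `logicalPlan.isStreaming` can't be trusted at runtime.

```python
from dataclasses import dataclass

@dataclass
class LogicalPlan:
    name: str
    # For the runtime batch plan handed to ExplainCommand by
    # StreamExecution, this is always False -- the bug in SPARK-19603.
    is_streaming: bool = False

@dataclass
class ExplainCommand:
    logical_plan: LogicalPlan
    extended: bool = False
    # New explicit flag: set by the caller (StreamExecution) instead of
    # being inferred from the plan itself.
    streaming: bool = False

    def run(self) -> str:
        # Pick the planner from the explicit flag, not from
        # logical_plan.is_streaming, which is stale at this point.
        planner = "IncrementalExecution" if self.streaming else "QueryExecution"
        return f"== Physical Plan ==\n[{planner}] {self.logical_plan.name}"

# The runtime batch plan reports is_streaming=False, but the caller knows
# this is a streaming query and says so explicitly.
batch_plan = LogicalPlan("LocalTableScan")
print(ExplainCommand(batch_plan, streaming=True).run())
```

With the flag set, explain routes through the streaming planner even though the plan object itself looks like a batch plan; without it, the old (incorrect) batch path is chosen.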
Examples of the explain outputs:
- `explain()` on a streaming `DataFrame`
```
== Physical Plan ==
*HashAggregate(keys=[value#518], functions=[count(1)])
+- StateStoreSave [value#518], OperatorStateId(<unknown>,0,0), Append, 0
+- *HashAggregate(keys=[value#518], functions=[merge_count(1)])
+- StateStoreRestore [value#518], OperatorStateId(<unknown>,0,0)
+- *HashAggregate(keys=[value#518], functions=[merge_count(1)])
+- Exchange hashpartitioning(value#518, 5)
+- *HashAggregate(keys=[value#518], functions=[partial_count(1)])
+- *SerializeFromObject [staticinvoke(class org.apache.spark.unsafe.types.UTF8String, StringType, fromString, input[0, java.lang.String, true], true) AS value#518]
+- *MapElements <function1>, obj#517: java.lang.String
+- *DeserializeToObject value#513.toString, obj#516: java.lang.String
+- StreamingRelation MemoryStream[value#513], [value#513]
```
- `StreamingQuery.explain(extended = false)`
```
== Physical Plan ==
*HashAggregate(keys=[value#518], functions=[count(1)])
+- StateStoreSave [value#518], OperatorStateId(...,0,0), Complete, 0
+- *HashAggregate(keys=[value#518], functions=[merge_count(1)])
+- StateStoreRestore [value#518], OperatorStateId(...,0,0)
+- *HashAggregate(keys=[value#518], functions=[merge_count(1)])
+- Exchange hashpartitioning(value#518, 5)
+- *HashAggregate(keys=[value#518], functions=[partial_count(1)])
+- *SerializeFromObject [staticinvoke(class org.apache.spark.unsafe.types.UTF8String, StringType, fromString, input[0, java.lang.String, true], true) AS value#518]
+- *MapElements <function1>, obj#517: java.lang.String
+- *DeserializeToObject value#543.toString, obj#516: java.lang.String
+- LocalTableScan [value#543]
```
- `StreamingQuery.explain(extended = true)`
```
== Parsed Logical Plan ==
Aggregate [value#518], [value#518, count(1) AS count(1)#524L]
+- SerializeFromObject [staticinvoke(class org.apache.spark.unsafe.types.UTF8String, StringType, fromString, input[0, java.lang.String, true], true) AS value#518]
+- MapElements <function1>, class java.lang.String, [StructField(value,StringType,true)], obj#517: java.lang.String
+- DeserializeToObject cast(value#543 as string).toString, obj#516: java.lang.String
+- LocalRelation [value#543]
== Analyzed Logical Plan ==
value: string, count(1): bigint
Aggregate [value#518], [value#518, count(1) AS count(1)#524L]
+- SerializeFromObject [staticinvoke(class org.apache.spark.unsafe.types.UTF8String, StringType, fromString, input[0, java.lang.String, true], true) AS value#518]
+- MapElements <function1>, class java.lang.String, [StructField(value,StringType,true)], obj#517: java.lang.String
+- DeserializeToObject cast(value#543 as string).toString, obj#516: java.lang.String
+- LocalRelation [value#543]
== Optimized Logical Plan ==
Aggregate [value#518], [value#518, count(1) AS count(1)#524L]
+- SerializeFromObject [staticinvoke(class org.apache.spark.unsafe.types.UTF8String, StringType, fromString, input[0, java.lang.String, true], true) AS value#518]
+- MapElements <function1>, class java.lang.String, [StructField(value,StringType,true)], obj#517: java.lang.String
+- DeserializeToObject value#543.toString, obj#516: java.lang.String
+- LocalRelation [value#543]
== Physical Plan ==
*HashAggregate(keys=[value#518], functions=[count(1)], output=[value#518, count(1)#524L])
+- StateStoreSave [value#518], OperatorStateId(...,0,0), Complete, 0
+- *HashAggregate(keys=[value#518], functions=[merge_count(1)], output=[value#518, count#530L])
+- StateStoreRestore [value#518], OperatorStateId(...,0,0)
+- *HashAggregate(keys=[value#518], functions=[merge_count(1)], output=[value#518, count#530L])
+- Exchange hashpartitioning(value#518, 5)
+- *HashAggregate(keys=[value#518], functions=[partial_count(1)], output=[value#518, count#530L])
+- *SerializeFromObject [staticinvoke(class org.apache.spark.unsafe.types.UTF8String, StringType, fromString, input[0, java.lang.String, true], true) AS value#518]
+- *MapElements <function1>, obj#517: java.lang.String
+- *DeserializeToObject value#543.toString, obj#516: java.lang.String
+- LocalTableScan [value#543]
```
## How was this patch tested?
The updated unit test.
Author: Shixiong Zhu <shixiong@databricks.com>
Closes #16934 from zsxwing/SPARK-19603.