[SPARK-3831] [SQL] Filter rule Improvement and bool expression optimization.

If a filter is always FALSE, as in

    SELECT * from person WHERE FALSE;

200 tasks still run, although 1 task is enough.

The current optimizer also cannot simplify a duplicated NOT, as in

    SELECT * from person WHERE NOT ( NOT (age > 30));

The filter above should be simplified.

Author: Kousuke Saruta <sarutak@oss.nttdata.co.jp>

Closes #2692 from sarutak/SPARK-3831 and squashes the following commits:

25f3e20 [Kousuke Saruta] Merge branch 'master' of git://git.apache.org/spark into SPARK-3831
23c750c [Kousuke Saruta] Improved unsupported predicate test case
a11b9f3 [Kousuke Saruta] Modified NOT predicate test case in PartitionBatchPruningSuite
8ea872b [Kousuke Saruta] Fixed the number of tasks when the data of LocalRelation is empty.
author: Kousuke Saruta <sarutak@oss.nttdata.co.jp> 2014-10-08 17:03:47 -0700
committer: Michael Armbrust <michael@databricks.com> 2014-10-08 17:03:47 -0700
commit: a85f24accd3266e0f97ee04d03c22b593d99c062 (patch)
tree: 13c80dabf7f3e5e79120b089be2de685916cbd1b /sql/core/src/main
parent: add174aa56d291bc48ef73a42c39428c923efe31 (diff)
1 files changed, 2 insertions, 1 deletions
diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/SparkStrategies.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/SparkStrategies.scala
index 5c16d0c624..883f2ff521 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/execution/SparkStrategies.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/SparkStrategies.scala
@@ -274,9 +274,10 @@ private[sql] abstract class SparkStrategies extends QueryPlanner[SparkPlan] {
         execution.Sample(fraction, withReplacement, seed, planLater(child)) :: Nil
       case SparkLogicalPlan(alreadyPlanned) => alreadyPlanned :: Nil
       case logical.LocalRelation(output, data) =>
+        val nPartitions = if (data.isEmpty) 1 else numPartitions
         PhysicalRDD(
           output,
-          RDDConversions.productToRowRdd(sparkContext.parallelize(data, numPartitions))) :: Nil
+          RDDConversions.productToRowRdd(sparkContext.parallelize(data, nPartitions))) :: Nil
       case logical.Limit(IntegerLiteral(limit), child) =>
         execution.Limit(limit, planLater(child)) :: Nil
       case Unions(unionChildren) =>
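
The effect of the new `nPartitions` expression can be illustrated without a Spark cluster. The following is a minimal sketch, not Spark code: the object `EmptyRelationPartitions` and its methods `choosePartitions` and `slice` are hypothetical names, and `slice` is only a simplified stand-in for how a local collection is divided into partitions. It assumes one task is scheduled per partition, which is what the commit message implies.

```scala
// Hypothetical stand-alone sketch; none of these names are Spark APIs.
object EmptyRelationPartitions {

  // Mirrors the patched expression: an empty relation gets one partition,
  // anything else keeps the configured partition count.
  def choosePartitions(data: Seq[Any], numPartitions: Int): Int =
    if (data.isEmpty) 1 else numPartitions

  // Simplified stand-in for splitting a local collection into n slices.
  // Every slice exists even when it holds no rows.
  def slice[T](data: Seq[T], n: Int): Seq[Seq[T]] =
    (0 until n).map { i =>
      val start = ((i.toLong * data.length) / n).toInt
      val end = (((i + 1).toLong * data.length) / n).toInt
      data.slice(start, end)
    }

  def main(args: Array[String]): Unit = {
    val empty = Seq.empty[Int]
    val rows = Seq(1, 2, 3, 4)

    // Before the patch: 200 slices, all empty, for an empty relation.
    println(slice(empty, 200).length)                          // 200
    // After the patch: a single empty slice.
    println(slice(empty, choosePartitions(empty, 200)).length) // 1
    // Non-empty data is unaffected by the change.
    println(slice(rows, choosePartitions(rows, 200)).length)   // 200
  }
}
```

Under these assumptions, only the empty case changes: non-empty relations are still split into `numPartitions` slices, so the patch should not alter parallelism for real data.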