diff options
author | Davies Liu <davies@databricks.com> | 2015-06-30 10:48:49 -0700 |
---|---|---|
committer | Davies Liu <davies@databricks.com> | 2015-06-30 10:48:49 -0700 |
commit | fbb267ed6fe799a58f88c2fba2d41e954e5f1547 (patch) | |
tree | 19f499236591e938de23399880a9e42a55a0ad46 /sql/hive | |
parent | 5fa0863626aaf5a9a41756a0b1ec82bddccbf067 (diff) | |
download | spark-fbb267ed6fe799a58f88c2fba2d41e954e5f1547.tar.gz spark-fbb267ed6fe799a58f88c2fba2d41e954e5f1547.tar.bz2 spark-fbb267ed6fe799a58f88c2fba2d41e954e5f1547.zip |
[SPARK-8713] Make codegen thread safe
Codegen takes three steps:
1. Take a list of expressions, convert them into Java source code and a list of expressions that don't not support codegen (fallback to interpret mode).
2. Compile the Java source into Java class (bytecode)
3. Using the Java class and the list of expression to build a Projection.
Currently, we cache the whole three steps, the key is a list of expression, result is projection. Because some of expressions (which may not thread-safe, for example, Random) will be hold by the Projection, the projection maybe not thread safe.
This PR change to only cache the second step, then we can build projection using codegen even some expressions are not thread-safe, because the cache will not hold any expression anymore.
cc marmbrus rxin JoshRosen
Author: Davies Liu <davies@databricks.com>
Closes #7101 from davies/codegen_safe and squashes the following commits:
7dd41f1 [Davies Liu] Merge branch 'master' of github.com:apache/spark into codegen_safe
847bd08 [Davies Liu] don't use scala.refect
4ddaaed [Davies Liu] Merge branch 'master' of github.com:apache/spark into codegen_safe
1793cf1 [Davies Liu] make codegen thread safe
Diffstat (limited to 'sql/hive')
-rw-r--r-- | sql/hive/src/main/scala/org/apache/spark/sql/hive/hiveUDFs.scala | 4 |
1 files changed, 0 insertions, 4 deletions
diff --git a/sql/hive/src/main/scala/org/apache/spark/sql/hive/hiveUDFs.scala b/sql/hive/src/main/scala/org/apache/spark/sql/hive/hiveUDFs.scala index d7827d56ca..4dea561ae5 100644 --- a/sql/hive/src/main/scala/org/apache/spark/sql/hive/hiveUDFs.scala +++ b/sql/hive/src/main/scala/org/apache/spark/sql/hive/hiveUDFs.scala @@ -120,8 +120,6 @@ private[hive] case class HiveSimpleUDF(funcWrapper: HiveFunctionWrapper, childre @transient protected lazy val cached: Array[AnyRef] = new Array[AnyRef](children.length) - override def isThreadSafe: Boolean = false - // TODO: Finish input output types. override def eval(input: InternalRow): Any = { unwrap( @@ -180,8 +178,6 @@ private[hive] case class HiveGenericUDF(funcWrapper: HiveFunctionWrapper, childr lazy val dataType: DataType = inspectorToDataType(returnInspector) - override def isThreadSafe: Boolean = false - override def eval(input: InternalRow): Any = { returnInspector // Make sure initialized. |