author Wenchen Fan <wenchen@databricks.com> 2016-06-03 00:43:02 -0700
committer Cheng Lian <lian@databricks.com> 2016-06-03 00:43:02 -0700
commit 190ff274fd71662023a804cf98400c71f9f7da4f
tree 9b3f79aebf252d3c27f53d9593000c5fd58e1509 /mllib
parent b9fcfb3bd14592ac9f1a8e5c2bb31412b9603b60
[SPARK-15494][SQL] encoder code cleanup
## What changes were proposed in this pull request?

Our encoder framework has evolved a lot. This PR tries to clean up the code to make it more readable and to emphasise the concept that an encoder should be used as a container of serde expressions.

1. Move validation logic to the analyzer instead of the encoder.
2. Only have a `resolveAndBind` method in the encoder instead of `resolve` and `bind`, as we don't have the encoder life cycle concept anymore.
3. `Dataset` doesn't need to keep a resolved encoder, as there is no such concept anymore. A bound encoder is still needed to do serialization outside of the query framework.
4. Using `BoundReference` to represent an unresolved field in a deserializer expression is kind of weird, so this PR adds a `GetColumnByOrdinal` for this purpose. (Serializer expressions still use `BoundReference`; we can replace it with `GetColumnByOrdinal` in follow-ups.)

## How was this patch tested?

Existing tests.

Author: Wenchen Fan <wenchen@databricks.com>
Author: Cheng Lian <lian@databricks.com>

Closes #13269 from cloud-fan/clean-encoder.
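As a rough illustration of item 2 in the message above, the sketch below shows how a bound encoder might be used for serialization outside of the query framework once `resolve` and `bind` are merged into `resolveAndBind`. It assumes the `ExpressionEncoder` API as of this commit (`resolveAndBind()`, `toRow`, `fromRow`); later Spark releases may expose different methods. The `Point` case class and the `EncoderRoundTrip` object are hypothetical names introduced only for this example, and the code needs the Spark SQL (Catalyst) jars on the classpath.

```scala
import org.apache.spark.sql.catalyst.encoders.ExpressionEncoder

// Hypothetical case class, used only for this illustration.
case class Point(x: Double, y: Double)

object EncoderRoundTrip {
  def main(args: Array[String]): Unit = {
    // Derive an encoder from the case class, then resolve and bind it
    // against its own schema in a single step (previously a separate
    // resolve/bind life cycle, or `defaultBinding`).
    val encoder = ExpressionEncoder[Point]().resolveAndBind()

    // Serialize an object into Catalyst's internal row format ...
    val row = encoder.toRow(Point(1.0, 2.0))

    // ... and deserialize it back with the same bound encoder.
    val back = encoder.fromRow(row)
    assert(back == Point(1.0, 2.0))
  }
}
```

The benchmark change in the diff below follows the same pattern: it only swaps `defaultBinding` for `resolveAndBind()` when constructing the encoder.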
Diffstat (limited to 'mllib')
-rw-r--r-- mllib/src/test/scala/org/apache/spark/mllib/linalg/UDTSerializationBenchmark.scala | 2
1 file changed, 1 insertion, 1 deletion
diff --git a/mllib/src/test/scala/org/apache/spark/mllib/linalg/UDTSerializationBenchmark.scala b/mllib/src/test/scala/org/apache/spark/mllib/linalg/UDTSerializationBenchmark.scala
index be7110ad6b..8b439e6b7a 100644
--- a/mllib/src/test/scala/org/apache/spark/mllib/linalg/UDTSerializationBenchmark.scala
+++ b/mllib/src/test/scala/org/apache/spark/mllib/linalg/UDTSerializationBenchmark.scala
@@ -29,7 +29,7 @@ object UDTSerializationBenchmark {
     val iters = 1e2.toInt
     val numRows = 1e3.toInt
 
-    val encoder = ExpressionEncoder[Vector].defaultBinding
+    val encoder = ExpressionEncoder[Vector].resolveAndBind()
 
     val vectors = (1 to numRows).map { i =>
       Vectors.dense(Array.fill(1e5.toInt)(1.0 * i))