author Michael Armbrust <michael@databricks.com> 2014-06-07 14:20:33 -0700
committer Reynold Xin <rxin@apache.org> 2014-06-07 14:20:33 -0700
commit a6c72ab16e7a3027739ab419819f5222e270838e
tree e7a6f7f43b3ab0a4c0bcca38fb662fd7880644f6 /dev
parent 41c4a33105c74417192925db355019ba1badeab2
[SPARK-1994][SQL] Weird data corruption bug when running Spark SQL on data in HDFS

Basically there is a race condition (possibly a Scala bug?) when these values are recomputed on all of the slaves, which results in an incorrect projection being generated (possibly because the GUID uniqueness contract is broken?). In general we should probably enforce that all expression planning occurs on the driver, as is now occurring here.

Author: Michael Armbrust <michael@databricks.com>

Closes #1004 from marmbrus/fixAggBug and squashes the following commits:

e0c116c [Michael Armbrust] Compute aggregate expression during planning instead of lazily on workers.

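
The difference between computing a value lazily on each worker and computing it once during planning can be illustrated with a small, self-contained sketch. This is not the Spark code touched by this commit: the names `new_expr_id`, `LazyAggregate`, `EagerAggregate`, and `ship_to_workers` are hypothetical, `copy.deepcopy` merely stands in for the serialization round trip to a worker, and the ID counter is only an assumed analogue of the uniqueness contract the message speculates about.

```python
import copy
import itertools

# Hypothetical stand-in for a process-wide generator of unique expression IDs.
_next_id = itertools.count(1)


def new_expr_id():
    return next(_next_id)


class LazyAggregate:
    """Derives its result ID on first use, so each shipped copy recomputes it."""

    def __init__(self, name):
        self.name = name
        self._result_id = None

    @property
    def result_id(self):
        if self._result_id is None:
            self._result_id = new_expr_id()
        return self._result_id

    def __getstate__(self):
        # Mimic a transient lazy field: the cached value is never shipped.
        return {"name": self.name, "_result_id": None}


class EagerAggregate:
    """Derives its result ID once, at construction (planning) time."""

    def __init__(self, name):
        self.name = name
        self.result_id = new_expr_id()


def ship_to_workers(op, n=2):
    # deepcopy is an assumed stand-in for serializing the operator to n workers.
    return [copy.deepcopy(op) for _ in range(n)]


# Each lazily evaluated copy generates its own ID, so the copies disagree.
lazy_ids = [w.result_id for w in ship_to_workers(LazyAggregate("sum"))]

# The eagerly planned operator carries one ID that every copy shares.
eager_ids = [w.result_id for w in ship_to_workers(EagerAggregate("sum"))]
```

In this toy model the lazily evaluated copies end up with different IDs, while the eagerly planned copies all agree. Whether this is the actual mechanism behind SPARK-1994 is left open by the commit message itself.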
Diffstat (limited to 'dev')
0 files changed, 0 insertions, 0 deletions