author: Carson Wang <carson.wang@intel.com> 2016-02-14 16:00:20 -0800
committer: Reynold Xin <rxin@databricks.com> 2016-02-14 16:00:20 -0800
commit: 7cb4d74c98c2f1765b48a549f62e47b53ed29b38 (patch)
tree: 919f67c5b5a3053551173e2573ef3661c2160b8e /graphx/src
parent: 22e9723d6208f2cd2dfa26487ea1c041cb9d7dcd (diff)
[SPARK-13185][SQL] Reuse Calendar object in DateTimeUtils.StringToDate method to improve performance
The Java `Calendar` object is expensive to create. I have a subquery like this:

`SELECT a, b, c FROM table UV WHERE (datediff(UV.visitDate, '1997-01-01')>=0 AND datediff(UV.visitDate, '2015-01-01')<=0)`

The table stores `visitDate` as a String and has 3 billion records. A new `Calendar` object is created every time `DateTimeUtils.stringToDate` is called, which for this query means at least once per record. By reusing the `Calendar` object, I saw an improvement of about 20 seconds for this stage.

Author: Carson Wang <carson.wang@intel.com>

Closes #11090 from carsonwang/SPARK-13185.
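The reuse pattern described above can be sketched as follows. This is an illustration only, not the code from the patch: the class name `CalendarReuse`, the methods `daysSinceEpoch` and `parseIsoDate`, the use of a `ThreadLocal`, and the UTC time zone are all assumptions made for the example. It keeps one `Calendar` per thread (a `Calendar` is mutable and not thread-safe) and resets it with `clear()` before each conversion instead of allocating a new one.

```java
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.TimeZone;

// Hypothetical helper, not Spark's DateTimeUtils.
public class CalendarReuse {
    private static final long MILLIS_PER_DAY = 24L * 60 * 60 * 1000;

    // One Calendar per thread, created lazily on first use and then reused.
    static final ThreadLocal<Calendar> UTC_CALENDAR = ThreadLocal.withInitial(() -> {
        Calendar c = new GregorianCalendar(TimeZone.getTimeZone("UTC"));
        c.setLenient(false); // reject out-of-range fields such as month 13
        return c;
    });

    // Converts a calendar date (month is 1-based) to days since 1970-01-01 UTC.
    static int daysSinceEpoch(int year, int month, int day) {
        Calendar c = UTC_CALENDAR.get();
        c.clear();                            // drop state left by the previous call
        c.set(year, month - 1, day, 0, 0, 0); // Calendar months are 0-based
        return (int) Math.floorDiv(c.getTimeInMillis(), MILLIS_PER_DAY);
    }

    // Parses a plain "yyyy-MM-dd" string; assumed input format for this sketch.
    static int parseIsoDate(String s) {
        String[] parts = s.trim().split("-");
        if (parts.length != 3) {
            throw new IllegalArgumentException("expected yyyy-MM-dd, got: " + s);
        }
        return daysSinceEpoch(
            Integer.parseInt(parts[0]),
            Integer.parseInt(parts[1]),
            Integer.parseInt(parts[2]));
    }
}
```

With this shape, calling `CalendarReuse.parseIsoDate("1997-01-01")` repeatedly on one thread touches the same `Calendar` instance each time, so the allocation cost is paid once per thread rather than once per row.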
Diffstat (limited to 'graphx/src')
0 files changed, 0 insertions, 0 deletions