author | Cheng Lian <lian.cs.zju@gmail.com> | 2014-06-02 19:20:23 -0700 |
---|---|---|
committer | Michael Armbrust <michael@databricks.com> | 2014-06-02 19:20:23 -0700 |
commit | 862283e9ccace6824880aa4e161723fb3248d438 (patch) | |
tree | a4872987bd7746b7c7af4c370de74e670abc3782 /python/test_support | |
parent | ec8be274a7bc586bb5b025033cbfd73f9a4d7160 (diff) | |
Avoid dynamic dispatching when unwrapping Hive data.
This is a follow-up to PR #758.
The `unwrapHiveData` function is now composed statically, based on each field's object inspector, before any rows are scanned. This avoids paying the dynamic dispatching cost for every value.
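The idea can be illustrated with a minimal sketch. It is not the code from this commit: `FieldInspector`, its three cases, `unwrapperFor`, and `scan` are hypothetical stand-ins for Hive's object inspectors and the table-scanning code, chosen only to show the dispatch being resolved once per field rather than once per value.

```scala
// Hypothetical stand-ins for Hive object inspectors; NOT the real Hive API.
sealed trait FieldInspector
case object StringInspector extends FieldInspector
case object IntInspector extends FieldInspector
case object DecimalInspector extends FieldInspector

object StaticUnwrapping {
  // Dynamic style: the inspector is pattern-matched again for every value of every row.
  def unwrapDynamic(inspector: FieldInspector, raw: Any): Any = inspector match {
    case StringInspector  => raw.toString
    case IntInspector     => raw.asInstanceOf[java.lang.Integer].intValue
    case DecimalInspector => BigDecimal(raw.asInstanceOf[java.math.BigDecimal])
  }

  // Static style: match once per field and return a closure reused for all rows.
  def unwrapperFor(inspector: FieldInspector): Any => Any = inspector match {
    case StringInspector  => (raw: Any) => raw.toString
    case IntInspector     => (raw: Any) => raw.asInstanceOf[java.lang.Integer].intValue
    case DecimalInspector => (raw: Any) => BigDecimal(raw.asInstanceOf[java.math.BigDecimal])
  }

  def scan(inspectors: Array[FieldInspector],
           rows: Iterator[Array[Any]]): Iterator[Array[Any]] = {
    // Unwrappers are composed here, before any row is read.
    val unwrappers: Array[Any => Any] = inspectors.map(unwrapperFor)
    rows.map { row =>
      val out = new Array[Any](row.length)
      var i = 0
      while (i < row.length) {
        out(i) = unwrappers(i)(row(i)) // plain function call, no per-value match
        i += 1
      }
      out
    }
  }
}
```

In this sketch the per-row loop only invokes prebuilt functions, so the cost of choosing a conversion is paid once per column instead of once per cell.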
According to the same micro benchmark used in PR #758, this simple change brings a slight performance boost: about 2.5% for the CSV table and about 1% for the RCFile table.
```
Optimized version:
CSV: 6870 ms, RCFile: 5687 ms
CSV: 6832 ms, RCFile: 5800 ms
CSV: 6822 ms, RCFile: 5679 ms
CSV: 6704 ms, RCFile: 5758 ms
CSV: 6819 ms, RCFile: 5725 ms
Original version:
CSV: 7042 ms, RCFile: 5667 ms
CSV: 6883 ms, RCFile: 5703 ms
CSV: 7115 ms, RCFile: 5665 ms
CSV: 7020 ms, RCFile: 5981 ms
CSV: 6871 ms, RCFile: 5906 ms
```
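The quoted percentages can be checked against the timings above. The following is a small sketch that assumes the improvement is measured as the relative drop in mean run time across the five runs; the commit message does not say how the figures were derived, and the object and method names are illustrative only.

```scala
object BenchmarkSummary {
  // Timings in milliseconds, copied from the benchmark output above.
  val optimizedCsv: Seq[Int]    = Seq(6870, 6832, 6822, 6704, 6819)
  val originalCsv: Seq[Int]     = Seq(7042, 6883, 7115, 7020, 6871)
  val optimizedRcFile: Seq[Int] = Seq(5687, 5800, 5679, 5758, 5725)
  val originalRcFile: Seq[Int]  = Seq(5667, 5703, 5665, 5981, 5906)

  def mean(xs: Seq[Int]): Double = xs.sum.toDouble / xs.size

  // Assumed metric: relative reduction in mean run time, as a percentage.
  def improvementPercent(original: Seq[Int], optimized: Seq[Int]): Double =
    (mean(original) - mean(optimized)) / mean(original) * 100.0

  def main(args: Array[String]): Unit = {
    println(f"CSV:    ${improvementPercent(originalCsv, optimizedCsv)}%.2f%%")
    println(f"RCFile: ${improvementPercent(originalRcFile, optimizedRcFile)}%.2f%%")
  }
}
```

Under that assumption the means work out to roughly a 2.5% improvement for CSV and a little under 1% for RCFile, in line with the figures quoted in the message. The RCFile difference is small relative to the run-to-run spread in the original timings.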
Author: Cheng Lian <lian.cs.zju@gmail.com>
Closes #935 from liancheng/staticUnwrapping and squashes the following commits:
c49c70c [Cheng Lian] Avoid dynamic dispatching when unwrapping Hive data.
Diffstat (limited to 'python/test_support')
0 files changed, 0 insertions, 0 deletions