author | zhichao.li <zhichao.li@intel.com> | 2015-07-24 08:34:50 -0700 |
---|---|---|
committer | Davies Liu <davies.liu@gmail.com> | 2015-07-24 08:34:50 -0700 |
commit | 846cf46282da8f4b87aeee64e407a38cdc80e13b (patch) | |
tree | 434f4e530e122cff99eb9912e1a00d9df885ffe8 /unsafe | |
parent | dfb18be0366376be3b928dbf4570448c60fe652b (diff) | |
[SPARK-9238] [SQL] Remove two extra useless entries for bytesOfCodePointInUTF8
This is a tentative change and I may be misunderstanding something, but I believe `bytesOfCodePointInUTF8` needs only 2 entries for the 6-byte code point case: the lead byte pattern 1111110x has a single free bit, so only two lead byte values match it.
Details are in the "Description" section of https://en.wikipedia.org/wiki/UTF-8.
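The reasoning above can be checked with a small sketch. It assumes the table is indexed by `leadByte - 0xC0` and holds one entry per multi-byte lead byte; the class `Utf8LeadByteSketch` and its methods are hypothetical names for illustration, not the code in `UTF8String.java`. It derives the sequence length from the number of leading one bits and compares that with a 62-entry table (32 + 16 + 8 + 4 + 2).

```java
class Utf8LeadByteSketch {
    // Hypothetical table modeled on the diff: index = leadByte - 0xC0.
    static final int[] TABLE = buildTable();

    static int[] buildTable() {
        // Number of distinct lead bytes for 2-, 3-, 4-, 5- and 6-byte sequences.
        int[] counts = {32, 16, 8, 4, 2};
        int[] table = new int[62];
        int pos = 0;
        for (int i = 0; i < counts.length; i++) {
            for (int j = 0; j < counts[i]; j++) {
                table[pos++] = i + 2;
            }
        }
        return table;
    }

    // Sequence length from the bit pattern: count the leading one bits.
    // Returns -1 for anything that is not a multi-byte lead byte.
    static int lengthFromBits(int leadByte) {
        int b = leadByte & 0xFF;
        int ones = Integer.numberOfLeadingZeros(~(b << 24));
        return (ones >= 2 && ones <= 6) ? ones : -1;
    }

    // Table lookup with an explicit bounds check (an assumption of this sketch).
    static int lookup(int leadByte) {
        int offset = (leadByte & 0xFF) - 0xC0;
        return (offset >= 0 && offset < TABLE.length) ? TABLE[offset] : -1;
    }

    public static void main(String[] args) {
        int sixByteLeads = 0;
        for (int b = 0; b <= 0xFF; b++) {
            if (lengthFromBits(b) == 6) {
                sixByteLeads++;
            }
        }
        // Only 0xFC and 0xFD match 1111110x.
        System.out.println("lead bytes matching 1111110x: " + sixByteLeads);

        for (int b = 0xC0; b <= 0xFD; b++) {
            if (lookup(b) != lengthFromBits(b)) {
                throw new IllegalStateException("mismatch at 0x" + Integer.toHexString(b));
            }
        }
        System.out.println("table entries: " + TABLE.length);
    }
}
```

With the two extra entries, the table would also cover 0xFE and 0xFF, which do not match any lead byte pattern. The sketch bounds-checks the lookup so those two values fall outside the shorter table safely.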
Author: zhichao.li <zhichao.li@intel.com>
Closes #7582 from zhichao-li/utf8 and squashes the following commits:
8bddd01 [zhichao.li] two extra entries
Diffstat (limited to 'unsafe')
-rw-r--r-- | unsafe/src/main/java/org/apache/spark/unsafe/types/UTF8String.java | 2 |
1 files changed, 1 insertions, 1 deletions
diff --git a/unsafe/src/main/java/org/apache/spark/unsafe/types/UTF8String.java b/unsafe/src/main/java/org/apache/spark/unsafe/types/UTF8String.java
index 946d355f1f..6d8dcb1cbf 100644
--- a/unsafe/src/main/java/org/apache/spark/unsafe/types/UTF8String.java
+++ b/unsafe/src/main/java/org/apache/spark/unsafe/types/UTF8String.java
@@ -48,7 +48,7 @@ public final class UTF8String implements Comparable<UTF8String>, Serializable {
     3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
     4, 4, 4, 4, 4, 4, 4, 4,
     5, 5, 5, 5,
-    6, 6, 6, 6};
+    6, 6};
 
   public static final UTF8String EMPTY_UTF8 = UTF8String.fromString("");