Jit: string's compareTo performance improvement.

Changed compareTo handler to call __memcmp16() for strings >= 32 chars.
However, even for those strings, the first two chars are done in the
handler (to catch early-out cases).

Comparisons were done with micro-benchmarks comparing 10 and 200-char
strings.

The strings were:
    equal -> Q
    not equal at start -> S
    not equal at end -> E

The test configurations were handler (H) [the previous handler], subroutine (S)
[memcmp16()} and blended (B) [this commit]

       H         S          B
10E   60        138        65
10S   32         70        30
10Q    9          9         9
100E 745        708       716

In short, the small string cases were twice as fast with the existing
handler compared to memcmp16, but memcmp16 was ~5% faster for long
strings.
5 files changed