Changed isPrint for U+00AD SOFT HYPHEN to return true.
Summary:
This is consistent with the Mac OS X implementation, and most terminals
actually display this character (checked on gnome-terminal, lxterminal, lxterm,
Terminal.app, iTerm2). It is also in line with the ISO Latin 1 standard
(ISO 8859-1), which defines the character differently from the Unicode
Standard. More information here: http://www.cs.tut.fi/~jkorpela/shy.html
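
For readers checking the test input: the columnWidth test in this change
spells U+00AD as the octal escape "\302\255", which is its two-byte UTF-8
encoding (0xC2 0xAD). A minimal sketch of that encoding step follows. The
helper name encodeUtf8Short is hypothetical and not part of LLVM, and it only
handles code points below U+0800.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper for illustration only (not an LLVM API): encode a code
// point below U+0800 as a one- or two-byte UTF-8 sequence.
std::string encodeUtf8Short(std::uint32_t cp) {
  assert(cp < 0x800 && "sketch only covers one- and two-byte sequences");
  std::string out;
  if (cp < 0x80) {
    // ASCII range: a single byte equal to the code point.
    out.push_back(static_cast<char>(cp));
  } else {
    // Two-byte form: 110xxxxx 10xxxxxx.
    out.push_back(static_cast<char>(0xC0 | (cp >> 6)));
    out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
  }
  return out;
}
```

Under these assumptions, encodeUtf8Short(0xAD) yields the same two bytes as
the string literal "\302\255" passed to columnWidth.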
Reviewers: gribozavr, jordan_rose
CC: llvm-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D1310
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187949 91177308-0d34-0410-b5e6-96231b3b80d8
diff --git a/unittests/Support/LocaleTest.cpp b/unittests/Support/LocaleTest.cpp
index 3524b4b..ad90bdf 100644
--- a/unittests/Support/LocaleTest.cpp
+++ b/unittests/Support/LocaleTest.cpp
@@ -32,6 +32,12 @@
EXPECT_EQ(-1, columnWidth("aaaaaaaaaa\x01"));
EXPECT_EQ(-1, columnWidth("\342\200\213")); // 200B ZERO WIDTH SPACE
+ // 00AD SOFT HYPHEN is displayed on most terminals as a space or a dash. Some
+ // text editors display it only when a line is broken at it; others use it as
+ // a line-break hint but don't display it. We choose the terminal-oriented
+ // interpretation.
+ EXPECT_EQ(1, columnWidth("\302\255"));
+
EXPECT_EQ(0, columnWidth("\314\200")); // 0300 COMBINING GRAVE ACCENT
EXPECT_EQ(1, columnWidth("\340\270\201")); // 0E01 THAI CHARACTER KO KAI
EXPECT_EQ(2, columnWidth("\344\270\200")); // CJK UNIFIED IDEOGRAPH-4E00
@@ -72,10 +78,8 @@
EXPECT_EQ(false, isPrint(0x9F));
EXPECT_EQ(true, isPrint(0xAC));
- // FIXME: Figure out if we want to treat SOFT HYPHEN as printable character.
-#ifndef __APPLE__
- EXPECT_EQ(false, isPrint(0xAD)); // SOFT HYPHEN
-#endif // __APPLE__
+ EXPECT_EQ(true, isPrint(0xAD)); // SOFT HYPHEN is displayed on most terminals
+ // as either a space or a dash.
EXPECT_EQ(true, isPrint(0xAE));
// MacOS implementation doesn't think it's printable.