Unicode 9 updates (#70)
* Updates for Unicode 9.0.0 TR29 changes
  - New rule GB10 is used to combine emoji-ZWJ sequences, and new rules GB12/GB13 force a grapheme break after every two RI (Regional Indicator) codepoints. Unfortunately, this breaks the statelessness of grapheme-boundary determination. Deal with this by ignoring the problem in utf8proc_grapheme_break and by hacking in a special case in decompose.
  - ZWJ moved to its own boundclass; update what is now GB9 accordingly.
  - Add comments to indicate which rule a given case implements.
  - The number of bound classes now exceeds 4 bits; expand to 8 and reorganize fields.
* Import Unicode 9 data
* Update grapheme break API to expose state override
* Bump MAJOR version
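To illustrate why the Regional Indicator pairing rules cannot be decided from two adjacent codepoints alone, here is a minimal, self-contained sketch in C. It is not utf8proc's implementation: the helper names (`is_regional_indicator`, `ri_break_stateful`, `count_clusters`) and the `ri_run` counter are hypothetical, and only the RI pairing behavior is modeled. Every other pair of codepoints is treated as a break.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: Regional Indicator symbols are U+1F1E6..U+1F1FF. */
static bool is_regional_indicator(int32_t cp) {
    return cp >= 0x1F1E6 && cp <= 0x1F1FF;
}

/* Simplified break test that models only the RI pairing rules (GB12/GB13).
 * *ri_run is caller-owned state: the number of consecutive RI codepoints
 * ending at `prev`. It must start at 0 and be passed for every adjacent
 * pair, in order. Returns true if a break is allowed between prev and next. */
static bool ri_break_stateful(int32_t prev, int32_t next, int *ri_run) {
    assert(ri_run != NULL);
    if (is_regional_indicator(prev)) {
        *ri_run += 1;
    } else {
        *ri_run = 0;
    }
    if (is_regional_indicator(prev) && is_regional_indicator(next)) {
        /* Odd run length: prev opens a pair, so next must join it.
         * Even run length: the pair is complete, so break here. */
        return (*ri_run % 2) == 0;
    }
    return true; /* this sketch breaks everywhere else */
}

/* Count clusters in a codepoint array using the simplified rule above. */
static size_t count_clusters(const int32_t *cps, size_t n) {
    if (n == 0) {
        return 0;
    }
    size_t clusters = 1;
    int ri_run = 0;
    for (size_t i = 1; i < n; i++) {
        if (ri_break_stateful(cps[i - 1], cps[i], &ri_run)) {
            clusters++;
        }
    }
    return clusters;
}
```

With four consecutive RI codepoints, the middle pair (RI, RI) is a break while the first and last pairs are not, even though all three pairs look identical to a function that sees only two codepoints. That is the information the state argument carries.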
Committed by Steven G. Johnson
Parent: 3d0576a9b9
Commit: 41c6b23aab
MANIFEST (6 lines changed)
@@ -2,6 +2,6 @@ include/
 include/utf8proc.h
 lib/
 lib/libutf8proc.a
-lib/libutf8proc.so -> libutf8proc.so.2.0.1
-lib/libutf8proc.so.2 -> libutf8proc.so.2.0.1
-lib/libutf8proc.so.2.0.1
+lib/libutf8proc.so -> libutf8proc.so.3.0.0
+lib/libutf8proc.so.3 -> libutf8proc.so.3.0.0
+lib/libutf8proc.so.3.0.0