13 Commits

Author SHA1 Message Date
Keno Fischer
41c6b23aab Unicode 9 updates (#70)
* Updates for Unicode 9.0.0 TR29 Changes

- New rules GB10 and GB12/GB13 combine emoji-zwj sequences and force
  grapheme breaks every two RI codepoints, respectively. Unfortunately, this
  breaks the statelessness of grapheme-boundary determination. Deal with
  this by ignoring the problem in utf8proc_grapheme_break and by
  hacking in a special case in decompose.

- ZWJ moved to its own boundclass; update what is now GB9 accordingly.

- Add comments to indicate which rule a given case implements

- The number of bound classes no longer fits in 4 bits; expand the field
  to 8 bits and reorganize fields.

* Import Unicode 9 data

* Update grapheme break API to expose state override

* Bump MAJOR version
2016-06-28 16:04:25 -04:00
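To illustrate why the regional-indicator (RI) rule mentioned in the commit above cannot be decided from two adjacent codepoints alone, here is a minimal sketch in C. It is not utf8proc's code: the names `is_regional_indicator`, `ri_break_between`, and the `ri_run` counter are hypothetical, and only the RI pairing rule is modelled (every other pair is simply treated as a break). The caller is assumed to walk the text left to right and to start with the counter at 0.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper (not part of utf8proc): true for Regional Indicator
   codepoints, U+1F1E6..U+1F1FF. */
bool is_regional_indicator(int32_t cp) {
    return cp >= 0x1F1E6 && cp <= 0x1F1FF;
}

/* Hypothetical sketch of the RI pairing rule only.
   Returns true if a grapheme break is allowed between prev and next.
   *ri_run counts the consecutive RI codepoints ending at prev; the caller
   must initialise it to 0 and call this for each adjacent pair in order. */
bool ri_break_between(int32_t prev, int32_t next, int *ri_run) {
    if (is_regional_indicator(prev))
        (*ri_run)++;
    else
        *ri_run = 0;

    if (is_regional_indicator(prev) && is_regional_indicator(next)) {
        /* Odd run length: prev opens a pair, so keep it joined to next.
           Even run length: a pair was just completed, so break here. */
        return (*ri_run % 2) == 0;
    }
    return true; /* all other rules are out of scope for this sketch */
}
```

For four RI codepoints in a row, the sketch allows a break only between the second and the third. The answer for the same (RI, RI) pair therefore depends on how many RI codepoints came before it, which is the extra state a purely pairwise function cannot see.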
Michaël Meyer
26436c9775 Reduce the size of the binary.
Use integers instead of pointers in Unicode tables. Saves 226 kB out of 716 kB
in the compiled library.
2015-12-09 19:55:48 +01:00
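The general idea of replacing pointers with integers in a lookup table can be sketched as follows. This is an illustration only, not utf8proc's actual table layout: the type names `prop_ptr_t` and `prop_idx_t`, the `sequences` pool, the `NO_DECOMP` sentinel, and `lookup_decomp` are all hypothetical, and the data is a two-entry toy.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy shared pool of decomposition sequences; each run ends with -1. */
static const int32_t sequences[] = {
    0x0041, 0x0300, -1,   /* offset 0: A + combining grave accent */
    0x0041, 0x0301, -1,   /* offset 3: A + combining acute accent */
};

/* Pointer-based entry: one full pointer per property record. */
typedef struct { const int32_t *decomp; } prop_ptr_t;

/* Index-based entry: a 16-bit offset into the shared pool instead. */
typedef struct { uint16_t decomp_idx; } prop_idx_t;

/* Sentinel offset meaning "this codepoint has no decomposition". */
#define NO_DECOMP UINT16_MAX

/* Turn an index-based entry back into a pointer at lookup time. */
const int32_t *lookup_decomp(const prop_idx_t *p) {
    return p->decomp_idx == NO_DECOMP ? NULL : &sequences[p->decomp_idx];
}
```

On a typical 64-bit platform each index-based entry here takes 2 bytes where the pointer-based one takes 8. Integer offsets also need no load-time relocation, which can reduce the size of the compiled library further; how much is saved in practice depends on the table layout and the platform.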
Peter Colberg
9b7184ec56 Update Unicode data
Fixes Travis builds on Ubuntu 12.04 LTS with Ruby 1.9.3-p551.
2015-10-29 19:41:16 -04:00
Jiahao Chen
cfa7c96003 Update Unicode data 2015-06-29 16:43:07 -04:00
Jiahao Chen (陈家豪)
1cc58b2bc9 Updated Unicode 8 data - now sorted internally by data generator 2015-06-26 12:12:13 -04:00
Jiahao Chen
b14ca2be57 Update Unicode data 2015-06-26 12:01:27 -04:00
Steven G. Johnson
6a7f92da64 fix #46 (make sure symbol-like codepoints have nonzero width even if they aren't in Unifont) 2015-06-24 14:07:15 -04:00
Jiahao Chen
92bc19fbe0 Updated data file to Unicode 8.0.0 2015-06-23 16:18:35 -04:00
Tony Kelman
0a818c7003 Prefix other C99 typedefs with utf8proc_ 2015-04-06 22:36:33 -07:00
Steven G. Johnson
a4c84d2063 fix #2: add charwidth function 2015-03-12 12:10:19 -04:00
Steven G. Johnson
397a1eabea update graphemes for Unicode 7, add utf8proc_grapheme_break function 2014-12-12 16:30:31 -05:00
Jiahao Chen
b81326e82f Update utf8proc_data.c (generated by data_generator.rb) 2014-07-18 10:46:11 -04:00
Steven G. Johnson
ab9520d188 import of utf8proc-v1.1.6 2014-07-15 15:29:52 -04:00