Smaller tables (#68)

* convert sequences to utf-16 (saves 25kb)
* store sequence length in properties instead of using -1 termination (saves 10kb)
* cache index for slightly faster data creation
* store lower/upper/title mapping in sequence array (saves 25kb). Add utf8proc_totitle, as title_mapping can no longer be used to get the title codepoint. Rename xxx_mapping to xxx_seqindex, so programs assuming a value with the old meaning fail at compile time
* change combination array data type to uint16 (saves 40kb)
* merge 1st and 2nd comb index (saves 50kb)
* kill empty prefix/suffix in combination array (saves 50kb)
* there was no need to have a separate combination start array, it can be merged into a single array
* some fixes
* mark the table as const again
* and regen
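
The first two items (UTF-16 storage and an explicit length in place of a -1 terminator) can be illustrated with a small sketch. This is not the layout utf8proc actually generates: `demo_sequences`, `demo_property`, `seqlen`, and `demo_decode` are hypothetical names invented for illustration, and the real tables may pack the length differently.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical shared pool of UTF-16 code units (NOT utf8proc's real table).
 * 16-bit units halve the storage of 32-bit code points for BMP characters;
 * code points above U+FFFF cost two units (a surrogate pair). */
static const uint16_t demo_sequences[] = {
    0x0053, 0x0053, /* offset 0: "SS", two BMP code points    */
    0xD801, 0xDC00  /* offset 2: U+10400 as a surrogate pair  */
};

/* Hypothetical property record: the sequence is located by offset and
 * length, so no -1 terminator has to be stored after each sequence. */
typedef struct {
    uint16_t seqindex; /* offset into demo_sequences            */
    uint16_t seqlen;   /* length in UTF-16 code units           */
} demo_property;

/* Decode the stored sequence into code points.
 * Writes at most `cap` code points and returns how many were written. */
size_t demo_decode(const demo_property *p, int32_t *out, size_t cap)
{
    size_t written = 0;
    size_t i = 0;
    while (i < p->seqlen && written < cap) {
        uint32_t unit = demo_sequences[p->seqindex + i++];
        /* A high surrogate followed by another unit forms one code point. */
        if (unit >= 0xD800 && unit < 0xDC00 && i < p->seqlen) {
            uint32_t low = demo_sequences[p->seqindex + i++];
            unit = 0x10000 + ((unit - 0xD800) << 10) + (low - 0xDC00);
        }
        out[written++] = (int32_t)unit;
    }
    return written;
}
```

Under these assumptions, a record `{0, 2}` decodes to two code points (U+0053 twice), while `{2, 2}` decodes to the single code point U+10400. Because a stored index is no longer a code point, callers that want the mapped character go through functions such as `utf8proc_totitle` rather than reading the property field directly, which is the change visible in the diff below.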
2016-07-12 17:51:50 +02:00
parent 9a0b87b57e
commit eeebf70bcf
5 changed files with 9177 additions and 11760 deletions
--- a/test/printproperty.c
+++ b/test/printproperty.c
@@ -22,8 +22,7 @@ int main(int argc, char **argv)
                 "  uppercase_mapping = %x\n"
                 "  lowercase_mapping = %x\n"
                 "  titlecase_mapping = %x\n"
-                 "  comb1st_index = %d\n"
-                 "  comb2nd_index = %d\n"
+                 "  comb_index = %d\n"
                 "  bidi_mirrored = %d\n"
                 "  comp_exclusion = %d\n"
                 "  ignorable = %d\n"
@@ -35,11 +34,10 @@ int main(int argc, char **argv)
                 p->combining_class,
                 p->bidi_class,
                 p->decomp_type,
-                 p->uppercase_mapping,
-                 p->lowercase_mapping,
-                 p->titlecase_mapping,
-                 p->comb1st_index,
-                 p->comb2nd_index,
+                 utf8proc_toupper(c),
+                 utf8proc_tolower(c),
+                 utf8proc_totitle(c),
+                 p->comb_index,
                 p->bidi_mirrored,
                 p->comp_exclusion,
                 p->ignorable,