Commit Graph

14 Commits

Author SHA1 Message Date
Mike Glorioso
610730f231
Fix Sign-Conversion warnings in library and test code (#214)
* JuliaStrings#169 turn on sign-conversion warnings

Signed-off-by: Mike Glorioso <mike.glorioso@gmail.com>

* JuliaStrings#169 fix sign-conversion warnings for utf8proc.c

fix sign-conversion warnings for utf8proc_iterate
uc requires at most 21 bits to identify a unicode codepoint, so there is no need for it to be unsigned
multiple locations use, modify, or store uc with a signed value
the only exception is line 137 where uc is compared with an unsigned value

fix sign-conversion warnings for utf8proc_tolower, utf8proc_toupper, utf8proc_totitle
all three methods have sign conversion warnings when calling seqindex_decode_index
seqindex_decode_index uses the passed value as an index to an array utf8proc_sequences
as utf8proc_sequences is hard-coded and smaller than 2^31 - 1 we can safely cast to unsigned

fix sign-conversion warnings for utf8proc_decompose_char
lines with this warning use the defined function utf8proc_decompose_lump
in the function, a hardcoded unsigned value (1<<12) is complemented then cast as a signed value
as the intent is to remove the 12th bit flag from options, a signed value, an explicit cast is safe
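
A minimal sketch of the flag-clearing cast described above, not the actual utf8proc code. The names `example_option_t`, `EXAMPLE_LUMP`, and `clear_lump` are hypothetical; the only details taken from the commit message are that the flag is bit 12 (`1<<12`) and that the options word is signed.

```c
#include <assert.h>

/* Hypothetical stand-ins, not utf8proc's real declarations. */
typedef int example_option_t;      /* signed options word */
#define EXAMPLE_LUMP (1u << 12)    /* unsigned flag occupying bit 12 */

/* Clear the flag bit without triggering -Wsign-conversion. */
static example_option_t clear_lump(example_option_t options) {
    assert(options >= 0);
    /* The complement is computed in unsigned arithmetic, then cast back
     * explicitly. Converting the mask to a signed type is
     * implementation-defined before C23; on two's-complement targets it
     * yields a value with every bit set except bit 12. */
    return options & (example_option_t)~EXAMPLE_LUMP;
}
```

Under these assumptions, `clear_lump(0x1003)` would leave only the low two bits set, and an options word without the flag passes through unchanged.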

fix sign-conversion warnings for utf8proc_map_custom
result is declared as signed, but is only expected to contain values between 0 and 4
sizeof returns an unsigned value. result must be cast to unsigned
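
The signed-times-`sizeof` case can be sketched as follows. This is an illustration under assumptions, not the code from utf8proc_map_custom: `buffer_bytes` is a hypothetical helper, and the trailing `+ 1` for a terminator byte is assumed for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: bytes needed to hold `count` 32-bit codepoints
 * plus one terminator byte. `count` is signed, sizeof is unsigned, so
 * the conversion is made explicit once the value is known to be
 * non-negative. */
static size_t buffer_bytes(ptrdiff_t count) {
    assert(count >= 0);
    return (size_t)count * sizeof(int32_t) + 1;
}
```

Checking the sign before the cast is what makes the conversion safe; a negative count cast to `size_t` would silently become a very large allocation size.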

Signed-off-by: Mike Glorioso <mike.glorioso@gmail.com>

* JuliaStrings#169 fix sign-conversion warnings for test/*

fix sign-conversion warnings for test/tests.c encode
change type for d to match return value of utf8proc_encode_char

fix sign-conversion warnings for test/graphemetest.c checkline
si, i, and j are unsigned size types, utf8proc_map and utf8proc_iterate accept and return signed size types
utf8proc_map treats negative strlen values as 0. the strlen used by the test must be similarly limited
utf8proc_iterate treats negative strlen values as 4 which will be less than the unsigned size
fix unused-but-set-variable warning by checking the glen value

fix sign-conversion warnings for test/case.c main
the if block ensures that tested codepoint fits in wint_t, but needs to include u and l as well
c, u, and l can be safely cast to wint_t

fix sign-conversion warnings for test/iterate.c
all values used for len are below 8, so an explicit cast is safe
updated types for more portable test code

fix sign-conversion warnings for test/printproperty.c main
change type of c to signed to resolve all sign-conversion warnings.
replace sscanf(... &c) with sscanf(... &x) followed by explicit sign conversion
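
A sketch of the sscanf pattern described above, assuming a hex conversion. `parse_codepoint` is a hypothetical helper and the 0x10FFFF range check is an addition for the example, not something the commit message states. `%x` requires a pointer to `unsigned int`, so the value is read into an unsigned temporary and converted explicitly.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: parse hex text into a signed codepoint.
 * Returns 1 on success, 0 on parse failure or out-of-range input. */
static int parse_codepoint(const char *text, int32_t *out) {
    unsigned int x;
    assert(text != NULL && out != NULL);
    if (sscanf(text, "%x", &x) != 1)
        return 0;               /* no hex digits were read */
    if (x > 0x10FFFF)
        return 0;               /* beyond the Unicode codepoint range */
    *out = (int32_t)x;          /* explicit, range-checked sign conversion */
    return 1;
}
```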

Signed-off-by: Mike Glorioso <mike.glorioso@gmail.com>
2021-01-14 12:59:49 -05:00
Steven G. Johnson
8239639e3f fix NULL args in grapheme_break_stateful 2020-12-15 15:26:56 -05:00
Steven G. Johnson
0643a64479
Fix grapheme breaks on string-initial (#205)
* Fix extended emoji + zwj combo

* Patch initial repeated regional flags and extended+zwj emoji

* Merge conditions for setting breaks bt region

* updated fix

* perform tests for both utf8proc_map and manual calls to utf8proc_grapheme_break_stateful

* consolidate tests

Co-authored-by: Thomas Marks <marksta@umich.edu>
2020-11-23 14:10:29 -05:00
Steven G. Johnson
c6858e955c
use unsigned char more consistently, silence -Wextra compiler warnings (#188) 2020-03-29 10:44:42 -04:00
Steven G. Johnson
11bb3d9dc7 fix grapheme test to work on unmodified data file 2020-03-29 08:53:11 -04:00
Steven G. Johnson
02fb59136d silence warning (closes #184) 2020-03-28 14:00:30 -04:00
Steven G. Johnson
6fff5f32bb
compile more tests on Windows (#183)
* compile more tests on Windows

* still disable charwidth tests

* silence warnings on MSVC about sscanf

* whoops

* silence warning
2020-03-28 10:00:18 -04:00
Steven G. Johnson
5f15b515e1 simplifications 2020-03-28 09:42:29 -04:00
Steven G. Johnson
d588d7097c portable getline replacement (closes #182) 2020-03-28 09:36:58 -04:00
Steven G. Johnson
4603e00cfc
fix CHARBOUND option for non-characters (#149) 2019-03-30 15:22:25 -04:00
Scott Paul Jones
6249e6b8b1 Fix #34 handle 66 Unicode non-characters, also improve performance and surrogate handling 2015-05-29 19:50:03 +02:00
Tony Kelman
0a818c7003 Prefix other C99 typedefs with utf8proc_ 2015-04-06 22:36:33 -07:00
Tony Kelman
ad27722923 Use a new typedef utf8proc_ssize_t to avoid define collisions with MSVC 2015-04-05 20:06:13 -07:00
Steven G. Johnson
90721f2d39 directory cleanup: move tests and data into subdirectories 2015-03-06 17:36:08 -05:00