Commit Graph

274 Commits

Author SHA1 Message Date
Randy
c17ea5dfef
OSS-Fuzz initial integration (#216)
* add fuzz target

* update fuzzer

* add fuzzer to build with basic entry point

* add build script

* cleanup

* build fuzz target using cmake in oss-fuzz env

* ossfuzz.sh add newline

* update build
2021-01-29 13:54:58 -05:00
Mike Glorioso
610730f231
Fix Sign-Conversion warnings in library and test code (#214)
* JuliaStrings#169 turn on sign-conversion warnings

Signed-off-by: Mike Glorioso <mike.glorioso@gmail.com>

* JuliaStrings#169 fix sign-conversion warnings for utf8proc.c

fix sign-conversion warnings for utf8proc_iterate
uc requires at most 21 bits to identify a unicode codepoint, so there is no need for it to be unsigned
multiple locations use, modify, or store uc with a signed value
the only exception is line 137 where uc is compared with an unsigned value

fix sign-conversion warnings for utf8proc_tolower, utf8proc_toupper, utf8proc_totitle
all three methods have sign conversion warnings when calling seqindex_decode_index
seqindex_decode_index uses the passed value as an index to an array utf8proc_sequences
as utf8proc_sequences is hard-coded and smaller than 2^31 - 1 we can safely cast to unsigned

fix sign-conversion warnings for utf8proc_decompose_char
lines with this warning use the defined function utf8proc_decompose_lump
in the function, a hardcoded unsigned value (1<<12) is complemented then cast as a signed value
as the intent is to remove the 12th bit flag from options, a signed value, an explicit cast is safe

fix sign-conversion warnings for utf8proc_map_custom
result is declared as signed, but is only expected to contain values between 0 and 4
sizeof returns an unsigned value. result must be cast to unsigned

Signed-off-by: Mike Glorioso <mike.glorioso@gmail.com>

* JuliaStrings#169 fix sign-conversion warnings for test/*

fix sign-conversion warnings for test/tests.c encode
change type for d to match return value of utf8proc_encode_char

fix sign-conversion warnings for test/graphemetest.c checkline
si, i, and j are unsigned size types, utf8proc_map and utf8proc_iterate accept and return signed size types
utf8proc_map treats negative strlen values as 0. the strlen used by the test must be similarly limited
utf8proc_iterate treats negative strlen values as 4 which will be less than the unsigned size
fix unused-but-set-variable warning by checking the glen value

fix sign-conversion warnings for test/case.c main
the if block ensures that tested codepoint fits in wint_t, but needs to include u and l as well
c, u, and l can be safely cast to wint_t

fix sign-conversion warnings for test/iterate.c
all values used for len are below 8, so an explicit cast is safe
updated types for more portable test code

fix sign-conversion warnings for test/printproperty.c main
change type of c to signed to resolve all sign-conversion warnings.
replace sscanf(... &c) with sscanf(... &x) followed by explicit sign conversion

Signed-off-by: Mike Glorioso <mike.glorioso@gmail.com>
2021-01-14 12:59:49 -05:00
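The two casting patterns described in the commit message above (clearing a flag bit from a signed options value, and comparing a signed result against an unsigned size) can be sketched as follows. This is a minimal illustration, not the code from the commit: `EXAMPLE_FLAG_BIT12`, `clear_bit12`, and `fits_in_buffer` are hypothetical names, and the sketch assumes a two's-complement target where converting the complemented mask to a signed type keeps the bit pattern.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag for illustration: bit 12 of a signed options word. */
#define EXAMPLE_FLAG_BIT12 (1u << 12)

/* Clear bit 12 from a signed options value.
 * The complement is computed in unsigned arithmetic and then cast
 * explicitly, so -Wsign-conversion has nothing implicit to flag.
 * Assumption: two's-complement conversion preserves the bit pattern. */
static int32_t clear_bit12(int32_t options) {
    return options & (int32_t)~EXAMPLE_FLAG_BIT12;
}

/* Compare a signed byte count against an unsigned buffer size.
 * Negative values are rejected first, so the explicit cast to size_t
 * only ever sees a non-negative value. */
static int fits_in_buffer(ptrdiff_t result, size_t buf_size) {
    if (result < 0) {
        return 0;
    }
    return (size_t)result <= buf_size;
}
```

Checking the sign before the cast is the design point here: the cast silences the warning, and the preceding range check is what makes it safe.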
Steven G. Johnson
0520d6f724 download test data to build directory (fixes #212) 2020-12-19 13:08:34 -05:00
Steven G. Johnson
f1f51b8242
ensure ruby is in UTF-8 mode (#209)
* ensure ruby is in UTF-8 mode

* Revert "ensure ruby is in UTF-8 mode"

This reverts commit 587b7b6b7215f91b1ae52aefc82d359f2f378a61.

* ensure Ruby reads files in UTF-8 encoding
2020-12-17 18:36:28 -05:00
Steven G. Johnson
3203baa737 fix manifest 2020-12-15 16:36:45 -05:00
Steven G. Johnson
28416640ed 2.6.1 version bump 2020-12-15 15:29:32 -05:00
Steven G. Johnson
8239639e3f fix NULL args in grapheme_break_stateful 2020-12-15 15:26:56 -05:00
Steven G. Johnson
df2997a300 update doxygen config with doxygen -u 2020-11-23 14:21:26 -05:00
Steven G. Johnson
cea3cd158f bump to version 2.6 2020-11-23 14:18:43 -05:00
Steven G. Johnson
0643a64479
Fix grapheme breaks on string-initial (#205)
* Fix extended emoji + zwj combo

* Patch initial repeated regional flags and extended+zwj emoji

* Merge conditions for setting breaks bt region

* updated fix

* perform tests for both utf8proc_map and manual calls to utf8proc_grapheme_break_stateful

* consolidate tests

Co-authored-by: Thomas Marks <marksta@umich.edu>
2020-11-23 14:10:29 -05:00
Tim Gates
6f7d73071a
docs: fix simple typo, encounted -> encountered (#201)
There is a small typo in utf8proc.h.

Should read `encountered` rather than `encounted`.
2020-10-09 08:30:50 -04:00
Steven G. Johnson
5622a0a51b
add islower/isupper functions (#196)
* add islower/isupper functions

* added test

* more tests + bugfix

* Makefile fix

* rm iscase test on make clean
2020-08-25 16:42:59 -04:00
xkszltl
08f9999a06
Switch to HTTPS for referencing www.unicode.org. (#193)
Resolve https://github.com/JuliaStrings/utf8proc/issues/192
2020-05-25 10:20:08 -04:00
Stefan Floeren
b5211c88af
Unify include file handling (#190)
The cmake file expects the parent folder to be named "utf8proc",
otherwise the target_include_directories won't work, as it references
an unknown path.

This deviates from the install targets (both cmake and makefile) in
putting the include file into a subfolder in contrast to the top level
folder. This also prevents using the library with the recent cmake
addition of FetchContent.

This change unifies the include file handling by using the local path
for cmake as well.

This might break existing uses. As a workaround, we could add a dummy
include file in the old location (new utf8proc subfolder). I'm not sure
if that is necessary.

Co-authored-by: Stefan Floeren <stefan-floeren@users.noreply.github.com>
2020-04-13 10:59:30 -04:00
Andreas-Schniertshauer
e51f416e0c
Fix memory leaks in tests case.c and misc.c (#189)
* Add: tests to CMakeLists.txt

* Disable compilation of charwidth, graphemetest and normtest because of missing getline

* Refactoring: UTF8PROC_ENABLE_TESTING default Off, move tests that don't compile on windows to NOT MSVC section, add testing to appveyor.yml

* Add: testing to travis

* Changed: flag to WIN32 because MinGW has the same problem as MSVC

* Commented out graphemetest and normtest because they fail.

* Re-added: graphemetest and normtest added missing data to the path of the text files.

* Fix: last commit was partly wrong, normtest failed.

* Commented out graphemetest and normtest because they fail; CMakeLists is missing the building of data.

* Add: mingw_static, mingw_shared, msvc_shared, msvc_static to ignore list

* Add: prefix utf8proc. to tests

* Fix: memory leaks in tests case.c and misc.c forgot to call free after calling utf8proc_NFKC_Casefold

Co-authored-by: Andreas-Schniertshauer <Andreas-Schniertshauer@users.noreply.github.com>
2020-03-30 07:51:44 -04:00
Steven G. Johnson
ffba678bf4
Revert "disable tests under mingw" (#187)
This reverts commit 7e834d7702.
2020-03-29 10:48:42 -04:00
Steven G. Johnson
c6858e955c
use unsigned char more consistently, silence -Wextra compiler warnings (#188) 2020-03-29 10:44:42 -04:00
Steven G. Johnson
243875b456 fixes 2020-03-29 09:35:32 -04:00
Steven G. Johnson
f645f2a700 add build to gitignore, make paths absolute (closes #185) 2020-03-29 09:01:04 -04:00
Steven G. Johnson
11bb3d9dc7 fix grapheme test to work on unmodified data file 2020-03-29 08:53:11 -04:00
Steven G. Johnson
7e834d7702 disable tests under mingw 2020-03-28 21:25:42 -04:00
Andreas-Schniertshauer
98142acff9
Download data and execute commented out tests (#178)
* Add: tests to CMakeLists.txt

* Disable compilation of charwidth, graphemetest and normtest because of missing getline

* Refactoring: UTF8PROC_ENABLE_TESTING default Off, move tests that don't compile on windows to NOT MSVC section, add testing to appveyor.yml

* Add: testing to travis

* Changed: flag to WIN32 because MinGW has the same problem as MSVC

* Commented out graphemetest and normtest because they fail.

* Re-added: graphemetest and normtest added missing data to the path of the text files.

* Fix: last commit was partly wrong, normtest failed.

* Commented out graphemetest and normtest because they fail; CMakeLists is missing the building of data.

* Add: mingw_static, mingw_shared, msvc_shared, msvc_static to ignore list

* Add: downloading data and executing enabled tests that depend on the downloaded data.

* Fix: windows line endings CRLF replaced with linux LF

* Refactoring: (major) set UNICODE_VERSION to 13.0.0, replace curl with file DOWNLOAD, removed downloading unnecessary files, enabled normtest.

* Fix: woodhead error in revision adeac82ec9941667e3c3ad7f50769793547218c3 readded calling execute_process to strip GraphemeBreakTest.txt file

* Add: removing the no-longer-used file data/GraphemeBreakTestOrg.txt after stripping.

* Add: testing folder to ignore list

* Add: enabled graphemetest

* Update .gitignore

Co-authored-by: Andreas-Schniertshauer <Andreas-Schniertshauer@users.noreply.github.com>
Co-authored-by: Steven G. Johnson <stevenj@mit.edu>
2020-03-28 17:16:35 -04:00
Steven G. Johnson
14c61c9683 Merge branch 'master' of https://github.com/JuliaLang/utf8proc 2020-03-28 14:00:43 -04:00
Steven G. Johnson
02fb59136d silence warning (closes #184) 2020-03-28 14:00:30 -04:00
Andreas-Schniertshauer
864f7f7b46
Tests with prefix utf8proc. (#177)
* Add: tests to CMakeLists.txt

* Disable compilation of charwidth, graphemetest and normtest because of missing getline

* Refactoring: UTF8PROC_ENABLE_TESTING default Off, move tests that don't compile on windows to NOT MSVC section, add testing to appveyor.yml

* Add: testing to travis

* Changed: flag to WIN32 because MinGW has the same problem as MSVC

* Commented out graphemetest and normtest because they fail.

* Re-added: graphemetest and normtest added missing data to the path of the text files.

* Fix: last commit was partly wrong, normtest failed.

* Commented out graphemetest and normtest because they fail; CMakeLists is missing the building of data.

* Add: mingw_static, mingw_shared, msvc_shared, msvc_static to ignore list

* Add: prefix utf8proc. to tests

* Add: prefix utf8proc. to tests

Co-authored-by: Andreas-Schniertshauer <Andreas-Schniertshauer@users.noreply.github.com>
Co-authored-by: Steven G. Johnson <stevenj@mit.edu>
2020-03-28 10:31:27 -04:00
Steven G. Johnson
6fff5f32bb
compile more tests on Windows (#183)
* compile more tests on Windows

* still disable charwidth tests

* silence warnings on MSVC about sscanf

* whoops

* silence warning
2020-03-28 10:00:18 -04:00
Steven G. Johnson
5f15b515e1 simplifications 2020-03-28 09:42:29 -04:00
Steven G. Johnson
d588d7097c portable getline replacement (closes #182) 2020-03-28 09:36:58 -04:00
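Several entries in this log mention tests that could not be compiled on Windows because POSIX `getline` is missing. One common way to avoid that dependency is a fixed-buffer reader built on standard C `fgets`. The sketch below shows the general idea only; it is not the code from this commit, and `read_line` is a hypothetical name.

```c
#include <limits.h>
#include <stdio.h>
#include <string.h>

/* Read one line from f into buf, keeping the trailing '\n' if present.
 * At most bufsize - 1 bytes are stored, followed by a terminating NUL.
 * Returns the number of bytes stored, or 0 at end of file or on error.
 * Lines longer than the buffer are returned in pieces across calls. */
static size_t read_line(char *buf, size_t bufsize, FILE *f) {
    if (bufsize == 0) {
        return 0;
    }
    if (bufsize > (size_t)INT_MAX) {
        bufsize = (size_t)INT_MAX; /* fgets takes an int length */
    }
    if (fgets(buf, (int)bufsize, f) == NULL) {
        return 0;
    }
    return strlen(buf);
}
```

Unlike `getline`, this version never allocates, so the caller chooses the maximum line length up front. That trade-off suits test programs that read data files with known, bounded line lengths.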
Steven G. Johnson
0890a538bf new emoji-data.txt location (fixes #181) 2020-03-27 20:36:18 -04:00
Steven G. Johnson
0ff48bfbfd update 2020-03-27 18:38:44 -04:00
Steven G. Johnson
1ee551c85b whoops, generated from old tables 2020-03-27 18:35:20 -04:00
Steven G. Johnson
189dc0e981 link fixes 2020-03-27 17:32:42 -04:00
Steven G. Johnson
2bb7d884b5 version bump to 2.5 2020-03-27 17:22:21 -04:00
Steven G. Johnson
b48f5d074f
Unicode 13 support (#179)
* exclude Sk from zero-width chars (closes #167)

* update for Unicode 13
2020-03-27 17:06:06 -04:00
Andreas-Schniertshauer
47edf655b3
Add: tests to CMakeLists.txt (#173)
* Add: tests to CMakeLists.txt

* Disable compilation of charwidth, graphemetest and normtest because of missing getline

* Refactoring: UTF8PROC_ENABLE_TESTING default Off, move tests that don't compile on windows to NOT MSVC section, add testing to appveyor.yml

* Add: testing to travis

* Changed: flag to WIN32 because MinGW has the same problem as MSVC

* Commented out graphemetest and normtest because they fail.

* Re-added: graphemetest and normtest added missing data to the path of the text files.

* Fix: last commit was partly wrong, normtest failed.

* Commented out graphemetest and normtest because they fail; CMakeLists is missing the building of data.
2020-02-19 14:25:19 -05:00
Stefan Weil
20672dba69 Fix some typos (found by codespell) (#160)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2019-07-11 14:04:34 -04:00
Steven G. Johnson
0dc0fcf8db slight clarification 2019-05-14 11:00:08 -04:00
Steven G. Johnson
e6fba4aa8c update header file comments (closes #157) 2019-05-14 10:53:55 -04:00
Steven G. Johnson
5c632c5742 NEWS for 2.4, updated version numbers (which I forgot in 2.3, grrr) 2019-05-10 21:24:14 -04:00
Steven G. Johnson
9bb261f66b ignore vscode 2019-05-10 21:13:32 -04:00
GOTOH Shunsuke
7b28b9e60c update for unicode 12.1 (#156) 2019-05-10 21:12:45 -04:00
Steven G. Johnson
5d902fa1aa
typo 2019-05-08 11:24:51 -04:00
Steven G. Johnson
46b87f061e
more info 2019-05-08 11:24:27 -04:00
Steven G. Johnson
d416617270
note official releases 2019-05-08 11:22:37 -04:00
Steven G. Johnson
1fcb211035
more compile info 2019-05-08 11:20:25 -04:00
Michael Osipov
229fb8483e Document HP-UX build support (#155) 2019-05-07 20:02:19 -04:00
Michael Osipov
e1f8c728bb Improve portability of Make (#154)
Several options passed to $(CC) are not portable, e.g., for HP aCC.
Move them to variables.
2019-05-07 20:00:21 -04:00
past-due
416749803b [CMake] Add UTF8PROC_NO_INSTALL option (#152)
* [CMake] Add UTF8PROC_NO_INSTALL option

* change to UTF8PROC_INSTALL
2019-04-17 14:49:04 -04:00
Steven G. Johnson
e57cb43f2c
copyright year update 2019-04-09 15:10:20 -04:00
Steven G. Johnson
e1b05f7be3
fontforge is no longer needed 2019-04-09 15:09:57 -04:00