utf8proc
utf8proc is a small, clean C library that provides Unicode normalization, case-folding, and other operations for data in the UTF-8 encoding. It was initially developed by Jan Behrens and the rest of the Public Software Group, who deserve nearly all of the credit for this package. With the blessing of the Public Software Group, the Julia developers have taken over development of utf8proc, since the original developers have moved to other projects.
(utf8proc is used for basic Unicode support in the Julia language, and the Julia developers became involved because they wanted to add Unicode 7 support and other features.)
(The original utf8proc package also includes Ruby and PostgreSQL plug-ins. We removed those from utf8proc in order to focus exclusively on the C library for the time being, but plan to add them back in or release them as separate packages.)
The utf8proc package is licensed under the
free/open-source MIT "expat"
license (plus certain Unicode
data governed by the similarly permissive Unicode data
license); please see
the included LICENSE.md file for more detailed information.
Quick Start
Typical users should download a utf8proc release rather than cloning directly from GitHub.
To compile the C library, run make. You can also install the library and header file with make install (by default into /usr/local/lib and /usr/local/include, but this can be changed by make prefix=/some/dir). make check runs some tests, and make clean deletes all of the generated files.
Alternatively, you can compile with cmake, e.g. by
mkdir build
cd build
cmake ..
make
Using other compilers
The included Makefile supports GNU/Linux flavors and macOS with gcc-like compilers; Windows users will typically use cmake.
For other Unix-like systems and other compilers, you may need to pass modified settings to make in order to use the correct compilation flags for building shared libraries on your system.
For HP-UX with HP's aCC compiler and GNU Make (installed as gmake), you can compile with
gmake CC=/opt/aCC/bin/aCC CFLAGS="+O2" PICFLAG="+z" C99FLAG="-Ae" WCFLAGS="+w" LDFLAG_SHARED="-b" SOFLAG="-Wl,+h"
To run gmake install you will need GNU coreutils for the install command, and you may want to pass prefix=/opt libdir=/opt/lib/hpux32 or similar to change the installation location.
General Information
The C library is found in this directory after successful compilation
and is named libutf8proc.a (for the static library) and
libutf8proc.so (for the dynamic library).
The Unicode version supported is 13.0.0.
For Unicode normalizations, the following options are used:
- Normalization Form C: STABLE, COMPOSE
- Normalization Form D: STABLE, DECOMPOSE
- Normalization Form KC: STABLE, COMPOSE, COMPAT
- Normalization Form KD: STABLE, DECOMPOSE, COMPAT
C Library
The documentation for the C library is found in the utf8proc.h header file.
utf8proc_map is the function you will most likely use for mapping UTF-8
strings, unless you want to allocate memory yourself.
To Do
See the GitHub issues list.
Contact
Bug reports, feature requests, and other queries can be filed at the utf8proc issues page on GitHub.
See also
An independent Lua translation of this library, lua-mojibake, is also available.
