250 Commits

Author SHA1 Message Date
Steven G. Johnson
d81308faba uppercase mapping ß (U+00df) to ẞ (U+1E9E) (#134)
* uppercase(0x00df) = 0x1e9e

* tests for titlecase and u+00df uppercase

* NEWS, another test
2018-05-02 14:18:26 -04:00
Steven G. Johnson
8639450134 NEWS for upcoming 2.2 release, version bump 2018-05-02 08:23:40 -04:00
Steven G. Johnson
bdc8b9e4b2 Case folding fixes (#133)
* Fixes allowing for “Full” folding and NFKC_CaseFold compliance.

* Only include C (Common) and F (Full) foldings from CaseFolding.txt. Removed S (Simple) since F & S are specified to be exclusive.
* Extend UTF8PROC_IGNORE to also ignore unassigned codepoints (such as \u2065) which are specified as being discarded by NFKC_CF.

* Document the changes to UTF8PROC_IGNORE in header.

* Add NFKC_CF helper function with documentation.

* restore old IGNORE behavior, add UTF8PROC_STRIPNA, rename to utf8proc_NFKC_Casefold, add a test

* success message

* test that IGNORE does not strip NA

* data update

* NFKC_Casefold shouldn't strip NA
2018-05-02 08:15:02 -04:00
past-due
48949bd3eb Static library support improvements (#123)
* `#define UTF8PROC_STATIC` to disable DLLEXPORT

* [CMake] Automatically define UTF8PROC_STATIC if BUILD_SHARED_LIBS is off

* [Makefile] Support additional UTF8PROC_DEFINES, which can be used to specify flags like `-DUTF8PROC_STATIC`
2018-04-29 21:37:12 -04:00
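
The static-library commit above describes a define that turns the export decoration off. The general pattern can be sketched as follows. This is only an illustration under assumptions: `MYLIB_STATIC`, `MYLIB_EXPORTS`, `MYLIB_DLLEXPORT`, and `mylib_answer` are hypothetical names, not utf8proc's actual header contents.

```c
/* Hypothetical library "mylib": sketch of an export macro that a
 * static build can switch off. Names are illustrative only. */
#define MYLIB_STATIC /* pretend the build system passed -DMYLIB_STATIC */

#ifdef MYLIB_STATIC
#  define MYLIB_DLLEXPORT /* static build: no export decoration at all */
#elif defined(_WIN32)
#  ifdef MYLIB_EXPORTS
#    define MYLIB_DLLEXPORT __declspec(dllexport) /* building the DLL */
#  else
#    define MYLIB_DLLEXPORT __declspec(dllimport) /* consuming the DLL */
#  endif
#elif defined(__GNUC__) && __GNUC__ >= 4
#  define MYLIB_DLLEXPORT __attribute__((visibility("default")))
#else
#  define MYLIB_DLLEXPORT
#endif

/* With MYLIB_STATIC defined, this is an ordinary function definition. */
MYLIB_DLLEXPORT int mylib_answer(void)
{
    return 42;
}
```

Defining the switch in the build system rather than in source keeps every translation unit consistent, which is presumably why the commit wires it to `BUILD_SHARED_LIBS` and to an extra Makefile variable.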
past-due
fe3f6bda11 [CMake] Use target_compile_definitions to avoid affecting global definitions (#121) 2018-04-27 12:55:53 -04:00
Steven G. Johnson
b945eddb2c note Unicode 10 support 2018-04-27 12:51:07 -04:00
Steven G. Johnson
d736adeff1 update to unicode 10 (#132) 2018-04-27 12:50:19 -04:00
Steven G. Johnson
d688ac1226 version bump to 2.1.1 (#131) 2018-04-27 09:58:34 -04:00
Steven G. Johnson
3e6230d9bf fix make clean 2018-04-27 09:30:37 -04:00
Steven G. Johnson
ba042cf728 missing return code, success message in test/misc.c 2018-04-27 09:10:38 -04:00
Steven G. Johnson
d050c4636a make internal function static 2018-04-27 08:57:54 -04:00
Steven G. Johnson
53d7968055 added test for #128 2018-04-27 08:46:44 -04:00
Benito van der Zander
acc204f1f1 possible fix for #128 (#129)
Does this help? I do not really remember what I wrote back then
2018-04-27 08:06:14 -04:00
Ryan Schmidt
6a20831a07 Use LDFLAGS when building libutf8proc.dylib (#125) 2018-04-18 07:50:15 -07:00
Branko Čibej
3a10df6013 Fix declaration-after-statement warning when compiling in strict C90 mode. (#113) 2017-09-21 12:27:24 -04:00
Christopher Baker
2a2f97e193 Update documentation to reflect Unicode 9.0.0. (#107)
This makes the inline documentation match the README.
2017-06-08 09:29:54 -07:00
Paul Smith
95fc75b839 Ensure generated const data tables are hidden via "static" (#100) 2017-02-19 17:33:25 -05:00
Jameson Nash
91b91fe033 don't set MAKE variable in Makefile (#99)
fix #95
2017-02-18 10:14:45 -05:00
Árpád Goretity 
31a8788886 removed inclusion of non-portable header file (#94) 2017-01-14 08:12:29 -05:00
Michael Hatherly
eab97d16fb Don't use cached version of UnicodeData.txt (#92)
Ref: https://github.com/JuliaLang/julia/pull/19725, UnicodeData.txt is
now being cached in JuliaLang/julia's build.
2017-01-03 16:44:23 -08:00
Steven G. Johnson
a7f3a3212a fix typo in NEWS date 2016-12-26 16:01:42 -05:00
Steven G. Johnson
40e605959e version 2.1 release 2016-12-26 15:52:48 -05:00
Steven G. Johnson
6271fb97c0 update NEWS [ci skip] 2016-12-11 16:42:24 -05:00
Steven G. Johnson
15e1819cdd update to unifont 9.0.04 2016-12-11 16:35:27 -05:00
Steven G. Johnson
4ac3154acc whoops 2016-12-11 16:18:52 -05:00
Steven G. Johnson
78f336addd use ptrdiff_t rather than ssize_t, as ssize_t is non-standard (it is POSIX, not C) 2016-12-11 16:17:11 -05:00
Steven G. Johnson
59334e4499 use stdbool.h and inttypes.h in MSVC 2013 and later, and use more C99-compatible definitions of false and true earlier (fix #90) 2016-12-11 07:16:48 -05:00
Steven G. Johnson
e46d213241 update .gitignore for custom test 2016-11-30 10:46:01 -05:00
Steven G. Johnson
b4621f43c3 new utf8proc_map_custom for hooking in user-defined custom mappings (#89)
* new utf8proc_map_custom for hooking in user-defined custom mappings

* whoops, add test program

* NEWS, version bump for 2.1

* change test functions to static so that gcc doesn't complain about missing prototypes
2016-11-30 10:40:26 -05:00
Steven G. Johnson
8da37e2892 silence MSVC warning about conversion to uint8 (fix #86) 2016-11-30 10:09:18 -05:00
Steven G. Johnson
f5567f306a typo in docstrings 2016-11-29 13:49:03 -05:00
Michael Drake
70bbed8626 Tlsa/ucs4 normalize (#88)
* Split codepoint sequence normalisation out into separate function.

This creates utf8proc_normalize_utf32() which takes and returns
a UTF-32 string, applying the following options:

- UTF8PROC_NLF2LS
- UTF8PROC_NLF2PS
- UTF8PROC_NLF2LF
- UTF8PROC_STRIPCC
- UTF8PROC_COMPOSE
- UTF8PROC_STABLE

The utf8proc_reencode() function has been updated to call the
new utf8proc_normalize_utf32().

* Update code documentation: utf8proc_reencode handles UTF8PROC_CHARBOUND.
2016-11-21 09:22:39 -05:00
Jakub Vít
caef918abd Change definition of UINT16_MAX macro (#84)
Change UINT16_MAX from `~(utf8proc_uint16_t)0` to fixed value `65535U` to prevent weird behaviour in complex expressions.
2016-09-04 14:44:38 -04:00
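
One plausible source of the "weird behaviour" mentioned in the commit above is C's integer promotion: the 16-bit zero is promoted to `int` before `~` is applied, so on platforms where `int` is wider than 16 bits the result is `-1`, not `65535`. The sketch below demonstrates this. The macro and function names are hypothetical, and it uses `uint16_t` from `<stdint.h>` in place of `utf8proc_uint16_t`.

```c
#include <stdint.h>

/* Hypothetical names for illustration; not utf8proc's actual macros. */
#define MAX16_COMPLEMENT (~(uint16_t)0) /* operand promoted to int, so this is -1 */
#define MAX16_LITERAL    65535U         /* fixed value of type unsigned int */

/* Returns 1 if x compares equal to the complement-style constant. */
int is_max_complement(uint16_t x)
{
    /* x is promoted to int (0..65535) and compared against int -1. */
    return x == MAX16_COMPLEMENT;
}

/* Returns 1 if x compares equal to the literal constant. */
int is_max_literal(uint16_t x)
{
    /* x is converted to unsigned int and compared against 65535U. */
    return x == MAX16_LITERAL;
}
```

Assuming `int` is at least 32 bits, `is_max_complement(65535)` returns 0 while `is_max_literal(65535)` returns 1, which is the kind of surprise a fixed literal avoids.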
Steven G. Johnson
ce11639220 add missing links 2016-07-27 08:04:38 -04:00
Steven G. Johnson
e3a5ed7b8b date fix in NEWS 2016-07-27 07:59:43 -04:00
Tony Kelman
8e3174f334 NEWS and version numbers for 2.0.2 (#81)
* Add NEWS.md items for #79 and #80

* Prepare version numbers for 2.0.2

* Also update API version to 2.0.2
2016-07-27 07:58:49 -04:00
Tony Kelman
0bf1973a0f use a different variable name for nested loop in bench.c (#80)
and declare it ahead of time to avoid "error: 'for' loop initial declarations are only allowed in C99 mode"
2016-07-26 17:54:17 -04:00
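
The error quoted in the commit above comes from declaring a loop counter inside the `for` statement, which C90 does not allow. A minimal sketch of the C90-compatible style follows. `grid_sum` is a hypothetical function written for this illustration, not code from `bench.c`.

```c
/* Sums i * j over an n-by-n grid using C90-style declarations. */
int grid_sum(int n)
{
    int i, j;      /* both counters declared ahead of the loops */
    int total = 0;

    for (i = 0; i < n; ++i) {
        /* The inner loop uses its own name, so it cannot clobber i. */
        for (j = 0; j < n; ++j) {
            total += i * j;
        }
    }
    return total;
}
```

Writing `for (int i = 0; ...)` instead would need C99 or later.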
Tony Kelman
47cbf7d96d Move -Wmissing-prototypes from Makefile to .travis.yml (#79)
since MSVC doesn't understand this flag, and the current
mechanism for building Julia with MSVC goes through the makefile
2016-07-16 11:16:03 +01:00
petercolberg
a1fe9955bb Convert compiler warnings to errors for Travis builds (#73) 2016-07-13 12:58:28 -04:00
Steven G. Johnson
c3d401cf06 added NEWS for #78 2016-07-13 12:42:07 -04:00
petercolberg
11b84e2de1 Use versioned Unicode data URLs (#78)
This ensures the tests keep working when a new Unicode version is released.
2016-07-13 12:40:59 -04:00
Steven G. Johnson
f0bf106569 NEWS and version bump for 2.0.1 release, to come out shortly 2016-07-13 12:39:05 -04:00
Keno Fischer
289ce5e041 Fix incorrect use of lbc instead of lbc_override (#77) 2016-07-13 12:33:50 -04:00
Tony Kelman
e0f0899eaa add appveyor badge to readme
[ci skip]
2016-07-13 09:18:40 -07:00
Steven G. Johnson
7fbb7d7dd8 NEWS update 2016-07-13 12:02:47 -04:00
Steven G. Johnson
cb2a3e464d the ABI version was already bumped in #62, does not need to be bumped again in #70 2016-07-13 11:00:17 -04:00
Steven G. Johnson
39ab2ff273 NEWS for 2.0 2016-07-13 10:57:37 -04:00
Keno Fischer
c0a1ff81fc Walk back ABI breaking changes (#76) 2016-07-13 10:41:13 -04:00
Steven G. Johnson
c02ebd5a83 update to Unifont 9 (for Unicode 9 charwidths) (#75) 2016-07-12 16:30:05 -04:00
Benito van der Zander
eeebf70bcf Smaller tables (#68)
* convert sequences to utf-16 (saves 25kb)

* store sequence length in properties instead using -1 termination (saves 10kb)

* cache index for slightly faster data creation

* store lower/upper/title mapping in sequence array (saves 25kb). Add utf8proc_totitle, as title_mapping cannot be used to get the title codepoint anymore. Rename xxx_mapping to xxx_seqindex, so programs assuming a value with the old meaning fail at compile time

* change combination array data type to uint16 (saves 40kb)

* merge 1st and 2nd comb index (saves 50kb)

* kill empty prefix/suffix in combination array (saves 50kb)

* there was no need to have a separate combination start array, it can be merged in a single array

* some fixes

* mark the table as const again

* and regen
2016-07-12 11:51:50 -04:00