utf8proc release history
Version 2.8.0-alpha
- Unicode 15 support (#247).
Version 2.7.0
2021-12-16
- Unicode 14 support (#233).
- Support `GNUInstallDirs` in CMake build (#159).
- `cmake` build now installs `pkg-config` file (#224).
- Various build and portability improvements.
Version 2.6.1
2020-12-15
- Bugfix in `utf8proc_grapheme_break_stateful` for `NULL` state argument, which also broke `utf8proc_grapheme_break`.
Version 2.6
2020-11-23
- New `utf8proc_islower` and `utf8proc_isupper` functions (#196).
- Bugfix for manual calls to `grapheme_break_extended` for initial characters (#205).
- Various build and portability improvements.
Version 2.5
2020-03-27
- Unicode 13 support (#179).
- No longer report zero width for category Sk (#167).
- `cmake` support improvements (#173).
Version 2.4
2019-05-10
- Unicode 12.1 support (#156).
- New `-DUTF8PROC_INSTALL=No` option for `cmake` builds to disable installation (#152).
- Better `make` support for HP-UX (#154).
- Fixed incorrect `UTF8PROC_VERSION_MINOR` version number in header and bumped shared-library version.
Version 2.3
2019-03-30
- Unicode 12 support (#148).
- New function `utf8proc_unicode_version` to return the supported Unicode version (#151).
- Simpler character-width computation that no longer uses GNU Unifont metrics: East-Asian wide characters have width 2, and all other printable characters have width 1 (#150).
- Fix `CHARBOUND` option for `utf8proc_map` to preserve U+FFFE and U+FFFF non-characters (#149).
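The width rule introduced in this release (wide characters count 2, other printable characters count 1) can be illustrated with a minimal sketch. The function `sketch_charwidth` and its small range table are hypothetical and deliberately incomplete; they are not the library's tables or its implementation, and the sketch assumes the input is a printable code point.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, deliberately incomplete list of East Asian wide and
   fullwidth ranges; the real library derives its data from Unicode files. */
typedef struct { int32_t first, last; } sketch_range;

static const sketch_range sketch_wide[] = {
    { 0x4E00, 0x9FFF },  /* CJK Unified Ideographs */
    { 0xAC00, 0xD7A3 },  /* Hangul Syllables */
    { 0xFF01, 0xFF60 }   /* Fullwidth ASCII variants */
};

/* Returns 2 for code points in the table above and 1 otherwise.
   Control characters and combining marks are out of scope here. */
int sketch_charwidth(int32_t c)
{
    size_t i;
    assert(c >= 0 && c <= 0x10FFFF);
    for (i = 0; i < sizeof sketch_wide / sizeof sketch_wide[0]; i++) {
        if (c >= sketch_wide[i].first && c <= sketch_wide[i].last)
            return 2;
    }
    return 1;
}
```

Under these assumptions, `sketch_charwidth(0x4E2D)` yields 2 and `sketch_charwidth('A')` yields 1. In real code, call `utf8proc_charwidth` instead.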
Version 2.2
2018-07-24
- `utf8proc_NFKC_Casefold` convenience function for `NFKC_Casefold` normalization (#133).
- `UTF8PROC_STRIPNA` option to strip unassigned codepoints (#133).
- Support building static libraries on Windows (callers need to `#define UTF8PROC_STATIC`) (#123).
- `cmake` fix to avoid defining `UTF8PROC_EXPORTS` globally (#121).
- `toupper` of ß (U+00DF) now yields ẞ (U+1E9E) (#134), similar to musl; case-folding still yields the standard "ss" mapping.
- `utf8proc_charwidth` now returns `1` for U+00AD (soft hyphen) and for unassigned/PUA codepoints (#135).
Version 2.1.1
2018-04-27
Version 2.1
2016-12-26:
- New functions `utf8proc_map_custom` and `utf8proc_decompose_custom` to allow user-supplied transformations of codepoints, in conjunction with other transformations (#89).
- New function `utf8proc_normalize_utf32` to apply normalizations directly to UTF-32 data (not just UTF-8) (#88).
- Fixed stack overflow that could occur due to incorrect definition of `UINT16_MAX` with some compilers (#84).
- Fixed conflict with `stdbool.h` in Visual Studio (#90).
- Updated font metrics to use Unifont 9.0.04.
Version 2.0.2
2016-07-27:
- Move `-Wmissing-prototypes` warning flag from `Makefile` to `.travis.yml` since MSVC does not understand this flag and it is occasionally useful to build using MSVC through the `Makefile` (#79).
- Use a different variable name for a nested loop in `bench/bench.c`, and declare it in a C89 way rather than inside the `for` to avoid "error: 'for' loop initial declarations are only allowed in C99 mode" (#80).
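The C89-style fix described in the last entry can be sketched as follows. This is not the code from `bench/bench.c`; `sketch_sum` is a hypothetical function that only shows the pattern: both counters are declared at the top of the block, and the inner counter has its own name.

```c
#include <assert.h>
#include <stddef.h>

/* Sums a rows x cols matrix stored in row-major order.
   The loop counters i and j are declared before any statement (C89 style),
   so no declaration appears inside a for(...) header. */
long sketch_sum(const int *data, size_t rows, size_t cols)
{
    size_t i, j;
    long total = 0;
    assert(data != NULL);
    for (i = 0; i < rows; i++) {
        for (j = 0; j < cols; j++) {
            total += data[i * cols + j];
        }
    }
    return total;
}
```

Writing `for (size_t j = 0; ...)` instead would trigger the quoted error when a compiler is run in C89 mode.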
Version 2.0.1
2016-07-13:
- Bug fix in `utf8proc_grapheme_break_stateful` (#77).
- Tests now use versioned Unicode files, so they will no longer break when a new version of Unicode is released (#78).
Version 2.0
2016-07-13:
- Updated for Unicode 9.0 (#70).
- New `utf8proc_grapheme_break_stateful` to handle the complicated grapheme-breaking rules in Unicode 9. The old `utf8proc_grapheme_break` is still provided, but may incorrectly identify grapheme breaks in some Unicode-9 sequences.
- Smaller Unicode tables (#62, #68). This required changes in the `utf8proc_property_t` structure, which breaks backward compatibility if you access this `struct` directly. The functions in the API remain backward-compatible, however.
- Buffer overrun fix (#66).
Version 1.3.1
2015-11-02:
- Do not export symbol for internal function `unsafe_encode_char()` (#55).
- Install relative symbolic links for shared libraries (#58).
- Add missing files to `make clean` (#58).
Version 1.3
2015-07-06:
- Updated for Unicode 8.0 (#45).
- New `utf8proc_tolower` and `utf8proc_toupper` functions, portable replacements for `towlower` and `towupper` in the C library (#40).
- Don't treat Unicode "non-characters" as invalid, and improved validity checking in general (#35).
- Prefix all typedefs with `utf8proc_`, e.g. `utf8proc_int32_t`, to avoid collisions with other libraries (#32).
- Rename `DLLEXPORT` to `UTF8PROC_DLLEXPORT` to prevent collisions.
- Fix build breakage in the benchmark routines.
- More fine-grained Makefile variables (`PICFLAG` etcetera), so that compilation flags can be selectively overridden, and in particular so that `CFLAGS` can be changed without accidentally eliminating necessary flags like `-fPIC` and `-std=c99` (#43).
- Updated character-width tables based on Unifont 8.0.01 (#51) and the Unicode 8 character categories (#47).
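The change to non-character handling in this release can be illustrated with a minimal sketch. It assumes that "valid" means a Unicode scalar value (in range and not a surrogate). `sketch_codepoint_valid` is a hypothetical name, not the library's function.

```c
#include <stdint.h>

/* Sketch: accept any code point in [0, 0x10FFFF] except the surrogate
   range [0xD800, 0xDFFF]. Non-characters such as U+FFFE and U+FFFF are
   deliberately accepted, matching the behavior described above. */
int sketch_codepoint_valid(int32_t c)
{
    if (c < 0 || c > 0x10FFFF)
        return 0;               /* outside the Unicode code space */
    if (c >= 0xD800 && c <= 0xDFFF)
        return 0;               /* UTF-16 surrogate, not a scalar value */
    return 1;
}
```

With this definition, U+FFFE and U+FFFF are reported as valid, while U+D800 and 0x110000 are rejected.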
Version 1.2
2015-03-28:
- Updated for Unicode 7.0 (#6).
- New function `utf8proc_grapheme_break(c1,c2)` that returns whether there is a grapheme break between `c1` and `c2` (#20).
- New function `utf8proc_charwidth(c)` that returns the number of column-positions that should be required for `c`; essentially a portable replacement for `wcwidth(c)` (#27).
- New function `utf8proc_category(c)` that returns the Unicode category of `c` (as one of the constants `UTF8PROC_CATEGORY_xx`). Also, a function `utf8proc_category_string(c)` that returns the Unicode category of `c` as a two-character string.
- `cmake` script `CMakeLists.txt`, in addition to `Makefile`, for easier compilation on Windows (#28).
- Various `Makefile` improvements: a `make check` target to perform tests (#13), `make install`, a rule to automate updating the Unicode tables, etcetera.
- The shared library is now versioned (e.g. has a soname on GNU/Linux) (#24).
- C++/MSVC compatibility (#17).
- Most `#defined` constants are now `enums` (#29).
- New preprocessor constants `UTF8PROC_VERSION_MAJOR`, `UTF8PROC_VERSION_MINOR`, and `UTF8PROC_VERSION_PATCH` for compile-time detection of the API version.
- Doxygen-formatted documentation (#29).
- The Ruby and PostgreSQL plugins have been removed due to lack of testing (#22).
Version 1.1.6
2013-11-27:
- PostgreSQL 9.2 and 9.3 compatibility (lowercase `c` language name)
Version 1.1.5
2009-08-20:
- Use `RSTRING_PTR()` and `RSTRING_LEN()` instead of `RSTRING()->ptr` and `RSTRING()->len` for Ruby 1.9 compatibility (and `#define` them if they do not exist)
2009-10-02:
- Patches for compatibility with Microsoft Visual Studio
2009-10-08:
- Fixes to make utf8proc usable in C++ programs
2009-10-16:
Version 1.1.4
2009-06-14:
- replaced C++ style comments for compatibility reasons
- added typecasts to suppress compiler warnings
- removed redundant source files for ruby-gemfile generation
2009-08-19:
- Changed copyright notice for Public Software Group e. V.
- Minor changes in the `README` file
Version 1.1.3
2008-10-04:
- Added a function `utf8proc_version` returning a string containing the version number of the library.
- Included a target `libutf8proc.dylib` for MacOSX.
2009-05-01:
- PostgreSQL 8.3 compatibility (use of `SET_VARSIZE` macro)
Version 1.1.2
2007-07-25:
- Fixed a serious bug in the data file generator, which caused characters to be treated incorrectly when stripping default ignorable characters or calculating grapheme cluster boundaries.
Version 1.1.1
2007-06-25:
- Added a new PostgreSQL function `unistrip`, which behaves like `unifold`, but also removes all character marks (e.g. accents).
2007-07-22:
- Changed license from BSD to MIT style.
- Added a new function `utf8proc_codepoint_valid` to the C library.
- Changed compiler flags in `Makefile` from `-g -O0` to `-O2`
- The ruby script, which was used to build the `utf8proc_data.c` file, is now included in the distribution.
Version 1.0.3
2007-03-16:
- Fixed a bug in the ruby library, which caused an error when splitting an empty string at grapheme cluster boundaries (method `String#utf8chars`).
Version 1.0.2
2006-09-21:
- included a check in `Integer#utf8`, which raises an exception if the given code-point is invalid because of being too high (this check was previously missing)
2006-12-26:
- added support for PostgreSQL version 8.2
Version 1.0.1
2006-09-20:
- included a gem file for the ruby version of the library
Release of version 1.0.1
Version 1.0
2006-09-17:
- added the `LUMP` option, which lumps certain characters together (see `lump.md`) (also used for the PostgreSQL `unifold` function)
- added the `STRIPMARK` option, which strips marking characters (or marks of composed characters)
- deprecated ruby method `String#char_ary` in favour of `String#utf8chars`
Version 0.3
2006-07-18:
- changed normalization from NFC to NFKC for postgresql unifold function
2006-08-04:
- added support to mark the beginning of a grapheme cluster with 0xFF (option: `CHARBOUND`)
- added the ruby method `String#chars`, which returns an array of UTF-8 encoded grapheme clusters
- added `NLF2LF` transformation in postgresql `unifold` function
- added the `DECOMPOSE` option; if you use neither `COMPOSE` nor `DECOMPOSE`, no normalization will be performed (different from previous versions)
- using integer constants rather than C-strings for character properties
- fixed (hopefully) a problem with the ruby library on Mac OS X, which occurred when compiler optimization was switched on
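The `CHARBOUND` marker scheme mentioned above can be illustrated with a minimal sketch. It assumes the marked output places one 0xFF byte before each grapheme cluster, as the entry describes. `sketch_count_clusters` is a hypothetical helper, not part of the library.

```c
#include <assert.h>
#include <stddef.h>

/* Counts grapheme clusters in a buffer in which every cluster is preceded
   by a single 0xFF marker byte. 0xFF never occurs in well-formed UTF-8,
   so a marker cannot be mistaken for part of the text. */
size_t sketch_count_clusters(const unsigned char *buf, size_t len)
{
    size_t i, count = 0;
    assert(buf != NULL || len == 0);
    for (i = 0; i < len; i++) {
        if (buf[i] == 0xFF)
            count++;
    }
    return count;
}
```

For a marked buffer holding "a", then "e" followed by a combining acute accent (bytes 0xCC 0x81), then "z", the sketch counts three clusters even though the text contains four code points.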
Version 0.2
2006-06-05:
- changed behaviour of PostgreSQL function to return NULL in case of invalid input, rather than raising an exceptional condition
- improved efficiency of PostgreSQL function (no transformation to C string is done)
2006-06-20:
- added -fpic compiler flag in Makefile
- fixed bug in the C code for the ruby library (usage of non-existent function)
Version 0.1
2006-06-02: initial release of version 0.1