markdown fixes, prettified NEWS
This commit is contained in:
parent
0d7224a6d8
commit
e41bd981cb
131
Changelog
131
Changelog
@ -1,131 +0,0 @@
|
|||||||
Changelog
|
|
||||||
|
|
||||||
2006-06-02:
|
|
||||||
- initial release of version 0.1
|
|
||||||
|
|
||||||
2006-06-05:
|
|
||||||
- changed behaviour of PostgreSQL function to return NULL in case of
|
|
||||||
invalid input, rather than raising an exceptional condition
|
|
||||||
- improved efficiency of PostgreSQL function (no transformation to C string
|
|
||||||
is done)
|
|
||||||
|
|
||||||
2006-06-20:
|
|
||||||
- added -fpic compiler flag in Makefile
|
|
||||||
- fixed bug in the C code for the ruby library (usage of non-existent
|
|
||||||
function)
|
|
||||||
|
|
||||||
Release of version 0.2
|
|
||||||
|
|
||||||
|
|
||||||
2006-07-18:
|
|
||||||
- changed normalization from NFC to NFKC for postgresql unifold function
|
|
||||||
|
|
||||||
2006-08-04:
|
|
||||||
- added support to mark the beginning of a grapheme cluster with 0xFF
|
|
||||||
(option: CHARBOUND)
|
|
||||||
- added the ruby method String#chars, which is returning an array of UTF-8
|
|
||||||
encoded grapheme clusters
|
|
||||||
- added NLF2LF transformation in postgresql unifold function
|
|
||||||
- added the DECOMPOSE option, if you neither use COMPOSE or DECOMPOSE, no
|
|
||||||
normalization will be performed (different from previous versions)
|
|
||||||
- using integer constants rather than C-strings for character properties
|
|
||||||
- fixed (hopefully) a problem with the ruby library on Mac OS X, which
|
|
||||||
occured when compiler optimization was switched on
|
|
||||||
|
|
||||||
Release of version 0.3
|
|
||||||
|
|
||||||
|
|
||||||
2006-09-17:
|
|
||||||
- added the LUMP option, which lumps certain characters together
|
|
||||||
(see lump.txt) (also used for the PostgreSQL "unifold" function)
|
|
||||||
- added the STRIPMARK option, which strips marking characters
|
|
||||||
(or marks of composed characters)
|
|
||||||
- deprecated ruby method String#char_ary in favour of String#utf8chars
|
|
||||||
|
|
||||||
Release of version 1.0
|
|
||||||
|
|
||||||
|
|
||||||
2006-09-20:
|
|
||||||
- included a gem file for the ruby version of the library
|
|
||||||
|
|
||||||
Release of version 1.0.1
|
|
||||||
|
|
||||||
|
|
||||||
2006-09-21:
|
|
||||||
- included a check in Integer#utf8, which raises an exception, if the given
|
|
||||||
code-point is invalid because of being too high (this was missing yet)
|
|
||||||
|
|
||||||
2006-12-26:
|
|
||||||
- added support for PostgreSQL version 8.2
|
|
||||||
|
|
||||||
Release of version 1.0.2
|
|
||||||
|
|
||||||
|
|
||||||
2007-03-16:
|
|
||||||
- Fixed a bug in the ruby library, which caused an error, when splitting an
|
|
||||||
empty string at grapheme cluster boundaries (method String#utf8chars).
|
|
||||||
|
|
||||||
Release of version 1.0.3
|
|
||||||
|
|
||||||
|
|
||||||
2007-06-25:
|
|
||||||
- Added a new PostgreSQL function 'unistrip', which behaves like 'unifold',
|
|
||||||
but also removes all character marks (e.g. accents).
|
|
||||||
|
|
||||||
2007-07-22:
|
|
||||||
- Changed license from BSD to MIT style.
|
|
||||||
- Added a new function 'utf8proc_codepoint_valid' to the C library.
|
|
||||||
- Changed compiler flags in Makefile from -g -O0 to -O2
|
|
||||||
- The ruby script, which was used to build the utf8proc_data.c file, is now
|
|
||||||
included in the distribution.
|
|
||||||
|
|
||||||
Release of version 1.1.1
|
|
||||||
|
|
||||||
|
|
||||||
2007-07-25:
|
|
||||||
- Fixed a serious bug in the data file generator, which caused characters
|
|
||||||
being treated incorrectly, when stripping default ignorable characters or
|
|
||||||
calculating grapheme cluster boundaries.
|
|
||||||
|
|
||||||
Release of version 1.1.2
|
|
||||||
|
|
||||||
|
|
||||||
2008-10-04:
|
|
||||||
- Added a function utf8proc_version returning a string containing the version
|
|
||||||
number of the library.
|
|
||||||
- Included a target libutf8proc.dylib for MacOSX.
|
|
||||||
|
|
||||||
2009-05-01:
|
|
||||||
- PostgreSQL 8.3 compatibility (use of SET_VARSIZE macro)
|
|
||||||
|
|
||||||
Release of version 1.1.3
|
|
||||||
|
|
||||||
|
|
||||||
2009-06-14:
|
|
||||||
- replaced C++ style comments for compatibility reasons
|
|
||||||
- added typecasts to suppress compiler warnings
|
|
||||||
- removed redundant source files for ruby-gemfile generation
|
|
||||||
|
|
||||||
2009-08-19:
|
|
||||||
- Changed copyright notice for Public Software Group e. V.
|
|
||||||
- Minor changes in the README file
|
|
||||||
- Release of version 1.1.4
|
|
||||||
|
|
||||||
2009-08-20:
|
|
||||||
- Use RSTRING_PTR() and RSTRING_LEN() instead of RSTRING()->ptr and
|
|
||||||
RSTRING()->len for ruby1.9 compatibility (and #define them, if not
|
|
||||||
existent)
|
|
||||||
|
|
||||||
2009-10-02:
|
|
||||||
- Patches for compatibility with Microsoft Visual Studio
|
|
||||||
|
|
||||||
2009-10-08:
|
|
||||||
- Fixes to make utf8proc usable in C++ programs
|
|
||||||
|
|
||||||
2009-10-16:
|
|
||||||
- Release of version 1.1.5
|
|
||||||
|
|
||||||
2013-11-27:
|
|
||||||
- PostgreSQL 9.2 and 9.3 compatibility (lowercase 'c' language name)
|
|
||||||
- Release of version 1.1.6
|
|
||||||
|
|
||||||
@ -1,4 +1,4 @@
|
|||||||
== libutf8proc license ==
|
## libutf8proc license ##
|
||||||
|
|
||||||
**libutf8proc** is a lightly updated version of the **utf8proc**
|
**libutf8proc** is a lightly updated version of the **utf8proc**
|
||||||
library by Jan Behrens and the rest of the Public Software Group, who
|
library by Jan Behrens and the rest of the Public Software Group, who
|
||||||
@ -27,7 +27,7 @@ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
|
|||||||
FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
|
FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
|
||||||
DEALINGS IN THE SOFTWARE.
|
DEALINGS IN THE SOFTWARE.
|
||||||
|
|
||||||
== Original utf8proc license ==
|
## Original utf8proc license ##
|
||||||
|
|
||||||
*Copyright (c) 2009, 2013 Public Software Group e. V., Berlin, Germany*
|
*Copyright (c) 2009, 2013 Public Software Group e. V., Berlin, Germany*
|
||||||
|
|
||||||
@ -49,7 +49,7 @@ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
|
|||||||
FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
|
FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
|
||||||
DEALINGS IN THE SOFTWARE.
|
DEALINGS IN THE SOFTWARE.
|
||||||
|
|
||||||
== Unicode data license ==
|
## Unicode data license ##
|
||||||
|
|
||||||
This software distribution contains derived data from a modified version of
|
This software distribution contains derived data from a modified version of
|
||||||
the Unicode data files. The following license applies to that data:
|
the Unicode data files. The following license applies to that data:
|
||||||
|
|||||||
142
NEWS.md
Normal file
142
NEWS.md
Normal file
@ -0,0 +1,142 @@
|
|||||||
|
# libutf8proc release history #
|
||||||
|
|
||||||
|
No releases so far.
|
||||||
|
|
||||||
|
# utf8proc release history #
|
||||||
|
|
||||||
|
## Version 1.1.6 ##
|
||||||
|
|
||||||
|
2013-11-27:
|
||||||
|
|
||||||
|
- PostgreSQL 9.2 and 9.3 compatibility (lowercase `c` language name)
|
||||||
|
|
||||||
|
## Version 1.1.5 ##
|
||||||
|
|
||||||
|
2009-08-20:
|
||||||
|
|
||||||
|
- Use `RSTRING_PTR()` and `RSTRING_LEN()` instead of `RSTRING()->ptr` and
|
||||||
|
`RSTRING()->len` for ruby1.9 compatibility (and `#define` them, if not
|
||||||
|
existent)
|
||||||
|
|
||||||
|
2009-10-02:
|
||||||
|
|
||||||
|
- Patches for compatibility with Microsoft Visual Studio
|
||||||
|
|
||||||
|
2009-10-08:
|
||||||
|
|
||||||
|
- Fixes to make utf8proc usable in C++ programs
|
||||||
|
|
||||||
|
2009-10-16:
|
||||||
|
|
||||||
|
## Version 1.1.4 ##
|
||||||
|
|
||||||
|
2009-06-14:
|
||||||
|
|
||||||
|
- replaced C++ style comments for compatibility reasons
|
||||||
|
- added typecasts to suppress compiler warnings
|
||||||
|
- removed redundant source files for ruby-gemfile generation
|
||||||
|
|
||||||
|
2009-08-19:
|
||||||
|
|
||||||
|
- Changed copyright notice for Public Software Group e. V.
|
||||||
|
- Minor changes in the `README` file
|
||||||
|
|
||||||
|
## Version 1.1.3 ##
|
||||||
|
|
||||||
|
2008-10-04:
|
||||||
|
|
||||||
|
- Added a function `utf8proc_version` returning a string containing the version
|
||||||
|
number of the library.
|
||||||
|
- Included a target `libutf8proc.dylib` for MacOSX.
|
||||||
|
|
||||||
|
2009-05-01:
|
||||||
|
- PostgreSQL 8.3 compatibility (use of `SET_VARSIZE` macro)
|
||||||
|
|
||||||
|
## Version 1.1.2 ##
|
||||||
|
|
||||||
|
2007-07-25:
|
||||||
|
|
||||||
|
- Fixed a serious bug in the data file generator, which caused characters
|
||||||
|
being treated incorrectly, when stripping default ignorable characters or
|
||||||
|
calculating grapheme cluster boundaries.
|
||||||
|
|
||||||
|
## Version 1.1.1 ##
|
||||||
|
|
||||||
|
2007-06-25:
|
||||||
|
|
||||||
|
- Added a new PostgreSQL function `unistrip`, which behaves like `unifold`,
|
||||||
|
but also removes all character marks (e.g. accents).
|
||||||
|
|
||||||
|
2007-07-22:
|
||||||
|
|
||||||
|
- Changed license from BSD to MIT style.
|
||||||
|
- Added a new function `utf8proc_codepoint_valid` to the C library.
|
||||||
|
- Changed compiler flags in `Makefile` from `-g -O0` to `-O2`
|
||||||
|
- The ruby script, which was used to build the `utf8proc_data.c` file, is now
|
||||||
|
included in the distribution.
|
||||||
|
|
||||||
|
## Version 1.0.3 ##
|
||||||
|
|
||||||
|
2007-03-16:
|
||||||
|
|
||||||
|
- Fixed a bug in the ruby library, which caused an error, when splitting an
|
||||||
|
empty string at grapheme cluster boundaries (method `String#utf8chars`).
|
||||||
|
|
||||||
|
## Version 1.0.2 ##
|
||||||
|
|
||||||
|
2006-09-21:
|
||||||
|
|
||||||
|
- included a check in `Integer#utf8`, which raises an exception, if the given
|
||||||
|
code-point is invalid because of being too high (this was missing yet)
|
||||||
|
|
||||||
|
2006-12-26:
|
||||||
|
|
||||||
|
- added support for PostgreSQL version 8.2
|
||||||
|
|
||||||
|
## Version 1.0.1 ##
|
||||||
|
|
||||||
|
2006-09-20:
|
||||||
|
|
||||||
|
- included a gem file for the ruby version of the library
|
||||||
|
|
||||||
|
Release of version 1.0.1
|
||||||
|
|
||||||
|
## Version 1.0 ##
|
||||||
|
|
||||||
|
2006-09-17:
|
||||||
|
|
||||||
|
- added the `LUMP` option, which lumps certain characters together (see `lump.txt`) (also used for the PostgreSQL `unifold` function)
|
||||||
|
- added the `STRIPMARK` option, which strips marking characters (or marks of composed characters)
|
||||||
|
- deprecated ruby method `String#char_ary` in favour of `String#utf8chars`
|
||||||
|
|
||||||
|
## Version 0.3 ##
|
||||||
|
|
||||||
|
2006-07-18:
|
||||||
|
|
||||||
|
- changed normalization from NFC to NFKC for postgresql unifold function
|
||||||
|
|
||||||
|
2006-08-04:
|
||||||
|
|
||||||
|
- added support to mark the beginning of a grapheme cluster with 0xFF (option: `CHARBOUND`)
|
||||||
|
- added the ruby method `String#chars`, which is returning an array of UTF-8 encoded grapheme clusters
|
||||||
|
- added `NLF2LF` transformation in postgresql `unifold` function
|
||||||
|
- added the `DECOMPOSE` option, if you neither use `COMPOSE` or `DECOMPOSE`, no normalization will be performed (different from previous versions)
|
||||||
|
- using integer constants rather than C-strings for character properties
|
||||||
|
- fixed (hopefully) a problem with the ruby library on Mac OS X, which occured when compiler optimization was switched on
|
||||||
|
|
||||||
|
## Version 0.2 ##
|
||||||
|
|
||||||
|
2006-06-05:
|
||||||
|
|
||||||
|
- changed behaviour of PostgreSQL function to return NULL in case of invalid input, rather than raising an exceptional condition
|
||||||
|
- improved efficiency of PostgreSQL function (no transformation to C string is done)
|
||||||
|
|
||||||
|
2006-06-20:
|
||||||
|
|
||||||
|
- added -fpic compiler flag in Makefile
|
||||||
|
- fixed bug in the C code for the ruby library (usage of non-existent function)
|
||||||
|
|
||||||
|
## Version 0.1 ##
|
||||||
|
|
||||||
|
2006-06-02: initial release of version 0.1
|
||||||
|
|
||||||
12
README.md
12
README.md
@ -1,4 +1,4 @@
|
|||||||
== libutf8proc ==
|
# libutf8proc #
|
||||||
|
|
||||||
The [libutf8proc package](https://github.com/JuliaLang/libutf8proc) is
|
The [libutf8proc package](https://github.com/JuliaLang/libutf8proc) is
|
||||||
a lightly updated fork of the [utf8proc
|
a lightly updated fork of the [utf8proc
|
||||||
@ -28,11 +28,11 @@ data governed by the similarly permissive [Unicode data
|
|||||||
license](http://www.unicode.org/copyright.html#Exhibit1)); please see
|
license](http://www.unicode.org/copyright.html#Exhibit1)); please see
|
||||||
the included `LICENSE.md` file for more detailed information.
|
the included `LICENSE.md` file for more detailed information.
|
||||||
|
|
||||||
=== Quick Start ===
|
## Quick Start ##
|
||||||
|
|
||||||
For compilation of the C library run `make`.
|
For compilation of the C library run `make`.
|
||||||
|
|
||||||
=== General Information ===
|
## General Information ##
|
||||||
|
|
||||||
The C library is found in this directory after successful compilation
|
The C library is found in this directory after successful compilation
|
||||||
and is named `libutf8proc.a` (for the static library) and
|
and is named `libutf8proc.a` (for the static library) and
|
||||||
@ -49,19 +49,19 @@ For Unicode normalizations, the following options are used:
|
|||||||
* Normalization Form KC: `STABLE`, `COMPOSE`, `COMPAT`
|
* Normalization Form KC: `STABLE`, `COMPOSE`, `COMPAT`
|
||||||
* Normalization Form KD: `STABLE`, `DECOMPOSE`, `COMPAT`
|
* Normalization Form KD: `STABLE`, `DECOMPOSE`, `COMPAT`
|
||||||
|
|
||||||
=== C Library ===
|
## C Library ##
|
||||||
|
|
||||||
The documentation for the C library is found in the `utf8proc.h` header file.
|
The documentation for the C library is found in the `utf8proc.h` header file.
|
||||||
`utf8proc_map` is function you will most likely be using for mapping UTF-8
|
`utf8proc_map` is function you will most likely be using for mapping UTF-8
|
||||||
strings, unless you want to allocate memory yourself.
|
strings, unless you want to allocate memory yourself.
|
||||||
|
|
||||||
=== To Do ===
|
## To Do ##
|
||||||
|
|
||||||
* detect stable code points and process segments independently in order to save memory
|
* detect stable code points and process segments independently in order to save memory
|
||||||
* do a quick check before normalizing strings to optimize speed
|
* do a quick check before normalizing strings to optimize speed
|
||||||
* support stream processing
|
* support stream processing
|
||||||
|
|
||||||
=== Contact ===
|
## Contact ##
|
||||||
|
|
||||||
Bug reports, feature requests, and other queries can be filed at
|
Bug reports, feature requests, and other queries can be filed at
|
||||||
the [libutf8proc page on Github](https://github.com/JuliaLang/libutf8proc).
|
the [libutf8proc page on Github](https://github.com/JuliaLang/libutf8proc).
|
||||||
|
|||||||
Loading…
Reference in New Issue
Block a user