README updates

This commit is contained in:
Steven G. Johnson 2014-12-07 21:29:34 -05:00
parent df71da45df
commit 1c84d08b01
2 changed files with 19 additions and 19 deletions

@@ -7,7 +7,7 @@ whose copyright and license statements are reproduced below, all new
 work on the libmojibake library is licensed under the [MIT "expat"
 license](http://opensource.org/licenses/MIT):
-*Copyright © 2014 by Steven G. Johnson.*
+*Copyright © 2014 by Steven G. Johnson, Jiahao Chen, Tony Kelman, and other contributors listed in the git history.*
 Permission is hereby granted, free of charge, to any person obtaining a
 copy of this software and associated documentation files (the "Software"),

@@ -1,28 +1,30 @@
 # libmojibake
 [![Build Status](https://travis-ci.org/JuliaLang/libmojibake.png)](https://travis-ci.org/JuliaLang/libmojibake)
-[libmojibake](https://github.com/JuliaLang/libmojibake) is
-a lightly updated fork of the [utf8proc
+[libmojibake](https://github.com/JuliaLang/libmojibake) is a
+development fork of the [utf8proc
 library](http://www.public-software-group.org/utf8proc) from Jan
 Behrens and the rest of the [Public Software
 Group](http://www.public-software-group.org/), who deserve *nearly all
 of the credit* for this package: a small, clean C library that
 provides Unicode normalization, case-folding, and other operations for
-data in the [UTF-8 encoding](http://en.wikipedia.org/wiki/UTF-8).
+data in the [UTF-8 encoding](http://en.wikipedia.org/wiki/UTF-8). The
+main difference from utf8proc is that the Unicode support in
+libmojibake is more up-to-date (Unicode 7 vs. Unicode 5).
-The reason for this fork is that `utf8proc` is used for basic Unicode
+The reason for this fork is that utf8proc is used for basic Unicode
 support in the [Julia language](http://julialang.org/) and the Julia
 developers wanted Unicode 7 support and other features, but the Public
-Software Group is currently occupied with other projects. We hope
-that our fork can be merged back into the mainline `utf8proc` package
-before too long.
+Software Group is currently occupied with other projects. As we implement
+and test new features in libmojibake, we are contributing patches back
+to utf8proc with the hope that they can be merged upstream.
-(The original `utf8proc` package also includes Ruby and PostgreSQL plug-ins.
-We removed those from `libmojibake` in order to focus exclusively on the C
+(The original utf8proc package also includes Ruby and PostgreSQL plug-ins.
+We removed those from libmojibake in order to focus exclusively on the C
 library for the time being. We will strive to keep API changes to a minimum,
-so `libmojibake` should still be usable with the old plug-in code.)
+so libmojibake should still be usable with the old plug-in code.)
-Like `utf8proc`, the `libmojibake` package is licensed under the
+Like utf8proc, the libmojibake package is licensed under the
 free/open-source [MIT "expat"
 license](http://opensource.org/licenses/MIT) (plus certain Unicode
 data governed by the similarly permissive [Unicode data
@@ -39,9 +41,9 @@ The C library is found in this directory after successful compilation
 and is named `libmojibake.a` (for the static library) and
 `libmojibake.so` (for the dynamic library).
-The Unicode version being supported is 5.0.0.
-*Note:* Version 4.1.0 of Unicode Standard Annex #29 was used, as
-version 5.0.0 had not been available at the time of implementation.
+The Unicode version being supported is 7.0.0. (Grapheme segmentation
+is currently based on version 4.1.0 of Unicode Standard Annex #29, but
+we hope to update this soon.)
 For Unicode normalizations, the following options are used:
@@ -58,12 +60,10 @@ strings, unless you want to allocate memory yourself.
 ## To Do ##
-* detect stable code points and process segments independently in order to save memory
-* do a quick check before normalizing strings to optimize speed
-* support stream processing
+See the Github [issues list](https://github.com/JuliaLang/libmojibake/issues).
 ## Contact ##
 Bug reports, feature requests, and other queries can be filed at
-the [libmojibake page on Github](https://github.com/JuliaLang/libmojibake/issues).
+the [libmojibake issues page on Github](https://github.com/JuliaLang/libmojibake/issues).