README updates
parent df71da45df
commit 1c84d08b01
@@ -7,7 +7,7 @@ whose copyright and license statements are reproduced below, all new
 work on the libmojibake library is licensed under the [MIT "expat"
 license](http://opensource.org/licenses/MIT):
 
-*Copyright © 2014 by Steven G. Johnson.*
+*Copyright © 2014 by Steven G. Johnson, Jiahao Chen, Tony Kelman, and other contributors listed in the git history.*
 
 Permission is hereby granted, free of charge, to any person obtaining a
 copy of this software and associated documentation files (the "Software"),
36 README.md
@@ -1,28 +1,30 @@
 # libmojibake
 [](https://travis-ci.org/JuliaLang/libmojibake)
 
-[libmojibake](https://github.com/JuliaLang/libmojibake) is
-a lightly updated fork of the [utf8proc
+[libmojibake](https://github.com/JuliaLang/libmojibake) is a
+development fork of the [utf8proc
 library](http://www.public-software-group.org/utf8proc) from Jan
 Behrens and the rest of the [Public Software
 Group](http://www.public-software-group.org/), who deserve *nearly all
 of the credit* for this package: a small, clean C library that
 provides Unicode normalization, case-folding, and other operations for
-data in the [UTF-8 encoding](http://en.wikipedia.org/wiki/UTF-8).
+data in the [UTF-8 encoding](http://en.wikipedia.org/wiki/UTF-8). The
+main difference from utf8proc is that the Unicode support in
+libmojibake is more up-to-date (Unicode 7 vs. Unicode 5).
 
-The reason for this fork is that `utf8proc` is used for basic Unicode
+The reason for this fork is that utf8proc is used for basic Unicode
 support in the [Julia language](http://julialang.org/) and the Julia
 developers wanted Unicode 7 support and other features, but the Public
-Software Group is currently occupied with other projects. We hope
-that our fork can be merged back into the mainline `utf8proc` package
-before too long.
+Software Group is currently occupied with other projects. As we implement
+and test new features in libmojibake, we are contributing patches back
+to utf8proc with the hope that they can be merged upstream.
 
-(The original `utf8proc` package also includes Ruby and PostgreSQL plug-ins.
-We removed those from `libmojibake` in order to focus exclusively on the C
+(The original utf8proc package also includes Ruby and PostgreSQL plug-ins.
+We removed those from libmojibake in order to focus exclusively on the C
 library for the time being. We will strive to keep API changes to a minimum,
-so `libmojibake` should still be usable with the old plug-in code.)
+so libmojibake should still be usable with the old plug-in code.)
 
-Like `utf8proc`, the `libmojibake` package is licensed under the
+Like utf8proc, the libmojibake package is licensed under the
 free/open-source [MIT "expat"
 license](http://opensource.org/licenses/MIT) (plus certain Unicode
 data governed by the similarly permissive [Unicode data
@@ -39,9 +41,9 @@ The C library is found in this directory after successful compilation
 and is named `libmojibake.a` (for the static library) and
 `libmojibake.so` (for the dynamic library).
 
-The Unicode version being supported is 5.0.0.
-*Note:* Version 4.1.0 of Unicode Standard Annex #29 was used, as
-version 5.0.0 had not been available at the time of implementation.
+The Unicode version being supported is 7.0.0. (Grapheme segmentation
+is currently based on version 4.1.0 of Unicode Standard Annex #29, but
+we hope to update this soon.)
 
 For Unicode normalizations, the following options are used:
 
@@ -58,12 +60,10 @@ strings, unless you want to allocate memory yourself.
 
 ## To Do ##
 
-* detect stable code points and process segments independently in order to save memory
-* do a quick check before normalizing strings to optimize speed
-* support stream processing
+See the Github [issues list](https://github.com/JuliaLang/libmojibake/issues).
 
 ## Contact ##
 
 Bug reports, feature requests, and other queries can be filed at
-the [libmojibake page on Github](https://github.com/JuliaLang/libmojibake/issues).
+the [libmojibake issues page on Github](https://github.com/JuliaLang/libmojibake/issues).
 