markdown and other cosmetic updates
parent c0f2b512a0
commit 0d7224a6d8
10
.gitignore
vendored
Normal file
@@ -0,0 +1,10 @@
*.tar.gz
*.exe
*.dll
*.do
*.o
*.so
*.a
*.dll
*.dylib
*.dSYM
@@ -1,5 +1,13 @@
== libutf8proc license ==

Copyright (c) 2009, 2013 Public Software Group e. V., Berlin, Germany
**libutf8proc** is a lightly updated version of the **utf8proc**
library by Jan Behrens and the rest of the Public Software Group, who
deserve nearly all of the credit for this library. Like utf8proc,
whose copyright and license statements are reproduced below, all new
work on the libutf8proc library is licensed under the [MIT "expat"
license](http://opensource.org/licenses/MIT):

*Copyright © 2014 by Steven G. Johnson.*

Permission is hereby granted, free of charge, to any person obtaining a
copy of this software and associated documentation files (the "Software"),
@@ -19,14 +27,37 @@ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
DEALINGS IN THE SOFTWARE.

== Original utf8proc license ==

*Copyright (c) 2009, 2013 Public Software Group e. V., Berlin, Germany*

Permission is hereby granted, free of charge, to any person obtaining a
copy of this software and associated documentation files (the "Software"),
to deal in the Software without restriction, including without limitation
the rights to use, copy, modify, merge, publish, distribute, sublicense,
and/or sell copies of the Software, and to permit persons to whom the
Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
DEALINGS IN THE SOFTWARE.

== Unicode data license ==

This software distribution contains derived data from a modified version of
the Unicode data files. The following license applies to that data:

COPYRIGHT AND PERMISSION NOTICE
**COPYRIGHT AND PERMISSION NOTICE**

Copyright (c) 1991-2007 Unicode, Inc. All rights reserved. Distributed
under the Terms of Use in http://www.unicode.org/copyright.html.
*Copyright (c) 1991-2007 Unicode, Inc. All rights reserved. Distributed
under the Terms of Use in http://www.unicode.org/copyright.html.*

Permission is hereby granted, free of charge, to any person obtaining a
copy of the Unicode data files and any associated documentation (the "Data
@@ -57,8 +88,6 @@ not be used in advertising or otherwise to promote the sale, use or other
dealings in these Data Files or Software without prior written
authorization of the copyright holder.


Unicode and the Unicode logo are trademarks of Unicode, Inc., and may be
registered in some jurisdictions. All other trademarks and registered
trademarks mentioned herein are the property of their respective owners.

41
Makefile
@@ -9,20 +9,12 @@ cc = $(CC) $(cflags)

# meta targets

all: c-library

c-library: libutf8proc.a libutf8proc.so

ruby-library: ruby/utf8proc_native.so

pgsql-library: pgsql/utf8proc_pgsql.so

all: c-library ruby-library ruby-gem pgsql-library

clean::
clean:
rm -f utf8proc.o libutf8proc.a libutf8proc.so
cd ruby/ && test -e Makefile && (make clean && rm -f Makefile) || true
rm -Rf ruby/gem/lib ruby/gem/ext
rm -f ruby/gem/utf8proc-*.gem
cd pgsql/ && make clean

# real targets

@@ -39,30 +31,3 @@ libutf8proc.so: utf8proc.o

libutf8proc.dylib: utf8proc.o
$(cc) -dynamiclib -o $@ $^ -install_name $(libdir)/$@

ruby/Makefile: ruby/extconf.rb
cd ruby && ruby extconf.rb

ruby/utf8proc_native.so: utf8proc.h utf8proc.c utf8proc_data.c \
ruby/utf8proc_native.c ruby/Makefile
cd ruby && make

ruby/gem/lib/utf8proc.rb: ruby/utf8proc.rb
test -e ruby/gem/lib || mkdir ruby/gem/lib
cp ruby/utf8proc.rb ruby/gem/lib/

ruby/gem/ext/extconf.rb: ruby/extconf.rb
test -e ruby/gem/ext || mkdir ruby/gem/ext
cp ruby/extconf.rb ruby/gem/ext/

ruby/gem/ext/utf8proc_native.c: utf8proc.h utf8proc_data.c utf8proc.c ruby/utf8proc_native.c
test -e ruby/gem/ext || mkdir ruby/gem/ext
cat utf8proc.h utf8proc_data.c utf8proc.c ruby/utf8proc_native.c | grep -v '#include "utf8proc.h"' | grep -v '#include "utf8proc_data.c"' | grep -v '#include "../utf8proc.c"' > ruby/gem/ext/utf8proc_native.c

ruby-gem:: ruby/gem/lib/utf8proc.rb ruby/gem/ext/extconf.rb ruby/gem/ext/utf8proc_native.c
cd ruby/gem && gem build utf8proc.gemspec

pgsql/utf8proc_pgsql.so: utf8proc.h utf8proc.c utf8proc_data.c \
pgsql/utf8proc_pgsql.c
cd pgsql && make

63
README
@@ -1,63 +0,0 @@

Please read the LICENSE file, which is shipping with this software.


*** QUICK START ***

For compilation of the C library call "make c-library", for compilation of
the ruby library call "make ruby-library" and for compilation of the
PostgreSQL extension call "make pgsql-library".

For ruby you can also create a gem-file by calling "make ruby-gem".

"make all" can be used to build everything, but both ruby and PostgreSQL
installations are required in this case.


*** GENERAL INFORMATION ***

The C library is found in this directory after successful compilation and
is named "libutf8proc.a" and "libutf8proc.so". The ruby library consists of
the files "utf8proc.rb" and "utf8proc_native.so", which are found in the
subdirectory "ruby/". If you chose to create a gem-file it is placed in the
"ruby/gem" directory. The PostgreSQL extension is named "utf8proc_pgsql.so"
and resides in the "pgsql/" directory.

Both the ruby library and the PostgreSQL extension are built as stand-alone
libraries and are therefore not dependent on the dynamic version of the
C library files, but this behaviour might change in future releases.

The Unicode version being supported is 5.0.0.
Note: Version 4.1.0 of Unicode Standard Annex #29 was used, as
version 5.0.0 had not been available at the time of implementation.

For Unicode normalizations, the following options have to be used:
Normalization Form C: STABLE, COMPOSE
Normalization Form D: STABLE, DECOMPOSE
Normalization Form KC: STABLE, COMPOSE, COMPAT
Normalization Form KD: STABLE, DECOMPOSE, COMPAT


*** C LIBRARY ***

The documentation for the C library is found in the utf8proc.h header file.
"utf8proc_map" is most likely the function you will be using for mapping UTF-8
strings, unless you want to allocate memory yourself.


*** TODO ***

- detect stable code points and process segments independently in order to
save memory
- do a quick check before normalizing strings to optimize speed
- support stream processing


*** CONTACT ***

If you find any bugs or experience difficulties in compiling this software,
please contact us:

Project page: http://www.public-software-group.org/utf8proc

68
README.md
Normal file
@@ -0,0 +1,68 @@
== libutf8proc ==

The [libutf8proc package](https://github.com/JuliaLang/libutf8proc) is
a lightly updated fork of the [utf8proc
library](http://www.public-software-group.org/utf8proc) from Jan
Behrens and the rest of the [Public Software
Group](http://www.public-software-group.org/), who deserve *nearly all
of the credit* for this package: a small, clean C library that
provides Unicode normalization, case-folding, and other operations for
data in the [UTF-8 encoding](http://en.wikipedia.org/wiki/UTF-8).

The reason for this fork is that utf8proc is used for basic Unicode
support in the [Julia language](http://julialang.org/) and the Julia
developers wanted Unicode 7 support and other features, but the
Public Software Group currently does not seem to have the resources
necessary to update utf8proc. We hope that the fork can be merged
back into the mainline utf8proc package before too long.

(The original utf8proc package also includes Ruby and PostgreSQL plug-ins.
We removed those from libutf8proc in order to focus exclusively on the C
library for the time being. We will strive to keep API changes to a minimum,
so libutf8proc should still be usable with the old plug-in code.)

Like utf8proc, the libutf8proc package is licensed under the
free/open-source [MIT "expat"
license](http://opensource.org/licenses/MIT) (plus certain Unicode
data governed by the similarly permissive [Unicode data
license](http://www.unicode.org/copyright.html#Exhibit1)); please see
the included `LICENSE.md` file for more detailed information.

=== Quick Start ===

For compilation of the C library run `make`.

=== General Information ===

The C library is found in this directory after successful compilation
and is named `libutf8proc.a` (for the static library) and
`libutf8proc.so` (for the dynamic library).

The Unicode version being supported is 5.0.0.
*Note:* Version 4.1.0 of Unicode Standard Annex #29 was used, as
version 5.0.0 had not been available at the time of implementation.

For Unicode normalizations, the following options are used:

* Normalization Form C: `STABLE`, `COMPOSE`
* Normalization Form D: `STABLE`, `DECOMPOSE`
* Normalization Form KC: `STABLE`, `COMPOSE`, `COMPAT`
* Normalization Form KD: `STABLE`, `DECOMPOSE`, `COMPAT`
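
The option list above can be read as a small lookup from normalization form to a bitmask of options. The following is only a sketch of that mapping: the `OPT_*` constants, their numeric values, the `norm_form` enum, and `options_for_form` are hypothetical stand-ins introduced here for illustration, not names taken from `utf8proc.h`. Real code should use the constants that the header actually defines.

```c
#include <assert.h>

/* Stand-in option bits named after the README's STABLE, COMPAT, COMPOSE and
 * DECOMPOSE options. The numeric values are assumptions for this sketch. */
enum {
    OPT_STABLE    = 1 << 1,
    OPT_COMPAT    = 1 << 2,
    OPT_COMPOSE   = 1 << 3,
    OPT_DECOMPOSE = 1 << 4
};

/* Hypothetical enum naming the four normalization forms from the list. */
typedef enum { FORM_C, FORM_D, FORM_KC, FORM_KD } norm_form;

/* Return the option bitmask for a normalization form, one case per
 * bullet in the list above. */
int options_for_form(norm_form form)
{
    switch (form) {
    case FORM_C:  return OPT_STABLE | OPT_COMPOSE;
    case FORM_D:  return OPT_STABLE | OPT_DECOMPOSE;
    case FORM_KC: return OPT_STABLE | OPT_COMPOSE | OPT_COMPAT;
    case FORM_KD: return OPT_STABLE | OPT_DECOMPOSE | OPT_COMPAT;
    }
    /* Reaching this point means the caller passed a value outside the enum. */
    assert(!"unknown normalization form");
    return 0;
}
```

Every form includes the stable bit, the K forms add the compatibility bit, and compose and decompose are never combined. With the real library, the equivalent bitmask would presumably be passed as the options argument of a mapping call such as `utf8proc_map`.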

=== C Library ===

The documentation for the C library is found in the `utf8proc.h` header file.
`utf8proc_map` is the function you will most likely be using for mapping UTF-8
strings, unless you want to allocate memory yourself.

=== To Do ===

* detect stable code points and process segments independently in order to save memory
* do a quick check before normalizing strings to optimize speed
* support stream processing

=== Contact ===

Bug reports, feature requests, and other queries can be filed at
the [libutf8proc page on GitHub](https://github.com/JuliaLang/libutf8proc).