lino.utils.pyuca¶

Preliminary implementation of the Unicode Collation Algorithm.

This only implements the simple parts of the algorithm but I have successfully tested it using the Default Unicode Collation Element Table (DUCET) to collate Ancient Greek correctly.

Usage example:

from pyuca import Collator c = Collator(“allkeys.txt”)

sorted_words = sorted(words, key=c.sort_key)

allkeys.txt (1 MB) is available at

http://www.unicode.org/Public/UCA/latest/allkeys.txt

but you can always subset this for just the characters you are dealing with.

Classes

`Collator`(filename)
`Trie`()