Welcome | Get started | Dive | Contribute | Topics | Reference | Changes | More

lino.utils.pyuca

Preliminary implementation of the Unicode Collation Algorithm.

This only implements the simple parts of the algorithm but I have successfully tested it using the Default Unicode Collation Element Table (DUCET) to collate Ancient Greek correctly.

Usage example:

from pyuca import Collator c = Collator(“allkeys.txt”)

sorted_words = sorted(words, key=c.sort_key)

allkeys.txt (1 MB) is available at

but you can always subset this for just the characters you are dealing with.

Classes

Collator(filename)

Trie()