The conversion chain is an ordered list of Conversion objects. Each node uses a dictionary to replace segments with target values through longest-prefix matching.
This design supports advanced scenarios such as phrase priority, variant character replacement, and multi-stage composition.
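To make the chain idea concrete, here is a minimal sketch of longest-prefix replacement applied stage by stage. It is not the library's implementation: `PrefixTable`, `ConvertOnce`, and `ConvertChain` are hypothetical names, the dictionary is a plain `std::map`, and unmatched input is copied one byte at a time for simplicity.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a dictionary: key -> replacement value.
using PrefixTable = std::map<std::string, std::string>;

// One conversion stage: scan left to right and replace the longest key that
// is a prefix of the remaining input. Copy one byte when nothing matches.
std::string ConvertOnce(const PrefixTable& table, const std::string& input) {
  std::size_t maxKeyLen = 0;
  for (const auto& entry : table) {
    maxKeyLen = std::max(maxKeyLen, entry.first.size());
  }

  std::string output;
  std::size_t pos = 0;
  while (pos < input.size()) {
    bool matched = false;
    // Try the longest candidate first, then shrink it.
    for (std::size_t len = std::min(maxKeyLen, input.size() - pos); len > 0;
         --len) {
      const auto it = table.find(input.substr(pos, len));
      if (it != table.end()) {
        output += it->second;
        pos += len;
        matched = true;
        break;
      }
    }
    if (!matched) {
      output += input[pos];
      ++pos;
    }
  }
  return output;
}

// The chain: each stage consumes the output of the previous stage.
std::string ConvertChain(const std::vector<PrefixTable>& chain,
                         std::string text) {
  for (const auto& table : chain) {
    text = ConvertOnce(table, text);
  }
  return text;
}
```

With a table holding both `a` and `ab`, the input `abac` is rewritten using `ab` first and `a` second, which is how a phrase entry takes priority over a shorter single-character entry.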
TextDict (.txt) builds dictionaries from tab-delimited plain text; MarisaDict (.ocd2) provides high-performance trie structures; DictGroup can compose multiple dictionaries into a sequential collection.
SerializableDict defines serialization and file loading logic, which command-line tools use to convert between different formats.
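As an illustration of the tab-delimited text format, the sketch below reads `key<TAB>value` lines into a map. It is an assumption-laden example rather than the TextDict loader: `ParseTabDelimited` is a hypothetical helper, the value field is kept as one unparsed string, and lines without a tab are silently skipped.

```cpp
#include <cstddef>
#include <istream>
#include <map>
#include <sstream>
#include <string>

// Hypothetical helper: read "key<TAB>value" lines into an ordered map.
std::map<std::string, std::string> ParseTabDelimited(std::istream& in) {
  std::map<std::string, std::string> entries;
  std::string line;
  while (std::getline(in, line)) {
    // Tolerate Windows line endings.
    if (!line.empty() && line.back() == '\r') {
      line.pop_back();
    }
    const std::size_t tab = line.find('\t');
    // Skip blank lines, lines with no tab, and lines with an empty key.
    if (tab == std::string::npos || tab == 0) {
      continue;
    }
    entries[line.substr(0, tab)] = line.substr(tab + 1);
  }
  return entries;
}
```

Any `std::istream` works as input, so the same function can be fed a `std::ifstream` for a file or a `std::istringstream` for an in-memory sample.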
API Encapsulation
SimpleConverter (high-level C++ interface) encapsulates Config + Converter, providing various overloads for string, pointer buffer, and partial length conversion.
The command-line program opencc (src/tools/CommandLine.cpp) demonstrates batch conversion, stream reading, auto-flushing, and same-file input/output handling.
Dictionary
Interface
Dict: Declares Match and related functions.
SerializableDict: Declares dictionary serialization and deserialization functions.
Implementations
TextDict: Tab-separated plain-text dictionary format (.txt).
BinaryDict: Stores keys and values in binary format. For serialization only.
DartsDict: Double-array trie (.ocd).
MarisaDict: Marisa trie (.ocd2).
DictGroup: A wrapper around a group of dictionaries. Queries each dictionary in order until one returns a match.
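The interface and group relationship listed above can be sketched as follows. All class names here (`SketchDict`, `SketchMapDict`, `SketchDictGroup`) are hypothetical, and `Match` is simplified to an exact-key lookup that returns a null pointer on a miss; the real Dict interface declares Match and related functions whose signatures are not reproduced here.

```cpp
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical minimal interface, simplified to exact-key lookup.
class SketchDict {
public:
  virtual ~SketchDict() = default;
  // Returns the value for the key, or nullptr when the key is absent.
  virtual const std::string* Match(const std::string& key) const = 0;
};

// A single dictionary backed by an in-memory map.
class SketchMapDict : public SketchDict {
public:
  explicit SketchMapDict(std::map<std::string, std::string> entries)
      : entries_(std::move(entries)) {}

  const std::string* Match(const std::string& key) const override {
    const auto it = entries_.find(key);
    return it == entries_.end() ? nullptr : &it->second;
  }

private:
  std::map<std::string, std::string> entries_;
};

// A group is itself a dictionary: it asks each member in order and
// returns the first match, so earlier members take precedence.
class SketchDictGroup : public SketchDict {
public:
  explicit SketchDictGroup(std::vector<std::shared_ptr<SketchDict>> dicts)
      : dicts_(std::move(dicts)) {}

  const std::string* Match(const std::string& key) const override {
    for (const auto& dict : dicts_) {
      if (const std::string* value = dict->Match(key)) {
        return value;
      }
    }
    return nullptr;
  }

private:
  std::vector<std::shared_ptr<SketchDict>> dicts_;
};
```

Because the group implements the same interface as its members, a conversion stage can hold either a single dictionary or a group without knowing the difference, and placing a phrase dictionary ahead of a character dictionary gives phrases precedence.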