Name: DFA UTF-8 Decoder
Short Name: utf8-decoder
URL: http://bjoern.hoehrmann.de/utf-8/decoder/dfa/
Version: N/A
Date: 2010-06-25
License: MIT
License File: LICENSE
Security Critical: yes
Shipped: yes

Description:
Decodes UTF-8 bytes using a fast and simple deterministic finite automaton
(DFA).

Local modifications:
- The rejection state has been mapped to row 0 (instead of row 1) of the DFA,
  saving about 50 bytes and making the table easier to reason about.
- The transitions have been remapped so that each entry represents both a
  state transition and a bit mask for the incoming byte.
- The caller must now zero out the code point buffer after every state
  transition that ends a sequence, whether it succeeded or failed.
- Specifically for generalized-utf8-decoder.h: the original decoder is adapted
  to decode and validate "generalized UTF-8", a variant of UTF-8 used in WTF-8
  that can encode surrogates. See
  https://simonsapin.github.io/wtf-8/#generalized-utf8. There is one fewer
  state, so the transition table is smaller by one in both dimensions.
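
Example (illustrative only):
The sketch below is not the shipped decoder and does not reproduce its table. It
is a minimal, hand-written DFA for strict UTF-8 that assumes its own state
numbering, byte classes, and helper names (State, ByteClass, Classify,
PayloadMask, Step, DecodeAll are all hypothetical). It is meant only to show two
of the ideas listed above: placing the rejection state in row 0, and leaving it
to the caller to zero the code point buffer once a sequence is accepted or
rejected.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical states. Row 0 is the rejection state, so "0" means "invalid".
enum State : uint8_t {
  kReject = 0, kAccept, kOneMore, kTwoMore, kE0, kED, kF0, kThreeMore, kF4,
  kNumStates
};

// Hypothetical byte classes (not the classes used by the real table).
enum ByteClass : uint8_t {
  kAscii = 0,  // 00..7F
  kCont80,     // 80..8F
  kCont90,     // 90..9F
  kContA0,     // A0..BF
  kLead2,      // C2..DF
  kLeadE0,     // E0
  kLead3,      // E1..EC, EE..EF
  kLeadED,     // ED
  kLeadF0,     // F0
  kLead4,      // F1..F3
  kLeadF4,     // F4
  kInvalid,    // C0, C1, F5..FF
  kNumClasses
};

ByteClass Classify(uint8_t b) {
  if (b < 0x80) return kAscii;
  if (b < 0x90) return kCont80;
  if (b < 0xA0) return kCont90;
  if (b < 0xC0) return kContA0;
  if (b < 0xC2) return kInvalid;
  if (b < 0xE0) return kLead2;
  if (b == 0xE0) return kLeadE0;
  if (b == 0xED) return kLeadED;
  if (b < 0xF0) return kLead3;
  if (b == 0xF0) return kLeadF0;
  if (b < 0xF4) return kLead4;
  if (b == 0xF4) return kLeadF4;
  return kInvalid;
}

// kTransitions[state][byte class] -> next state. 0 is kReject.
const uint8_t kTransitions[kNumStates][kNumClasses] = {
    // asc 80 90 A0 L2 E0 L3 ED F0 L4 F4 inv
    {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0},  // kReject
    {1, 0, 0, 0, 2, 4, 3, 5, 6, 7, 8, 0},  // kAccept
    {0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0},  // kOneMore
    {0, 2, 2, 2, 0, 0, 0, 0, 0, 0, 0, 0},  // kTwoMore
    {0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0},  // kE0: rejects overlong forms
    {0, 2, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0},  // kED: rejects surrogates
    {0, 0, 3, 3, 0, 0, 0, 0, 0, 0, 0, 0},  // kF0: rejects overlong forms
    {0, 3, 3, 3, 0, 0, 0, 0, 0, 0, 0, 0},  // kThreeMore
    {0, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0},  // kF4: rejects > U+10FFFF
};

// Payload bits carried by a byte of the given class.
uint32_t PayloadMask(ByteClass c) {
  switch (c) {
    case kAscii: return 0x7F;
    case kCont80: case kCont90: case kContA0: return 0x3F;
    case kLead2: return 0x1F;
    case kLeadE0: case kLead3: case kLeadED: return 0x0F;
    default: return 0x07;  // 4-byte leads; invalid bytes are rejected anyway
  }
}

// Always shifts into *buffer, so the caller must zero it between sequences.
State Step(State state, uint8_t byte, uint32_t* buffer) {
  ByteClass c = Classify(byte);
  *buffer = (*buffer << 6) | (byte & PayloadMask(c));
  return static_cast<State>(kTransitions[state][c]);
}

// Simplified driver: emits U+FFFD per rejected byte or truncated sequence.
std::vector<uint32_t> DecodeAll(const std::string& bytes) {
  std::vector<uint32_t> out;
  State state = kAccept;
  uint32_t buffer = 0;
  for (unsigned char b : bytes) {
    state = Step(state, b, &buffer);
    if (state == kAccept) {
      out.push_back(buffer);
      buffer = 0;  // caller zeroes the buffer after success
    } else if (state == kReject) {
      out.push_back(0xFFFD);
      buffer = 0;  // ... and after failure
      state = kAccept;
    }
  }
  if (state != kAccept) out.push_back(0xFFFD);
  return out;
}
```

In this sketch, DecodeAll("\xE2\x82\xAC") yields the single code point U+20AC,
while the surrogate encoding "\xED\xA0\x80" is rejected by the kED row. A
generalized UTF-8 variant of the sketch could treat ED like any other three-byte
lead, which would remove one state and one byte class; that is consistent with
the note above about the table shrinking by one in both dimensions, though the
real table's layout may differ.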