Python codecs Library: Error Handler Functions
The purpose of this module is encoding and decoding, i.e. converting text between different representations.
This module defines base classes for standard Python codecs (encoders and decoders) and provides access to the internal Python codec registry, which manages the codec and error handling lookup process.
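As a minimal sketch of the registry access described above, the snippet below looks up a codec and an error handler by name. It assumes a standard Python 3 interpreter; the variable names are illustrative only.

```python
import codecs

# Look up a codec by name; the registry returns a CodecInfo object.
info = codecs.lookup("utf-8")
print(info.name)

# Look up an error handler by the name of its scheme.
# The result is a callable that accepts a UnicodeError instance.
handler = codecs.lookup_error("replace")
print(callable(handler))
```

Both lookups raise LookupError when the requested name is not registered.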
Some functions for implementing the error handling schemes are:
1. codecs.strict_errors(exception):
This function implements the 'strict' error handling mechanism.
In the 'strict' error handling scheme, every encoding or decoding error raises UnicodeError or one of its subclasses.
2. codecs.ignore_errors(exception):
This function implements the 'ignore' error handling mechanism.
In the 'ignore' error handling scheme, the malformed data is ignored and processing continues without further notice.
3. codecs.replace_errors(exception):
This function implements the 'replace' error handling scheme.
In the 'replace' error handling scheme, the malformed data is replaced with a replacement marker. On encoding, the marker is '?'; on decoding, it is U+FFFD ('�'), the official Unicode replacement character.
4. codecs.backslashreplace_errors(exception):
This function implements the 'backslashreplace' error handling scheme.
In the 'backslashreplace' error handling scheme, the malformed data is replaced with backslashed escape sequences. On encoding, it uses the hexadecimal form of the Unicode code point in the format '\xhh', '\uxxxx' or '\Uxxxxxxxx'. On decoding, it uses the hexadecimal form of the byte value in the format '\xhh'.
5. codecs.xmlcharrefreplace_errors(exception):
This function implements the 'xmlcharrefreplace' error handling scheme.
In the 'xmlcharrefreplace' error handling scheme, which applies to encoding only, each unencodable character is replaced with an XML/HTML numeric character reference: the decimal form of its Unicode code point in the format '&#num;'.
6. codecs.namereplace_errors(exception):
This function implements the 'namereplace' error handling scheme.
In the 'namereplace' error handling scheme, which applies to encoding only, each unencodable character is replaced with a '\N{…}' escape sequence; the text in the braces is the Name property from the Unicode Character Database.
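The schemes listed above can be compared side by side by passing their names through the errors argument of str.encode() and bytes.decode(). The following is a small sketch, not a definitive reference: the sample text, the byte string and the variable names are assumptions chosen for illustration.

```python
# Sample text with two characters that ASCII cannot represent.
text = "caf\u00e9 \u2603"  # U+00E9 (e with acute), U+2603 (snowman)

# Encode with every non-strict scheme and collect the results.
encode_schemes = ("ignore", "replace", "backslashreplace",
                  "xmlcharrefreplace", "namereplace")
encoded = {s: text.encode("ascii", errors=s) for s in encode_schemes}
for scheme, result in encoded.items():
    print(f"{scheme:18} {result!r}")

# 'strict' raises instead of returning a result.
try:
    text.encode("ascii", errors="strict")
except UnicodeEncodeError as exc:
    print("strict raised:", type(exc).__name__)

# Decoding: the byte 0xFF is never valid in UTF-8.
raw = b"abc\xffdef"
decode_schemes = ("ignore", "replace", "backslashreplace")
decoded = {s: raw.decode("utf-8", errors=s) for s in decode_schemes}
for scheme, result in decoded.items():
    print(f"{scheme:18} {result!r}")
```

Only three schemes appear in the decoding half because 'xmlcharrefreplace' and 'namereplace' are defined for encoding only.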
Example of how these error mechanisms are implemented:
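The sketch below calls the handler functions directly, which is roughly what a codec does when it hits a character it cannot encode. The UnicodeEncodeError is built by hand here as an assumption for illustration; in normal use the codec creates it and passes it to the handler. Each non-strict handler returns a tuple of the replacement text and the position at which encoding should resume.

```python
import codecs

# Hand-built exception for illustration:
# UnicodeEncodeError(encoding, object, start, end, reason)
text = "caf\u00e9"  # U+00E9 at index 3 is not encodable in ASCII
exc = UnicodeEncodeError("ascii", text, 3, 4, "ordinal not in range(128)")

handlers = {
    "ignore": codecs.ignore_errors,
    "replace": codecs.replace_errors,
    "backslashreplace": codecs.backslashreplace_errors,
    "xmlcharrefreplace": codecs.xmlcharrefreplace_errors,
    "namereplace": codecs.namereplace_errors,
}

# Call each handler and record its (replacement, resume_position) result.
results = {name: handler(exc) for name, handler in handlers.items()}
for name, result in results.items():
    print(f"{name:18} {result!r}")

# strict_errors re-raises the exception it receives.
strict_raised = False
try:
    codecs.strict_errors(exc)
except UnicodeEncodeError:
    strict_raised = True
print("strict raised:", strict_raised)
```

The resume position is 4 in every case because the failing slice of the input is text[3:4].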