Python codecs Library Error Handlers

Error handlers tell a codec how to deal with errors that occur during encoding and decoding.
The codecs module defines several error handling schemes to simplify and standardize this.
A scheme is selected by passing its name as the 'errors' string argument to the encoding and decoding functions in this library.
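
As a minimal sketch of how the 'errors' argument is passed, the snippet below uses codecs.encode and codecs.decode. The sample text and the variable names are chosen only for illustration.

```python
import codecs

text = "caf\u00e9"  # "café"; the last character is not ASCII

# The third positional argument is the 'errors' scheme name.
utf8_bytes = codecs.encode(text, "utf-8", "strict")
ascii_replaced = codecs.encode(text, "ascii", "replace")
ascii_ignored = codecs.decode(b"caf\xe9", "ascii", "ignore")

print(utf8_bytes)      # b'caf\xc3\xa9'
print(ascii_replaced)  # b'caf?'
print(ascii_ignored)   # caf
```

The str.encode and bytes.decode methods accept the same scheme names through their own errors parameter.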

The following error handlers can be used with all Python standard encodings:

1. "strict":
When this is passed as the 'errors' argument, the function raises UnicodeError (or one of its subclasses) when it meets invalid data.
It is implemented by the function 'codecs.strict_errors(exception)' in this module.
This is the default error handling scheme.

2. "ignore":
When this is passed as the 'errors' argument, the malformed data is ignored and processing continues without further notice.
It is implemented by the function 'codecs.ignore_errors(exception)' in this module.

3. "replace":
When this is passed as the 'errors' argument, each error is replaced with a replacement marker. On encoding, the marker is '?' (an ASCII character). On decoding, the marker is '�' (U+FFFD), the official Unicode REPLACEMENT CHARACTER.
It is implemented by the function 'codecs.replace_errors(exception)' in this module.

4. "backslashreplace":
When this is passed as the 'errors' argument, each error is replaced with a backslashed escape sequence. On encoding, the hexadecimal form of the Unicode code point is used, in the format "\xhh", "\uxxxx" or "\Uxxxxxxxx". On decoding, the hexadecimal form of the byte value is used, in the format "\xhh".
It is implemented by the function 'codecs.backslashreplace_errors(exception)' in this module.

5. "surrogateescape":
When this is passed as the 'errors' argument, undecodable bytes are preserved rather than lost. On decoding, each such byte is replaced with an individual surrogate code point in the range "U+DC80" to "U+DCFF". That code point is turned back into the same byte when the 'surrogateescape' error handler is used to encode the data.
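
The five handlers above can be compared side by side. The following is a small sketch, not a complete reference: the sample string, the sample bytes and all variable names are illustrative assumptions, and results are printed with ascii() so that the output does not depend on the console encoding.

```python
text = "na\u00efve \u2603"  # "naïve ☃": two characters that ASCII cannot encode
raw = b"abc\xff"            # the byte 0xFF is never valid in UTF-8

# "strict" is the default, so omitting 'errors' raises a UnicodeError subclass.
try:
    text.encode("ascii")
    strict_raised = False
except UnicodeEncodeError:
    strict_raised = True

# Encoding with the lossy handlers.
enc_ignore = text.encode("ascii", errors="ignore")
enc_replace = text.encode("ascii", errors="replace")
enc_backslash = text.encode("ascii", errors="backslashreplace")

# Decoding with the same handlers.
dec_ignore = raw.decode("utf-8", errors="ignore")
dec_replace = raw.decode("utf-8", errors="replace")
dec_backslash = raw.decode("utf-8", errors="backslashreplace")

# "surrogateescape" keeps the bad byte as a lone surrogate and restores it later.
dec_escaped = raw.decode("utf-8", errors="surrogateescape")
round_trip = dec_escaped.encode("utf-8", errors="surrogateescape")

results = {
    "strict raised": strict_raised,
    "encode/ignore": enc_ignore,
    "encode/replace": enc_replace,
    "encode/backslashreplace": enc_backslash,
    "decode/ignore": dec_ignore,
    "decode/replace": dec_replace,
    "decode/backslashreplace": dec_backslash,
    "decode/surrogateescape": dec_escaped,
    "round trip": round_trip,
}
for label, value in results.items():
    print(f"{label:26} {ascii(value)}")
```

Only 'surrogateescape' gives back the original bytes. The other non-strict handlers lose or rewrite the offending data.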

Some error handlers apply only when encoding data:

 1. "xmlcharrefreplace":
 This is only used with encoding functions. When passed as the 'errors' argument, it replaces each error with an XML/HTML numeric character reference, which is the decimal form of the Unicode code point in the format '&#num;'.
 It is implemented by the function 'codecs.xmlcharrefreplace_errors(exception)' in this module.

 2. "namereplace":
 This is only used with encoding functions. When passed as the 'errors' argument, it replaces each error with a '\N{…}' escape sequence, where the text in the braces is the Name property from the Unicode Character Database.
 It is implemented by the function 'codecs.namereplace_errors(exception)' in this module.
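
The two encoding-only handlers can be sketched as follows. The euro sign sample and the variable names are illustrative, and html.unescape is used here only as one convenient way to show that the numeric reference can be read back.

```python
import html

text = "\u20ac5"  # "€5"; the euro sign is not ASCII

as_xml = text.encode("ascii", errors="xmlcharrefreplace")
as_name = text.encode("ascii", errors="namereplace")

print(as_xml)   # b'&#8364;5'
print(as_name)  # b'\\N{EURO SIGN}5'

# An HTML-aware reader can turn the numeric reference back into the character.
restored = html.unescape(as_xml.decode("ascii"))

# Using an encoding-only handler while decoding is rejected.
try:
    b"\xff".decode("utf-8", errors="xmlcharrefreplace")
    decode_rejected = False
except TypeError:
    decode_rejected = True
```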

One further error handler is specific to certain codecs:

1. "surrogatepass":
This error handler is specific to the following codecs: utf-8, utf-16, utf-32, utf-16-be, utf-16-le, utf-32-be, utf-32-le.
It allows surrogate code points (U+D800 to U+DFFF) to be encoded and decoded like normal code points. Without it, these codecs treat a surrogate code point in a str as an error.
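
A short sketch of 'surrogatepass' with the utf-8 codec is shown below. The lone surrogate U+D800 is picked arbitrarily from the range, and the variable names are illustrative.

```python
lone = "\ud800"  # a lone surrogate code point

# Without the handler, utf-8 refuses to encode a surrogate.
try:
    lone.encode("utf-8")
    encode_failed = False
except UnicodeEncodeError:
    encode_failed = True

# With "surrogatepass" the surrogate is encoded like any other code point.
passed = lone.encode("utf-8", errors="surrogatepass")
print(passed)  # b'\xed\xa0\x80'

# The same applies in the other direction.
try:
    passed.decode("utf-8")
    decode_failed = False
except UnicodeDecodeError:
    decode_failed = True

restored = passed.decode("utf-8", errors="surrogatepass")
print(ascii(restored))  # '\ud800'
```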
