Python codecs Library
The purpose of this module is encoding and decoding, i.e. converting text between different representations.
This module defines base classes for standard Python codecs (encoders and decoders) and provides access to the internal Python codec registry, which manages the codec and error handling lookup process.
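As a quick illustration of the registry lookup described above, here is a minimal sketch. The sample string and the choice of UTF-8 are assumptions made only for this example.

```python
import codecs

# Sample text chosen for illustration; it contains one non-ASCII character.
text = "café"

# Encode str -> bytes and decode bytes -> str through the codecs module.
data = codecs.encode(text, "utf-8")
restored = codecs.decode(data, "utf-8")

# Look up a codec in the registry by one of its aliases.
# The returned CodecInfo object carries the codec's normalized name.
info = codecs.lookup("UTF8")

print(data)       # the UTF-8 bytes of the text
print(restored)   # the original string again
print(info.name)  # the normalized codec name
```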
Two of the functions defined in this module are covered here:
1. codecs.open(filename, mode='r', encoding=None, errors='strict', buffering=-1):
2. codecs.EncodedFile(file, data_encoding, file_encoding=None, errors='strict'):
codecs.open(filename, mode='r', encoding=None, errors='strict', buffering=-1):
This function opens an encoded file using the given mode (for example read or write, passed as the second argument).
It returns an instance of StreamReaderWriter, providing transparent encoding or decoding.
The function takes the following arguments:
The first argument is the name of the file which you want to open using this method.
The second argument is the mode in which you want to open the file. It defaults to 'r', i.e. 'read', which opens the file for reading. Note that when an encoding is given, the underlying encoded file is always opened in binary mode, so no automatic conversion of '\n' is done on reading and writing. The mode may be any binary mode acceptable to the built-in open() function; the 'b' is added to it automatically. Opening the file in binary mode avoids the data loss that may occur when dealing with encodings that use 8-bit values.
The different modes that can be passed are:
'r': open file for reading
'w': open file for writing
'rb': open file in binary format for reading.
'wb': open file in binary format for writing.
The third argument is encoding. This specifies the encoding which is to be used for the file. Any encoding that encodes to and decodes from bytes is allowed and the data types supported by the file methods depend on the codec used.
The fourth argument, errors, defines how encoding and decoding errors are handled. It defaults to 'strict', which causes a ValueError (or a subclass such as UnicodeDecodeError) to be raised when an encoding error occurs.
The last argument, buffering, has the same meaning as for the built-in open() function: it controls the size of the buffer used to hold chunks of the file in memory during reads and writes. It defaults to -1, which means that the default buffer size is used.
EXAMPLE:
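The following is a minimal sketch of a write and read round trip with codecs.open(). The file name sample.txt, the temporary directory, and the sample text are assumptions made only for this illustration.

```python
import codecs
import os
import shutil
import tempfile

# Hypothetical file inside a temporary directory, used only for this sketch.
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "sample.txt")

text = "naïve café\n"

# Writing: the str is transparently encoded to UTF-8 bytes.
with codecs.open(path, mode="w", encoding="utf-8") as f:
    f.write(text)

# Reading: the bytes are transparently decoded back to str.
with codecs.open(path, mode="r", encoding="utf-8") as f:
    restored = f.read()

# Inspect the raw bytes stored on disk with the built-in open().
with open(path, "rb") as raw:
    raw_bytes = raw.read()

# With errors='strict', decoding with the wrong codec raises a ValueError
# subclass, because the file contains bytes outside the ASCII range.
try:
    with codecs.open(path, mode="r", encoding="ascii", errors="strict") as f:
        f.read()
    error_name = None
except ValueError as exc:
    error_name = type(exc).__name__

# Remove the temporary directory created for the sketch.
shutil.rmtree(tmp_dir)

print(restored == text)
print(raw_bytes)
print(error_name)
```

In recent Python versions the built-in open() with an encoding argument covers the same use case and is generally the more common choice.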
codecs.EncodedFile(file, data_encoding, file_encoding=None, errors='strict'):
This function takes an open file handle using one encoding and wraps it with a class that translates the data to another encoding as the I/O occurs.
It returns a StreamRecoder instance, a wrapped version of the file which provides transparent transcoding. The original file is closed when the wrapped version is closed. All data written to the wrapped file is decoded according to the given data_encoding and then written to the original file as bytes using file_encoding. Bytes read from the original file are decoded according to file_encoding, and the result is encoded using data_encoding. If file_encoding is not provided, it defaults to data_encoding. The errors argument defines the error handling and defaults to 'strict', which causes a ValueError to be raised when an encoding error occurs.
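The transcoding described above can be sketched as follows. An in-memory io.BytesIO object stands in for a real file, and the pairing of UTF-8 (data side) with Latin-1 (file side) is an assumption chosen only for illustration.

```python
import codecs
import io

# In-memory binary "file" standing in for a real file on disk.
raw = io.BytesIO()

# The caller exchanges UTF-8 bytes; the underlying file stores Latin-1 bytes.
wrapped = codecs.EncodedFile(raw, data_encoding="utf-8", file_encoding="latin-1")

# Writing: UTF-8 bytes are decoded, then re-encoded as Latin-1 for the file.
wrapped.write("café".encode("utf-8"))
stored = raw.getvalue()

# Reading: Latin-1 bytes from the file are decoded, then returned as UTF-8.
raw.seek(0)
read_back = wrapped.read()

print(stored)     # Latin-1 bytes held by the underlying file
print(read_back)  # UTF-8 bytes handed back to the caller
```

Note that both sides of the wrapper deal in bytes: the wrapped file accepts and returns bytes in data_encoding, while the original file holds bytes in file_encoding.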