Python - urllib.request.Request()


HTTP is generally based on requests and responses - the client makes requests and servers send responses. 

urllib.request mirrors this with a Request object which represents the HTTP request you are making. 

In its simplest form, you create a Request object that specifies the URL you want to fetch. 
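As a minimal sketch of this simplest form, the snippet below builds a Request from a URL alone and inspects a few of its attributes. The URL is a placeholder chosen for illustration; nothing is sent over the network until the object is passed to urlopen.

```python
import urllib.request

# Placeholder URL, assumed for illustration only.
req = urllib.request.Request('http://www.example.com/index.html')

# Creating the object only parses the URL; no connection is opened here.
print(req.full_url)      # http://www.example.com/index.html
print(req.type)          # http
print(req.host)          # www.example.com
print(req.selector)      # /index.html
print(req.get_method())  # GET
```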

The urllib.request.Request class is an abstraction of a URL request.

Recall that abstraction means hiding unnecessary detail and exposing only the relevant data about an object, which reduces the complexity the caller has to deal with.

The class urllib.request.Request is defined as:

class urllib.request.Request(url, data=None, headers={}, origin_req_host=None, unverifiable=False, method=None)


  1. url should be a string containing a valid URL.
  2. data must be an object specifying additional data to send to the server, or None if no such data is needed. Currently, HTTP requests are the only ones that use data. The supported object types include bytes, file-like objects, and iterables of bytes-like objects. For an HTTP POST request, data should be a buffer in the standard application/x-www-form-urlencoded format.
  3. headers should be a dictionary and will be treated as if add_header() was called with each key and value as arguments. This is often used to "spoof" the User-Agent header value, which a browser uses to identify itself, because some HTTP servers only allow requests coming from common browsers as opposed to scripts.
  4. origin_req_host should be the request-host of the origin transaction, as defined by RFC 2965. It defaults to http.cookiejar.request_host(self). This is the host name or IP address of the original request that was initiated by the user. For example, if the request is for an image in an HTML document, this should be the request-host of the request for the page containing the image.
  5. unverifiable should indicate whether the request is unverifiable, as defined by RFC 2965. It defaults to False. An unverifiable request is one whose URL the user did not have the option to approve. For example, if the request is for an image in an HTML document, and the user had no option to approve the automatic fetching of the image, this should be true.
  6. method should be a string that indicates the HTTP request method that will be used (e.g. 'HEAD'). If provided, its value is stored in the method attribute and is used by get_method(). The default is 'GET' if data is None or 'POST' otherwise. Subclasses may indicate a different default method by setting the method attribute in the class itself.
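The defaults described in the list above can be checked without touching the network. The following is a rough sketch: the URL and the host name blog.example.org are hypothetical values picked for illustration, not taken from any real service.

```python
import urllib.request

# Hypothetical image URL, used only to illustrate the parameters.
url = 'http://www.example.com/images/logo.png'

# No data and no method: get_method() falls back to 'GET'.
plain = urllib.request.Request(url)

# Supplying data switches the default method to 'POST'.
with_data = urllib.request.Request(url, data=b'key=value')

# An explicit method overrides either default.
head = urllib.request.Request(url, method='HEAD')

# An image fetched automatically on behalf of a page served from another
# host (assumed here to be blog.example.org), without the user approving it.
embedded = urllib.request.Request(
    url,
    origin_req_host='blog.example.org',
    unverifiable=True,
)

print(plain.get_method(), with_data.get_method(), head.get_method())
print(plain.origin_req_host)     # derived from the URL when not given
print(embedded.origin_req_host)  # the value passed in
print(plain.unverifiable, embedded.unverifiable)
```

Note that when origin_req_host is omitted, the request's own host is used, which matches the case where the user requested the URL directly.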

The following code demonstrates how to use the class.

Calling urlopen with this Request object returns a response object for the requested URL. This response is a file-like object, which means you can, for example, call .read() on the response:

>>> import urllib.request
>>> req = urllib.request.Request('')
>>> with urllib.request.urlopen(req) as response:
...     the_page = response.read()
...
>>> print(the_page)


This simply prints the content returned for the requested URL (for a web page, its HTML) as a bytes object.

Note that urllib.request makes use of the same Request interface to
handle all URL schemes. For example, you can make an FTP request like so:

>>> req = urllib.request.Request('')
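To see the same interface working with a non-HTTP scheme without needing a server, here is a small sketch that uses a data: URL instead of FTP. The data: scheme is a substitution made here so the example runs offline; the URL content is made up for illustration.

```python
import urllib.request

# A data: URL carries its payload inline, so no network access is needed.
# The payload "Hello%20world" is an illustrative value.
req = urllib.request.Request('data:text/plain;charset=utf-8,Hello%20world')

# The same urlopen call used for HTTP handles this scheme as well.
with urllib.request.urlopen(req) as response:
    body = response.read()

print(req.type)  # data
print(body)      # b'Hello world'
```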

In the case of HTTP, there are two extra things that Request objects allow you to do: 

  1. First, you can pass data to be sent to the server.

  2. Second, you can pass extra information ("metadata") about the data or about the request itself, to the server - this information is sent as HTTP "headers".
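The two points above might look like the following in practice. This is only a sketch: the URL, the form fields, and the User-Agent string are assumptions made for illustration, and the request is built but deliberately not sent.

```python
import urllib.parse
import urllib.request

# Hypothetical endpoint and form values, for illustration only.
url = 'http://www.example.com/cgi-bin/register'
form = {'name': 'Ada Lovelace', 'language': 'Python'}

# Encode the form as application/x-www-form-urlencoded, then to bytes,
# because Request expects bytes rather than str for data.
data = urllib.parse.urlencode(form).encode('ascii')

# Extra metadata about the request, sent as HTTP headers.
headers = {'User-Agent': 'Mozilla/5.0 (compatible; ExampleClient/1.0)'}

req = urllib.request.Request(url, data=data, headers=headers)

print(req.data)                      # b'name=Ada+Lovelace&language=Python'
print(req.get_method())              # POST, because data is not None
print(req.get_header('User-agent'))  # header names are stored capitalized
```

Passing req to urllib.request.urlopen would then send the POST request with these headers.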

These two parameters are discussed briefly in the next article.

