Python Urlretrieve


As we have seen previously, urllib.request is a Python module for opening URLs and fetching their HTML content.

A simple code demonstration to open a URL using urllib.request.urlopen():

import urllib.request

# example.com is a stand-in for any URL you want to fetch
with urllib.request.urlopen('https://example.com') as response:
    res = response.read()

Printing res outputs the raw HTML content of the page (as bytes; call res.decode() to get a string).
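Since urlopen() returns the body as bytes, decoding is usually needed before printing or parsing. A minimal sketch, where a data: URL stands in for a real web address so it runs offline; an http(s) URL works the same way:

```python
import urllib.request

# A data: URL stands in for a real web address so this runs offline
with urllib.request.urlopen("data:text/html;charset=utf-8,<p>Hello</p>") as response:
    raw = response.read()        # bytes
    html = raw.decode("utf-8")   # str, ready to print or parse

print(html)  # prints <p>Hello</p>
```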

But what if we want to extract resources (for example we want to fetch all the images present in a particular URL)?

For this purpose, the urllib.request module provides a function, urlretrieve(), that can be used to download resources from any URL on the web. Its signature is:

urllib.request.urlretrieve(url, filename=None, reporthook=None, data=None)

  • url: Pass in the URL whose content needs to be fetched
  • filename: The local file location to save the downloaded resource to. If omitted, urlretrieve() saves to a temporary file with an automatically generated name. If a filename is passed and the URL points to a local resource, the result is a copy from the local file to the new file.
  • reporthook: A callable that accepts the block number transferred so far, the block size in bytes, and the total size of the file; it is called once when the connection is established and then once after each block is read.
  • data: Additional data to send to the server; if supplied, it should be a bytes object of valid URL-encoded data, and the request is made as a POST.
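A minimal sketch tying these parameters together. It uses a local file:// URL so it runs without network access, and the file names are made up for illustration:

```python
import pathlib
import tempfile
import urllib.request

# Create a small local file to act as the download source (illustrative content)
src = pathlib.Path(tempfile.gettempdir()) / "urlretrieve_demo_src.html"
src.write_text("<html><body>hello</body></html>")

# A reporthook receives the block number, the block size, and the total size in bytes
def progress(block_num, block_size, total_size):
    print(f"block {block_num}: up to {block_num * block_size} of {total_size} bytes")

# Copy the resource to an explicit filename; returns (filename, headers)
dest = pathlib.Path(tempfile.gettempdir()) / "urlretrieve_demo_copy.html"
filename, headers = urllib.request.urlretrieve(src.as_uri(), str(dest), reporthook=progress)

print(filename)           # the path we asked for
print(dest.read_text())   # the copied content
```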

Note: urlretrieve() raises ContentTooShortError when it detects that less data was received than expected (the size reported by the Content-Length header). This can occur, for example, when a download is interrupted.
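A defensive pattern for this, wrapped in a hypothetical helper (ContentTooShortError lives in urllib.error):

```python
import urllib.error
import urllib.request

def safe_retrieve(url, filename):
    """Download url to filename; return True on success, False if the
    transfer delivered fewer bytes than the Content-Length header promised."""
    try:
        urllib.request.urlretrieve(url, filename)
        return True
    except urllib.error.ContentTooShortError:
        print(f"Download of {url} was incomplete")
        return False
```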

Now let's see urlretrieve() in action through the following code example.

We are going to fetch the top 3 images of Baby Yoda from the subreddit r/BabyYoda.

from bs4 import BeautifulSoup
import urllib.request

# Setting URL destination
url = "https://www.reddit.com/r/BabyYoda/"

# Fetching URL
response = urllib.request.urlopen(url)

# Checking status code (if you get 502, try rerunning the code)
if response.getcode() != 200:
    print(f"Status: {response.getcode()} - try rerunning the code\n")
else:
    print(f"Status: {response.getcode()}\n")

# Using BeautifulSoup to parse the response object
soup = BeautifulSoup(response, "html.parser")

# Finding post images in the soup
images = soup.find_all("img", attrs={"alt": "Post image"})

# Downloading the first three images
number = 1
for image in images[:3]:
    image_src = image["src"]
    urllib.request.urlretrieve(image_src, str(number))
    print(f"Image {number}: {image_src}")
    number += 1


Since an explicit filename is passed, the images are saved in the current working directory as files named 1, 2, and 3.

Status: 200

Image 1:

Image 2:

Image 3:
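When no filename is supplied, urlretrieve() downloads to a temporary file and returns its path, and urllib.request.urlcleanup() removes any such temporaries. A small sketch; a local file:// URL is used so it runs offline (note that for local files urlretrieve() returns the original path without copying):

```python
import pathlib
import tempfile
import urllib.request

# A local file stands in for a remote resource (illustrative content)
src = pathlib.Path(tempfile.gettempdir()) / "cleanup_demo.txt"
src.write_text("payload")

# With no filename argument, urlretrieve() returns (path, headers);
# for remote URLs the path is an auto-generated temporary file
path, headers = urllib.request.urlretrieve(src.as_uri())
print(path)

# Delete any temporary files left behind by earlier urlretrieve() calls
urllib.request.urlcleanup()
```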