Python UrlLib - How to deal with 403 Forbidden Error



Introduction-


In this article, we look at why a 403 error occurs and the steps you can take to deal with it.


403 Forbidden Error-


1. This error arises when you successfully connect to the server and it understands your request, but it refuses to fulfil it.

2. Servers often do this because website owners do not want bots or programs to access their sites; they want only real users to use their services.

3. Hence, when you send a request from a Python program, many websites detect that it comes from a computer program rather than a real user and refuse to respond.
In this article, we will show you how you can bypass that detection and access the site.
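
In urllib, a refusal like this is raised as urllib.error.HTTPError, and the status is available on its code attribute. The following is a minimal sketch of checking for it. The helper name explain_http_error is hypothetical, and the error object is built by hand here (an assumption made so the example runs without network access).

```python
import io
import urllib.error
from email.message import Message


def explain_http_error(err):
    """Return a short description of an HTTPError (hypothetical helper)."""
    if err.code == 403:
        return "403 Forbidden: the server understood the request but refused it"
    return f"{err.code} {err.reason}"


# Build a 403 error by hand so that the sketch needs no network access.
# In a real program, urllib.request.urlopen() would raise this object.
sample_error = urllib.error.HTTPError(
    url="https://www.google.com/search?q=test",
    code=403,
    msg="Forbidden",
    hdrs=Message(),
    fp=io.BytesIO(b""),
)

print(str(sample_error))
print(explain_http_error(sample_error))
```

In a real program the same check would sit inside an except urllib.error.HTTPError block around the urlopen() call.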

Example-

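Consider the following code, which results in a 403 Forbidden error. It sends a search request to Google from a Python program. Note that although the class is named Post, urlopen() called without a data argument sends a GET request.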

import urllib.request

class Post():
    def __init__(self, url):
        self.url = url

    def post_method(self):
        try:
            # urlopen() without a data argument sends a GET request
            # using urllib's default User-Agent header
            with urllib.request.urlopen(self.url) as response:
                print(response.read())
        except Exception as e:
            # An HTTP error status such as 403 is reported here
            print(str(e))

def main():
    url = 'https://www.google.com/search?q=test'
    post_object = Post(url)
    post_object.post_method()

if __name__ == "__main__":
    main()


Output-

HTTP Error 403: Forbidden

As mentioned before, the site recognises that a computer program is making the request and returns a 403 error.
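
One likely reason the site can tell is that urllib announces itself in the User-Agent header by default. The short sketch below prints the default headers of an opener created with build_opener(); the variable names are only illustrative, and the exact version number in the value depends on your Python installation.

```python
import urllib.request

# build_opener() creates an opener with urllib's default handlers.
opener = urllib.request.build_opener()

# addheaders is a list of (name, value) pairs sent with every request.
default_headers = dict(opener.addheaders)
print(default_headers)
```

The printed User-agent value has the form Python-urllib/3.x, which is easy for a server to tell apart from a browser.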

How to overcome the error-


1. To make the request succeed, we have to make the site believe that a real user is accessing it. This can be done by sending a browser's User-Agent header with the request.

2. The User-Agent identifies the browser and operating system that made the request. We store the user-agent in a dictionary and pass it to the headers parameter of the Request class.

3. The following code gives you an idea of how to make the request.

Code-

import urllib.request

class ImprovedPost():
    def __init__(self, url, headers):
        self.url = url
        self.headers = headers

    def improved_post_method(self):
        try:
            # Build the request with the custom headers, then send it
            request = urllib.request.Request(self.url, headers=self.headers)
            with urllib.request.urlopen(request) as response:
                # Read the response body (bytes) from the site
                data = response.read()

            # Decode the bytes to text and write the text to ResponseData.txt
            with open('ResponseData.txt', 'w', encoding='utf-8') as response_file:
                response_file.write(data.decode('utf-8', errors='replace'))

            print("Data Successfully Saved!")

        except Exception as e:
            print(str(e))

def main():
    # URL we want to access
    url = 'https://www.google.com/search?q=test'
    # The user-agent stored in the headers dictionary
    headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.89 Safari/537.36'}
    post = ImprovedPost(url, headers)   # Creating the object
    post.improved_post_method()         # Calling the method

if __name__ == "__main__":
    main()

Output-

Data Successfully Saved!
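
To check what a Request object will send before it goes out, you can inspect it without touching the network. The sketch below is only an illustration: the shortened user-agent string and the form field are made up, and it also shows that a request becomes a POST only when a data argument is supplied.

```python
import urllib.parse
import urllib.request

url = "https://www.google.com/search?q=test"
# Shortened user-agent string, used here only for illustration.
headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"}

# Without a data argument, urllib sends a GET request.
get_request = urllib.request.Request(url, headers=headers)
print(get_request.get_method())
# Request stores header names capitalized, hence "User-agent".
print(get_request.get_header("User-agent"))

# With a bytes data argument, the same class sends a POST request.
# The form field below is made up for illustration.
form_data = urllib.parse.urlencode({"q": "test"}).encode("ascii")
post_request = urllib.request.Request(url, data=form_data, headers=headers)
print(post_request.get_method())
```

Nothing is sent until one of these objects is passed to urllib.request.urlopen().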





After running the program, a text file named ResponseData.txt is saved in the same location as the program. It contains the entire source code of the page at the URL we requested.
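
The response body arrives as bytes, and calling str() on bytes produces the b'...' representation rather than the page text. Below is a minimal sketch of decoding the bytes before saving them. The helper name decode_body is hypothetical, and UTF-8 is an assumed fallback for servers that do not declare a charset.

```python
def decode_body(data, charset=None):
    """Turn response bytes into text (hypothetical helper).

    charset would normally come from response.headers.get_content_charset();
    UTF-8 is assumed when the server does not declare one.
    """
    return data.decode(charset or "utf-8", errors="replace")


# Sample bytes standing in for response.read(), so no network is needed.
sample = "<title>café</title>".encode("utf-8")

print(str(sample))          # bytes representation, starts with b'
print(decode_body(sample))  # readable text
```

Writing the decoded text to a file opened with encoding='utf-8' keeps the saved page readable in a text editor.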

The headers dictionary has the user-agent stored in it. You can get your browser's user-agent using the following steps.
1. Open your browser and inspect the web page (right-click + Inspect).

2. Go to the Network section, reload the page if the list is empty, and click on the first request that occurred.

3. In the Headers section, scroll down to the request headers; near the end you will see the user-agent. Copy it and paste it into your program.


So with these steps, you can now deal with the 403 Forbidden error.

