Introduction to the Wikipedia Module in Python
Wikipedia is a multilingual online encyclopedia created and maintained as an open collaboration project by a community of volunteer editors using a wiki-based editing system. wikipedia is also the name of a Python library that makes it easy to access and parse data from Wikipedia.
In this article, we will see how to use Python's Wikipedia module to fetch a variety of information from the Wikipedia website.
Installation:
To extract data from Wikipedia, we must first install the Python wikipedia library, which wraps the MediaWiki API that Wikipedia exposes. This can be done by entering the command below in your command prompt or terminal:
pip install wikipedia
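After running the command, it can be useful to confirm that the package is importable from the interpreter you plan to use. The snippet below is a minimal sketch of such a check using only the standard library; is_installed is a hypothetical helper name, not part of the wikipedia package.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module with this name can be found."""
    # find_spec returns None when no importable module has this name.
    return importlib.util.find_spec(module_name) is not None


if is_installed("wikipedia"):
    print("wikipedia is available")
else:
    print("wikipedia is missing; run: pip install wikipedia")
```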
Getting the summary of any title:
The summary of any title can be obtained by using the summary() method.
Syntax : wikipedia.summary(title, sentences)
Argument :
Title of the topic
Optional argument: sentences, the number of sentences to return.
Return : Returns the summary in string format.
Code:
# importing the module
import wikipedia
# fetching the summary for the given title
# sentences=2 limits the summary to two sentences
result = wikipedia.summary("India", sentences=2)
# printing the result
print(result)
Output:
India (Hindi: Bhārat), officially the Republic of India (Hindi: Bhārat Gaṇarājya), is a country in South Asia. It is the seventh-largest country by area, the second-most populous country, and the most populous democracy in the world.
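Not every title resolves cleanly: an ambiguous title raises DisambiguationError and a title with no matching page raises PageError, both defined in wikipedia.exceptions. The sketch below shows one possible way to wrap summary() so that these cases return a readable message instead of a traceback. The names safe_summary and wiki are illustrative assumptions, not part of the library, and the wiki parameter exists only so the helper can be tried with a stand-in object.

```python
def safe_summary(title, sentences=2, wiki=None):
    """Return a short summary, or a readable message if the lookup fails.

    safe_summary and the wiki parameter are hypothetical names used for
    illustration; they are not part of the wikipedia library.
    """
    if wiki is None:
        # Imported lazily so the helper can also be exercised with a stand-in.
        import wikipedia as wiki

    try:
        return wiki.summary(title, sentences=sentences)
    except wiki.exceptions.DisambiguationError as err:
        # The title matches several pages; list a few of the candidates.
        choices = ", ".join(err.options[:5])
        return f"'{title}' is ambiguous. Try one of: {choices}"
    except wiki.exceptions.PageError:
        # No page matched the title at all.
        return f"No Wikipedia page found for '{title}'."
```

Calling safe_summary("India") behaves like the example above, while an ambiguous title returns a list of candidate titles to choose from.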
Searching title and suggestions:
Titles and suggestions can be retrieved by using the search() method.
Syntax : wikipedia.search(title, results)
Argument :
Title of the topic
Optional argument : results, the maximum number of titles to return.
Return : Returns the list of titles.
Code:
# importing the module
import wikipedia
# getting suggestions
result = wikipedia.search("secret", results = 5)
# printing the result
print(result)
Output:
['secret', 'The secrets', 'secrets(disambiguation)', 'Book of secret', 'secret and lies']
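The search() function also accepts suggestion=True, in which case it returns a tuple of the title list and a suggested query (or None). The sketch below uses that to retry a misspelled query with the suggested spelling. The helper name search_titles and the wiki parameter are assumptions made for illustration, and the retry policy is just one reasonable choice.

```python
def search_titles(query, results=5, wiki=None):
    """Search for titles, retrying once with the suggested spelling.

    Returns a tuple (query_used, titles). search_titles and the wiki
    parameter are hypothetical names, not part of the wikipedia library.
    """
    if wiki is None:
        # Imported lazily so the helper can also be exercised with a stand-in.
        import wikipedia as wiki

    # With suggestion=True, search() returns (titles, suggestion_or_None).
    titles, suggestion = wiki.search(query, results=results, suggestion=True)
    if not titles and suggestion:
        # Nothing matched the original query; try the suggested one instead.
        return suggestion, wiki.search(suggestion, results=results)
    return query, titles
```

For example, search_titles("secret") returns the original query together with up to five matching titles.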
Retrieving Full Wikipedia Page Data
In order to get the contents, categories, coordinates, images, links and other metadata of a Wikipedia page, we must first get the Wikipedia page object or the page ID for the page. To do this, the page() method is used with the page title passed as an argument to the method.
Look at the following example:
wikipedia.page("ubuntu")
This method call will return a WikipediaPage object, which we'll explore more in the next few sections.
To get the complete plain text content of a Wikipedia page (excluding images, tables, etc.), we can use the content attribute of the page object.
print(wikipedia.page("Python").content)
Output:
Python is an interpreted, high-level, general-purpose programming language. Created by Guido van Rossum and first released in 1991, Python's design philosophy emphasizes code readability with its notable use of significant whitespace. Its language constructs and object-oriented approach aim to help programmers write clear code.
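Besides content, a WikipediaPage object exposes attributes such as title, url, links, categories and images. The sketch below gathers a few of them into a dictionary so they can be inspected at a glance. The helper name page_overview, the max_items parameter and the wiki parameter are illustrative assumptions rather than library features.

```python
def page_overview(title, max_items=5, wiki=None):
    """Collect a few metadata fields from a Wikipedia page into a dict.

    page_overview, max_items and wiki are hypothetical names used for
    illustration; they are not part of the wikipedia library.
    """
    if wiki is None:
        # Imported lazily so the helper can also be exercised with a stand-in.
        import wikipedia as wiki

    page = wiki.page(title)
    return {
        "title": page.title,
        "url": page.url,
        # Length of the plain-text article body, in characters.
        "content_length": len(page.content),
        # Lists can be long, so keep only the first few entries of each.
        "links": page.links[:max_items],
        "categories": page.categories[:max_items],
        "images": page.images[:max_items],
    }
```

Calling page_overview("ubuntu") would fetch the page once and return the selected fields; the lists are truncated only to keep the printed result short.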