Python Pandas Introduction


In computer programming, pandas is a software library written for the Python programming language for data manipulation and analysis. According to the Wikipedia page on Pandas, “the name is derived from the term ‘panel data’, an econometrics term for multidimensional structured data sets”. Pandas is a high-level data manipulation tool developed by Wes McKinney. It is free, open source software released under a BSD license. It is built on the NumPy package, and its key data structure is the DataFrame.

DataFrames make manipulating your data easy, from selecting or replacing columns and indices to reshaping your data. They let you store and manipulate tabular data in rows of observations and columns of variables. There are several ways to create a DataFrame:

  • Converting a Python list, dictionary, or NumPy array to a Pandas DataFrame.
  • Opening a local file using Pandas, usually a CSV file, a JSON file, or an Excel sheet.
  • Reading a remote file, such as a CSV or JSON file on a website, through a URL, or reading from a SQL table or database.
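
Before turning to creation, here is a minimal sketch of the kinds of manipulation mentioned above: selecting a column, replacing the index, replacing a column's values, and reshaping. The sample rows and the variable names (`capitals`, `indexed`, `long_form`) are assumptions made for illustration only.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration
df = pd.DataFrame({
    "country": ["Brazil", "India"],
    "capital": ["Brasilia", "New Delhi"],
    "population": [200.4, 1252.0],
})

# Select a single column (the result is a pandas Series)
capitals = df["capital"]

# Replace the default integer index with the country names
indexed = df.set_index("country")

# Replace a column with new values (here: rounded populations)
indexed["population"] = indexed["population"].round()

# Reshape from wide to long: one row per (country, variable) pair
long_form = df.melt(id_vars="country", var_name="variable", value_name="value")

print(indexed)
print(long_form)
```

Note that `set_index` and `melt` return new DataFrames, so the original `df` is left unchanged.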

For example:

  • Dictionary to DataFrame:

#!/usr/bin/env python3

# Import pandas library
import pandas as pd

# Dictionary
dictionary = {
    "country": ["Brazil", "Russia", "India", "China", "South Africa"],
    "capital": ["Brasilia", "Moscow", "New Delhi", "Beijing", "Pretoria"],
    "population": [200.4, 143.5, 1252, 1357, 52.98],
}

# Converting Dictionary to DataFrame and saving it in a variable df
df = pd.DataFrame(dictionary)

# Print df
print(df)
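
The same constructor also accepts a list or a NumPy array, the other two inputs named in the first bullet above. The following is a small sketch; the sample values and the column names are assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd

# A list of rows; the column names are supplied separately
rows = [["Brazil", "Brasilia"], ["India", "New Delhi"]]
df_from_list = pd.DataFrame(rows, columns=["country", "capital"])

# A 2-D NumPy array; each array column becomes a DataFrame column
values = np.array([[1, 2], [3, 4], [5, 6]])
df_from_array = pd.DataFrame(values, columns=["a", "b"])

print(df_from_list)
print(df_from_array)
```

Unlike a dictionary, a list or array carries no column names of its own, so they are passed through the `columns` argument. If it is omitted, Pandas numbers the columns from 0.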

  • CSV to DataFrame:

#!/usr/bin/env python3

# Import pandas as pd
import pandas as pd

# Import the country-codes.csv
df = pd.read_csv('country-codes.csv')

# Print out df
print(df)
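
The other sources listed earlier, JSON and SQL, follow the same pattern. The sketch below is self-contained rather than tied to real files: the JSON text, the in-memory SQLite database, and the `countries` table are all assumptions made for illustration. For remote files, `pd.read_csv` also accepts a URL string in place of a local path.

```python
import io
import sqlite3

import pandas as pd

# JSON: read_json accepts a path, a URL, or a file-like object.
# A small in-memory JSON document stands in for a real file here.
json_text = '[{"country": "Brazil", "code": "BR"}, {"country": "India", "code": "IN"}]'
df_json = pd.read_json(io.StringIO(json_text))

# SQL: build a throwaway in-memory SQLite table (hypothetical schema),
# then load the result of a query into a DataFrame.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE countries (country TEXT, code TEXT)")
conn.executemany(
    "INSERT INTO countries VALUES (?, ?)",
    [("Brazil", "BR"), ("India", "IN")],
)
df_sql = pd.read_sql_query(
    "SELECT country, code FROM countries ORDER BY country", conn
)
conn.close()

print(df_json)
print(df_sql)
```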