PLOTTING IN PANDAS

In this tutorial we will learn about the built-in pandas plotting function, which uses matplotlib to draw various kinds of graphs from a Series or DataFrame.


Plotting in Pandas

Pandas can draw many types of plots through the matplotlib library, which specializes in visualizing data. The built-in plot method has the following syntax:

Syntax

df.plot(
    x=None,
    y=None,
    kind='line',
    ax=None,
    subplots=False,
    sharex=None,
    sharey=False,
    layout=None,
    figsize=None,
    use_index=True,
    title=None,
    grid=None,
    legend=True,
    style=None,
    logx=False,
    logy=False,
    loglog=False,
    xticks=None,
    yticks=None,
    xlim=None,
    ylim=None,
    rot=None,
    fontsize=None,
    colormap=None,
    table=False,
    yerr=None,
    xerr=None,
    secondary_y=False,
    sort_columns=False,
    **kwds,
)
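
As a rough illustration of how a few of these parameters combine, the sketch below draws a line plot from a small made-up DataFrame. The column names and values are invented for the example, and only x, y, kind, title, figsize, grid, and legend from the signature above are used.

```python
import pandas as pd

# Made-up data, purely for illustration
sales = pd.DataFrame({
    "month": [1, 2, 3, 4, 5, 6],
    "units": [12, 15, 11, 18, 21, 19],
})

# x and y pick the columns; the remaining keywords style the figure
ax = sales.plot(
    x="month",
    y="units",
    kind="line",
    title="Units sold per month",
    figsize=(6, 4),
    grid=True,
    legend=False,
)

# plot() returns a matplotlib Axes, so it can be adjusted further
ax.set_ylabel("units")
```

Because the call returns a matplotlib Axes object, any further styling can be done with ordinary matplotlib methods on ax.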





If you are using a Jupyter notebook, import pandas, NumPy, and matplotlib before you start plotting.
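
A typical setup cell might look like the following sketch. The aliases pd, np, and plt are common conventions rather than requirements, and the %matplotlib inline line is only needed in some older notebook environments, so it is left as a comment.

```python
# %matplotlib inline   # Jupyter magic; uncomment inside a notebook if plots do not appear

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

# Quick check that the libraries loaded
print(pd.__version__, np.__version__)
```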

Series Plotting in Pandas

We can plot a whole series with the Series.plot() method. This type of plot is used when you have one-dimensional data. By default it draws a line plot with the index on the x-axis. An example of Series.plot():

import pandas as pd
import numpy as np
s1 = pd.Series([1.1, 1.5, 3.4, 3.8, 5.3, 6.1, 6.7, 8])
s1.plot()
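
Because Series.plot() places the index on the x-axis, giving the series a meaningful index changes what that axis shows. The sketch below assumes made-up yearly values; the names temps and ax are only for illustration.

```python
import pandas as pd

# Hypothetical yearly readings; the index becomes the x-axis
temps = pd.Series(
    [14.1, 14.3, 14.2, 14.6, 14.8],
    index=[2018, 2019, 2020, 2021, 2022],
    name="mean temperature",
)

ax = temps.plot(title="Yearly mean temperature", grid=True)
ax.set_xlabel("year")
```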

Series Plotting in Pandas – Area Graph

We can also draw an area plot from a series with Series.plot.area(). Like the line plot, it is used for one-dimensional data, and it fills the region between the line and the x-axis. An example of a series area plot:

import pandas as pd
import numpy as np

series1 = pd.Series(np.random.rand(10))
series1.plot.area()
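
The same accessor exists on DataFrames, where the areas of the columns are stacked on top of each other by default. The sketch below uses made-up column names and random values; passing stacked=False overlays the areas instead.

```python
import numpy as np
import pandas as pd

# Random non-negative values; a seeded generator keeps the figure repeatable
rng = np.random.default_rng(0)
usage = pd.DataFrame(rng.random((10, 2)), columns=["cpu", "memory"])

# Default: each column is stacked on top of the previous one
ax_stacked = usage.plot.area(title="Stacked areas")

# stacked=False draws overlapping, semi-transparent areas
ax_overlaid = usage.plot.area(stacked=False, title="Overlaid areas")
```

Stacking suits columns that add up to a meaningful total, while the overlaid form makes it easier to compare the columns directly.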

Scatter Plotting in Pandas

import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

df = pd.DataFrame({'Name': ["Hira", "Smith", "Laura", "Alex"],
                   'Age': [23, 34, 21, 23],
                   'Gender': ['f', 'm', 'f', 'm'],
                   'State': ['California', 'Chicago', 'Florida', 'Texas'],
                   'Grades': [78, 90, 87, 71]})

df.plot(kind='scatter', x='Age', y='Grades')

Output: a scatter plot with Age on the x-axis and Grades on the y-axis.

Bar Plot

# Uses the df created in the scatter plot example above
df.plot(kind='bar', x='Name', y='Age')

Output: a bar chart with one bar per name, showing each person's age.

Pie Plotting in Pandas

A pie plot displays portions, or slices, of the data inside a circle. We can draw one with the pandas method DataFrame.plot.pie(), passing y to select a particular column. If no column name is provided, we pass subplots=True to draw each numerical column as its own pie.

import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.rand(4), index=['eating', 'sleeping', 'studying', 'working out'])
df.plot.pie(subplots=True)

Output: a pie chart with one slice per activity.
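
When the DataFrame has several columns, the y argument selects the one to draw. The sketch below assumes a made-up table of hours per activity; the column names weekday and weekend are hypothetical, and autopct is a matplotlib pie option passed through to label each slice with its percentage.

```python
import pandas as pd

# Hypothetical hours per activity, indexed by activity name
hours = pd.DataFrame(
    {"weekday": [2, 8, 6, 1], "weekend": [3, 9, 2, 2]},
    index=["eating", "sleeping", "studying", "working out"],
)

# y picks a single column; the index supplies the slice labels
ax = hours.plot.pie(y="weekday", autopct="%1.0f%%", legend=False, figsize=(5, 5))
```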

Box Plot

A box plot is a way of visually representing groups of numerical data through their quartiles. The box spans from the first quartile (Q1) to the third quartile (Q3), and a line inside the box marks the median. The whiskers at both ends of the box show the range of the data; by default they reach the most extreme values within 1.5 times the interquartile range of the box. Outliers are the points that lie beyond the whiskers.

import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.rand(10, 4), columns=['Oil', 'Gas', 'Diesel', 'Benzene'])
df.plot.box()
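
The numbers behind a box plot can be checked with DataFrame.quantile(). The sketch below recomputes the quartiles and the whisker limits for one column, assuming the default rule of 1.5 times the interquartile range; the seeded generator and the variable names are choices made for this example.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
df = pd.DataFrame(rng.random((10, 4)), columns=["Oil", "Gas", "Diesel", "Benzene"])

# Quartiles of one column: bottom of the box, median line, top of the box
q1, median, q3 = df["Oil"].quantile([0.25, 0.5, 0.75])
iqr = q3 - q1

# Values outside these limits would be drawn as outlier points
lower_limit = q1 - 1.5 * iqr
upper_limit = q3 + 1.5 * iqr
outliers = df["Oil"][(df["Oil"] < lower_limit) | (df["Oil"] > upper_limit)]

print(f"Q1={q1:.3f} median={median:.3f} Q3={q3:.3f} outliers={len(outliers)}")

ax = df.plot.box()
```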
