DATA STRUCTURE IN PANDAS
In this tutorial, we will learn about data structures in Pandas. The three main data structures in Pandas are Series, DataFrame and Panel.
What are Data Structures in Pandas?
One of the most important things in Pandas is understanding its data structures. Once you have mastered them, it becomes clear how Series, DataFrame and Panel differ from one another. Pandas groups its data structures by the dimensionality of the data they hold. These data structures are:
- Series
- DataFrame
- Panel
Data Structure | Dimensions |
Series | 1D |
DataFrame | 2D |
Panel | 3D |
Series and DataFrame are the most widely used of these data structures in data science work. In spreadsheet terms, a Series is a single column of an Excel sheet, whereas a DataFrame has rows and columns and corresponds to a whole sheet. A Panel is like a group of sheets, each of which is a DataFrame.
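To make the spreadsheet analogy concrete, here is a minimal sketch. The variable names and sample values are made up for illustration; it only shows that selecting one column of a DataFrame gives back a Series, and that the two differ in their number of dimensions.

```python
import pandas as pd

# A DataFrame plays the role of a sheet: labelled rows and columns.
# (Sample data is invented for this illustration.)
sheet = pd.DataFrame({"name": ["Ann", "Bob"], "age": [31, 45]})

# Selecting one column yields a Series, the "single column" of the sheet.
column = sheet["age"]

print(type(column).__name__)    # Series
print(column.ndim, sheet.ndim)  # 1 2
```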
Series Data Structure in Pandas
As we have learned, a Series is a one-dimensional data structure that can store any type of data, such as strings, integers, floats or Python objects. A Series has just one axis, like a single column, and the labels along that axis are called the index of the Series.
The syntax of Series is:
pandas.Series(data, index, dtype, name, copy)
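As a small sketch of how these parameters fit together, the example below builds a Series from a list with explicit labels. The values, labels and the variable name `prices` are assumptions chosen for illustration.

```python
import pandas as pd

# data: the values; index: one label per value; dtype: the element type.
# (Values and labels are invented for this illustration.)
prices = pd.Series(data=[1.5, 2.0, 3.25], index=["a", "b", "c"], dtype="float64")

print(prices["b"])         # 2.0  (lookup by index label)
print(list(prices.index))  # ['a', 'b', 'c']
```

If `index` is omitted, pandas assigns the default integer labels 0, 1, 2 and so on.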
DataFrame Data Structure in Pandas
A DataFrame in pandas goes one step beyond a Series, which is only one-dimensional. A DataFrame is a 2D data structure with two labelled axes: rows and columns. To create a DataFrame, we always work with three main aspects:
- Data (the source used to populate the DataFrame)
- Rows (the horizontal axis, labelled by the index)
- Columns (the vertical axis, labelled by column names)
The syntax of DataFrame is:
pandas.DataFrame(data, index, columns, dtype, copy)
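The three aspects listed above map directly onto the constructor arguments. The following is a minimal sketch with invented sample data; `scores` and the labels are hypothetical names used only for illustration.

```python
import pandas as pd

scores = pd.DataFrame(
    data=[[90, 85], [72, 88]],    # data: one inner list per row
    index=["alice", "bob"],       # row labels
    columns=["math", "physics"],  # column labels
)

print(scores.shape)                  # (2, 2)
print(scores.loc["bob", "physics"])  # 88
```

Here `.loc` looks a value up by its row label and column label, which is why both axes are described as labelled.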
Panel in Pandas
Panel in pandas is used for working with 3-dimensional data. It is rarely used in real-world work. However, if you have a set of DataFrames and want to analyze them all in one go, a Panel is one way to group them together.
The syntax of panel is:
pandas.Panel(data, items, major_axis, minor_axis, dtype, copy)
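Note that Panel was deprecated in pandas 0.20.0 and removed in pandas 0.25.0, so the syntax above only works in older releases. You can read more about the deprecation of Panel in the pandas release notes.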
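Since Panel is not available in recent pandas releases, a commonly suggested alternative for 3-dimensional data is a DataFrame with a MultiIndex. The sketch below is one possible approach, not the only one: it stacks two small DataFrames with `pandas.concat`, and the names `q1`, `q2`, `quarter` and `region` as well as the data are invented for illustration.

```python
import pandas as pd

# Two "sheets" with the same layout (sample data is invented).
q1 = pd.DataFrame({"sales": [10, 20]}, index=["north", "south"])
q2 = pd.DataFrame({"sales": [15, 25]}, index=["north", "south"])

# Stack the frames; the dict keys become the outer level of a MultiIndex,
# which acts as the third dimension a Panel would have provided.
combined = pd.concat({"q1": q1, "q2": q2}, names=["quarter", "region"])

# Look up one value by (quarter, region) and column.
print(combined.loc[("q2", "south"), "sales"])  # 25

# Analyze all frames in one go: total sales per quarter.
totals = combined.groupby(level="quarter")["sales"].sum()
print(totals["q2"])  # 40
```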