IMPORTING CSV IN PANDAS
In this tutorial, we will learn how to import CSV files in Python using pandas and how to work with the imported data.
Python is a popular choice for data analysis, largely because of the pandas library. Pandas can load a CSV file into a DataFrame with a single function call.
We can import CSV (comma-separated values) files with the pandas function read_csv. CSV is a common plain-text format for storing tabular data, including large datasets, so importing it is often the first step of an analysis.
The syntax for read_csv, as documented for older pandas releases, is shown below. Several of these parameters (for example squeeze, prefix, mangle_dupe_cols, error_bad_lines and warn_bad_lines) were removed in pandas 2.0, so check the documentation for your installed version.
pd.read_csv(filepath_or_buffer, sep=',', delimiter=None, header='infer', names=None, index_col=None, usecols=None, squeeze=False, prefix=None, mangle_dupe_cols=True, dtype=None, engine=None, converters=None, true_values=None, false_values=None, skipinitialspace=False, skiprows=None, nrows=None, na_values=None, keep_default_na=True, na_filter=True, verbose=False, skip_blank_lines=True, parse_dates=False, infer_datetime_format=False, keep_date_col=False, date_parser=None, dayfirst=False, iterator=False, chunksize=None, compression='infer', thousands=None, decimal='.', lineterminator=None, quotechar='"', quoting=0, escapechar=None, comment=None, encoding=None, dialect=None, tupleize_cols=None, error_bad_lines=True, warn_bad_lines=True, skipfooter=0, doublequote=True, delim_whitespace=False, low_memory=True, memory_map=False, float_precision=None)
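Most calls use only a handful of these parameters. The following is a minimal sketch, assuming a small semicolon-delimited dataset held in an in-memory string (the column names and values are invented for illustration). It shows sep, usecols and nrows working together.

```python
import io

import pandas as pd

# Hypothetical sample data; a real call would pass a file path instead.
raw = "Region;Country;Units\nAsia;India;10\nEurope;France;20\nAfrica;Kenya;30\n"

sample = pd.read_csv(
    io.StringIO(raw),             # any file-like object works in place of a path
    sep=";",                      # this data uses semicolons, not commas
    usecols=["Region", "Units"],  # keep only these two columns
    nrows=2,                      # read only the first two data rows
)
print(sample)
```

Every parameter that is not passed keeps its default value, which is why a plain pd.read_csv('filename.csv') is enough for a well-formed comma-separated file.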
Step 1: Make Sure You Know The Filepath
The first step in importing a CSV file with pandas is knowing where the file is stored. It may be stored on your personal computer, or it may be available on the internet as a URL ending in '.csv'. Capture the full path (or URL) of your CSV file before moving on. The examples in this tutorial use the placeholder name 'filename.csv'; replace it with your own path.
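One way to check a path before loading is sketched below. It assumes a throwaway file created in a temporary directory so that the example is self-contained; the file name sales.csv and its contents are invented for illustration.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Create a small hypothetical CSV file so the example can run anywhere.
tmp_dir = Path(tempfile.mkdtemp())
csv_path = tmp_dir / "sales.csv"
csv_path.write_text("Region,Country\nAsia,India\nEurope,France\n")

# Confirm the file exists before handing the path to read_csv.
if csv_path.is_file():
    frame = pd.read_csv(csv_path)  # read_csv accepts str and Path objects
    print(frame.shape)
```

A relative path such as 'filename.csv' is resolved against the current working directory, so a full path is the safer choice when you are unsure where your script runs.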
Step 2: Apply Code to Import CSV in Pandas
import pandas as pd
df = pd.read_csv('filename.csv')
df
Other Steps: You Can Choose Your Own Columns
Now suppose you want to keep only a subset of the columns in your CSV file when you import it. Say that, in the sample file we are using, only 3 columns are needed for the analysis. One way to achieve that is to pass the desired column names through the columns argument of pd.DataFrame after reading the file:
import pandas as pd
data = pd.read_csv('filename.csv')
df = pd.DataFrame(data, columns=['Region', 'Country', 'Total Profit'])
df
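An alternative, sketched here under the assumption that the file contains the three column names used above, is to let read_csv skip the unwanted columns while parsing, via its usecols parameter. The sample rows and the extra 'Item Type' column are invented for illustration.

```python
import io

import pandas as pd

# Hypothetical stand-in for the contents of 'filename.csv'.
csv_text = (
    "Region,Country,Item Type,Total Profit\n"
    "Asia,India,Fruits,100.5\n"
    "Europe,France,Clothes,200.0\n"
)

# Only the listed columns are loaded; 'Item Type' is never read into memory.
subset = pd.read_csv(
    io.StringIO(csv_text),
    usecols=["Region", "Country", "Total Profit"],
)
print(subset.columns.tolist())
```

With large files this can save memory, because the columns that are left out never become part of the DataFrame.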
You can also call the head() method to display the first 5 rows of the dataset loaded from the CSV file.
import pandas as pd
data = pd.read_csv('filename.csv')
data.head()
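head() also accepts the number of rows to return. The short sketch below uses invented data built in memory, so the column names id and value are assumptions rather than part of the sample file.

```python
import io

import pandas as pd

# Build ten hypothetical rows: id and its square.
csv_rows = "id,value\n" + "\n".join(f"{i},{i * i}" for i in range(10))
numbers = pd.read_csv(io.StringIO(csv_rows))

first_three = numbers.head(3)  # explicit row count
print(first_three)
print(len(numbers.head()))     # default row count is 5
```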