Splunk Data Ingestion: Splunk Tutorial
In this tutorial, we are going to learn about the Splunk Data Ingestion by going through the process of uploading the file successfully on Splunk and making it ready for analysis and visualisation.
What is Data Ingestion?
Data ingestion is a process through which data is transferred through from one point of source to another and from there on it can be stored and considered for for further analyzing. The data that is transferred during the process of data ingestion could be coming from any format like DBMS, RDBMS, files like CSVs etc. It is preferred to clean and munge the data to make sense upon analyzing it; otherwise the data won’t make sense.
Data Ingestion in Splunk
Data ingestion is possible in Splunk through the Add Data option that is available when the user clicks on ‘Explore Splunk
Enterprise’. Once you drag down the Explore Splunk Enterprise option; you will see the ‘Add Data’ option on second from the left. Once you select it, you can see that Splunk redirects you to a screen displaying three options to import data. These options are:
- Upload
- Monitor
- Forward
Upload
This option is used for importing the data from the our system, i.e. the files present in our computer.
Monitor
Let’s say that we want to use data from an external source like: website, app etc then we use this option.
Forward
Forward option is handy when we have to deal with the incoming data to visualize it and Splunk forwards that forwarded data.
Getting Data
To get data, we will use the ‘Upload’ feature out of the three options. You can also get data from the Splunk website, it’s free, you can find more about the free to use data here:
Similarly, if you are a Kaggle user then you can get some of the most intriguing datasets for analyzing and visualizing data. For this tutorial, we have downloaded the titanic.csv file that is uploaded as a ‘.zipped.’ file into Splunk.
The data downloaded from these sources will be a zipped file. We have to extract these files on the local system and then upload it to use. In our case, we are going to use titanic.csv file and upload it through multiple steps.
Select Source
Just click on the ‘Upload’ option and select ‘titanic.csv’ as your file by clicking on the Select File button as you can see below.
Source Type
After clicking on the Select File option, you will see the uploaded file and now you will be redirected to the Source Type, which will show us the uploaded file ‘titanic.csv’.
You can see above that the titanic.csv file is uploaded and reading for the analysis in Splunk. On the left hand side, you can see a button called Source Type which is set to csv as the system has already identified the file format itself, you can change the file format later on as well if you want to upload data of that format. Next you will go to Input Settings
Input Settings
Input Settings consist of additional settings that are further classified into:
- Host Field Value
- Index
Host Field Value
This is the name of your computer. It is a must to have the same name in this field as of your machine. So, make sure you have your computer’s name in there.
Index
This part refers to as how the data is stored inside the Splunk platform, if you want to keep it as usual then leave the index value as default.
After going through the input settings, you can go the next section which is Review.
Review
Review section shares the summary of the configuration and settings that we have done or changed so far, making it even more easier to see everything in a glimpse.
Once you read through everything, then you will hit the submit button and see a message saying: File has been uploaded successfully.
Boom! You have just successfully uploaded your 1st data in the Splunk platform. Now go on hitting and exploring all the different fields in your IDE and start searching and visualizing your data.
Now you are ready to explore your data, whether you analyze it or visualize it or do both, Splunk platform is ready to work with your data!