SPLK-1003 Splunk Enterprise Certified Admin – Splunk Post Installation Activities: Knowledge Objects
- Uploading Data to Splunk
In this module we will look at post-installation activities, that is, the configuration steps carried out in Splunk. Throughout this module we'll be using three Splunk components: an indexer and a search head hosted on Amazon AWS, and a Universal Forwarder that is part of our local installation, which simulates the real-world experience of sending logs from your local PC to cloud-hosted search and indexing instances. We will cover the most common and important activities carried out by a Splunk admin or architect: getting data into Splunk; configuring source types and sources; index creation; field extraction, which is one of the most important parts of Splunk; and other knowledge objects such as tags, event types, lookups, and macros. At the end we will create some sample alerts, reports, and dashboards.
Now let us jump right into our first topic of discussion: getting data inside Splunk. We'll be using the tutorial data that Splunk provides for basic search practice, tutorialdata.zip. You can click the link specified and it will take you directly to the page where you can download tutorialdata.zip and prices.csv.zip. Once you have downloaded these two files, you can extract them and upload them as individual files, or extract them and feed them through a forwarder, which makes a better lab exercise once you have the free lab access included with this course. Here we will upload the zip file directly to our Splunk instance, that is the search head, and analyze how the data is indexed automatically by Splunk. To upload this data, first let us log in to our search head. This is our search head.
We have logged in. To upload the data we just downloaded, go to Settings, click on Add Data, and then click on Upload. We have a zip file, and we can follow the on-screen instructions to finish uploading the tutorial data. I'll select the file we downloaded recently, tutorialdata.zip. Note that this method allows a maximum upload of 500 MB, so it is not recommended for indexing data across your organization; we can't practically dump every data source and upload it file by file to Splunk. But it is useful for everyday troubleshooting, and for verifying field extractions and how Splunk breaks the events. Let us follow the on-screen instructions. I selected tutorialdata.zip and clicked Next. I'll let Splunk determine the Source Type automatically, and I'll set the host field manually, for example to manual_upload.
This is just for our understanding; you can give whatever host value you like, because we are only uploading this data, not collecting it. For the index, I could keep the default, which is the main index, but instead let us create a test index. We saw how to create indexes previously, so I'll create a test index and save it, so that whatever data we upload using this method goes into the test index, where we will check the uploaded data for event parsing and field extraction, and see which fields need our intervention to parse successfully. For the host you can mention a constant value; you can also use a regular expression on the path, in case the host value is present as part of your log path; or, if the host name is a segment of the complete location, say under C:\Downloads, you can mention which segment to use. For example, if we consider the Downloads path shown here, the segment number would be three. For now, let us keep it as a constant value, since we are uploading manually for testing, not as a recommended practice. We'll see later how to get the same logs through agents, that is the universal forwarder, and other methods. Click on Review. Yes, everything seems fine.
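For reference, the same three host-setting options exist in inputs.conf when you later onboard data through a forwarder. A minimal sketch with a hypothetical path, where you would pick exactly one of the three settings per stanza:

    [monitor://C:\Users\<user>\Downloads\tutorialdata]
    # Option 1: a constant host value
    host = manual_upload
    # Option 2: extract the host from the file path with a regular expression
    # host_regex = <regex>
    # Option 3: use the third segment of the path as the host
    # host_segment = 3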
Click on Submit. Once we click Submit, the file is uploaded to our search head. After uploading, click Start Searching. You will see that the data is searchable almost instantaneously: we uploaded it only seconds ago, and already it has been indexed by Splunk, which has understood and extracted a good amount of information from the logs we uploaded as part of the zip file. You can see the zip file listed as our source.
Inside the zip file, these are the files it found and indexed successfully, and it has identified three source types: secure, which contains login (SSH) attempt logs; access, which is web server logs; and application logs named vendor_sales. We'll see how this has been broken down into pieces. And always remember: uploading data this way should only be used for learning purposes and for small amounts of data, less than 500 MB. When the data is large and continuous, as in typical real-world scenarios, we use the universal forwarder. Now we have successfully uploaded our test data.
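As a quick sanity check after the upload, a search along these lines (assuming the test index name we created above, and all time because the tutorial data is backdated) shows what Splunk indexed, broken down by source and source type:

    index=test earliest=0
    | stats count by source, sourcetype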
- Adding Data to Splunk via Configuration File Edit
Now that we have uploaded the file successfully, we'll see how to collect logs from our universal forwarder, which is how it is typically done in most organizations. We will either edit the inputs.conf file or add the inputs using the Splunk CLI; we'll see both methods. In the first method, we'll edit the configuration file to specify the location of the folder or file that the Splunk universal forwarder should continuously monitor and collect logs from. For that we will be using our local laptop's universal forwarder. I've already extracted the data; where are my downloads?
Yes, these are my downloads, and I've already extracted tutorialdata.zip into this folder, which I'll be monitoring on this laptop for continuous changes. Now let us see how we can add this data using the inputs.conf file. First let us go to inputs.conf on our universal forwarder: C:\Program Files\SplunkUniversalForwarder is our Splunk home, then etc\system\local. Here we already have an inputs.conf file, and we'll edit it. The syntax is a monitor stanza followed by the path we need to monitor continuously for changes. I need to open this file as an admin to edit it; let me quickly do that. Here we will add the complete path.
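A minimal sketch of the resulting monitor stanza, assuming the extracted tutorial data sits in the Downloads folder (the exact path will differ on your machine):

    [monitor://C:\Users\<user>\Downloads\tutorialdata]
    disabled = false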
Yes, now we have added this path under a monitor stanza; we'll see the other parameters at a later stage. Since we have edited the configuration, let's go ahead and restart our Splunk universal forwarder, using splunk.exe restart. Once it has restarted successfully, we should see these logs on our search head; it should not take more than a minute or so. We also know which host the logs come from: let me search across all indexes, filtering on the host defined in our inputs configuration, for the last 24 hours. As you can see, we have about 1,500 events in the last 24 hours, mostly Windows event logs from the default inputs created during installation of the universal forwarder. Let me change the time range to all time, since the tutorial data is backdated, so that we can see all of it. As you can see, we now have the different source types: access log, secure log, vendor sales, all the logs in our tutorial data location.
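The verification search looks roughly like this, with the placeholder standing in for whatever host value your forwarder reports:

    index=* host="<your-forwarder-host>" earliest=0
    | stats count by sourcetype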
- Adding Data to Splunk via Splunk CLI
As you can see, the source now shows the complete path we configured in the monitor stanza, up to the log file. By default the source always contains the complete log path from which the data was fed. You can rename it; in further tutorials we'll see how to set the source, source type, and other information, and how to override them to add more value to your Splunk installation. So far we have seen adding data with the universal forwarder by editing the configuration. Now let us see how we can add the same configuration using the CLI. This process is essentially the same on both Windows and Linux. I'm in the bin directory of my installation, and I'll use splunk.exe add monitor with the same directory we monitored before; let me copy that. For testing purposes, I'll add tutorialdata1 at the end, so that a new configuration is written on the Splunk universal forwarder and we can see how it gets added. It asks for my Splunk universal forwarder credentials, and then it says the path does not exist: it has identified that there is no such path as tutorialdata1. So let us create one. I created the path; let me try to add it again. This time it says it has successfully added the monitor path tutorialdata1. Once it is added, you should be able to see an entry in inputs.conf similar to the earlier one, clearly mentioning tutorialdata1. Let us now check whether we are getting our tutorialdata1 data; that should not be an issue. These are the practices used for adding data in large organizations: the CLI method, or editing the configuration files directly, that is inputs.conf.
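Putting the CLI steps together, a sketch of the commands on Windows (paths and names are illustrative) and the stanza they produce:

    cd "C:\Program Files\SplunkUniversalForwarder\bin"
    .\splunk.exe add monitor "C:\Users\<user>\Downloads\tutorialdata1"

    # splunk add monitor appends a stanza like this to etc\system\local\inputs.conf:
    [monitor://C:\Users\<user>\Downloads\tutorialdata1]

On Linux the equivalent is ./splunk add monitor /path/to/tutorialdata1 from $SPLUNK_HOME/bin.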
- Validation of Onboarded Data
We have now learned two methods of getting data into Splunk. The first is via the web: directly uploading a package smaller than 500 MB. The second is using the universal forwarder, where we saw two approaches: editing the configuration file, and using the CLI. A third method is deploying the new data-collection configuration from your deployment server; we'll see this soon, in the next module. With that method, the deployment server pushes the inputs.conf configuration we edited to our universal forwarder, without our logging in to the universal forwarder machine.
Keep that in mind; it will be useful during our upcoming discussion of the deployment server. Now that we have the data in Splunk, let us verify it using some basic search operations. This is our data; we have all of it inside Splunk. Let me search for the access log source type with a wildcard, that is access*. I'm currently searching all indexes, but we effectively have only the default main index, plus the test index we used for the upload. Throughout this tutorial we'll use either the main index or other custom indexes such as windows and linux. I've written three filters here on the search bar.
That is: the index, which I'll narrow down to main instead of a wildcard; the universal forwarder PC name as the host; and the source type of the logs I want to access. As I can see, there is a lot of information in these access logs: the source IP, the time, and more. Make sure all these events are parsed properly, so that when you look at an event you can break it down into its respective fields.
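The three filters together look something like this, with the placeholder standing in for your forwarder's PC name:

    index=main host="<forwarder-pc-name>" sourcetype=access*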
If we are not able to break an event down into its respective fields, the parsing is not correct and we need to work on it. As a simple example, we can use the iplocation command on the IP address field, which here is clientip, to find the locations our customers or visitors are browsing our site from, and then use top Country. Using just the IP address, we can determine the country from which they access our sites. As you can see, about 26% of the overall traffic comes from the United States, and one to two percent comes from Germany and Brazil. You can also quickly visualize this with whichever of the recommended chart types you choose and present it. This confirms that we can parse these logs without any issues, and that all our information is inside Splunk via the universal forwarder.
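A sketch of that search, assuming the client IP was extracted into the clientip field as it is for the standard tutorial access logs:

    index=main sourcetype=access*
    | iplocation clientip
    | top Country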
- Source, Sourcetype, and Host Configuration
Now that we have the data inside Splunk, let's see how we can modify some of the configuration to make our analysis of the data more meaningful and efficient. The next step after getting data in is to enrich it with meaningful information, rather than plain old host names and IP addresses. The first field among our default selected fields is the source type. If you are not sure what selected fields are, I recommend revisiting our first chapter to understand the default or selected fields, and also the interesting fields. The source type can be renamed using inputs.conf at the time of deploying the configuration, or at the time of parsing the data at the indexer level. The source type field always holds information about the technology behind the logs. Let me show you an example with index=main.
Or let me search all the logs for the last 24 hours and look at the source types. The source type field holds information about the technology of the logs, or the application that generated them, and source types are used extensively for filtering during search. If I search using a source type such as Perfmon: CPU Load, it acts as a filter, and not just for these logs: you will use this throughout your Splunk experience to filter logs extensively. This is one of the important fields that gives more meaning to your data when configured, and it is also one of the default fields that should be present for any data fed to Splunk.
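For example, a filter like the following (the exact Perfmon sourcetype name depends on how the Windows performance input stanza was named on your forwarder):

    index=* sourcetype="Perfmon:CPU Load"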
We will do a small lab exercise so that you get a clear picture of how to set or rename the source type and then use that information for searching or narrowing down specific logs. Since we have a universal forwarder installed on our local PC, let us go to inputs.conf. I'm in the Splunk home, C:\Program Files\SplunkUniversalForwarder, then etc\system\local. Let me open the inputs.conf file, which has the couple of entries we added earlier.
I'll just remove those. Now, under the monitor stanza, I can add host = <new host value>, sourcetype = <new source type>, and source = <new_source>; this is how you typically define the new values. Let's say I set the host to my universal forwarder laptop: instead of a generic name like Arum Kumar PC, it can say it is my universal forwarder installed on the laptop, which makes much more sense. Similarly, instead of a generic source type, it can say the data consists of the tutorial data we uploaded, which is more meaningful. And the source will be my Downloads location. This adds meaning and removes confusion from your data, so that you can analyze it much better.
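A sketch of the updated stanza, with illustrative values for the overrides (the monitor path is hypothetical):

    [monitor://C:\Users\<user>\Downloads\tutorialdata]
    host = my_universal_forwarder_laptop
    sourcetype = tutorial_data
    source = my_downloads_tutorialdata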
So let us save this configuration and restart our universal forwarder. Once it has restarted, let me make some changes so that our continuous monitor picks up something new: under the access log I'll copy and paste a few lines, so that the timestamps vary and we get new entries. I'll search over all time, because our access logs are backdated. Since we can see only the top ten values here, and we only have a couple of new events, let me instead look at the rare source types. As you can see, our newly added source type, tutorial data, popped up. To narrow this down we can filter on host equal to the new host value we set.
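The rare command surfaces the least common values instead of the top ten; a sketch over all time:

    index=* earliest=0
    | rare limit=10 sourcetype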
As you can see, we have the eight events, the duplicate values we copy-pasted, and they were successfully renamed with the new tutorial data source type, source, and host name. Instead of the complete file location, this makes much more sense. Similarly for the source type: having just access log or secure log is enough for Splunk to parse the logs, but for an analyst searching in Splunk it is much more useful to provide specific terms that any user accessing Splunk can understand. This configuration can also be deployed from our deployment server; we will soon see how to deploy it that way.