How To Splunk-IT: Basic Searching

Splunk is software mainly used for searching, monitoring, and analyzing machine-generated big data through a web-style interface. It captures, indexes, and correlates real-time data in a searchable repository, from which it can produce graphs, reports, alerts, dashboards, and visualizations.

One of the most important concepts in Splunk is searching. Data can get big real fast, and we are almost always looking for something specific in that pile of data. That’s why it’s extremely important to be able to find what you’re looking for.

However, to understand searching, there are a few more concepts we first have to make clear.

How does Splunk work?

Behind the whole Splunk machinery there are 3 soldiers carrying the load, known as Forwarders, Indexers, and Search Heads.

1. Forwarders collect data and forward it to another Splunk instance called an Indexer.

Forwarders are actually just one way to get data into Splunk. There are two more: uploading data, and monitoring data from a system.

Getting Data In

2. Indexers are where the data is stored. An indexer categorizes data and applies metadata to it. It takes raw data from forwarders, turns it into events, and places those events into an index, which is stored in buckets.

3. Search Heads are used to access the data stored in the Indexers and let you analyze and visualize it. They are basically the tool that lets you interact with your data.

That said, one of the most important places to be in Splunk is the Search and Reporting App, an interface that looks like this:

Search and Reporting Application

Once you find yourself in this place, with all your data in, you can start splunking and doing #basic_searches or #advanced_searches. The idea, however, is to narrow that huge amount of data down to exactly the data that you need.

Searching in Splunk

To do any kind of search in Splunk, you need to be familiar with the structure of a command that you type in the search bar. The basic structure of a Splunk search is:

Search Structure
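
As a rough sketch of that structure: a search starts with search terms, and each pipe hands the results to a command, which can take functions, arguments, and clauses. The index, sourcetype, and field names below are assumptions chosen for illustration, not taken from the dataset used later in this article.

```
index=main sourcetype=access_combined status=404
| stats count AS not_found BY host
| sort -not_found
```

Here `index=main sourcetype=access_combined status=404` are the search terms, `stats` and `sort` are commands, `count` is a function, `AS not_found` names the result, and `BY host` is a clause that groups it.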

Splunk automatically assigns metadata to the entire source unless you specify what the metadata should be. The default assignments are as follows:

Metadata Information
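
One way to check which metadata values were actually assigned is to count events per host, source, and sourcetype. This is a minimal sketch that assumes the data landed in the `main` index; adjust the index name if yours differs.

```
index=main
| stats count BY host, source, sourcetype
```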

Besides that, the basic things you need to know to construct searches are:

The Splunk vocabulary, which consists of:
keywords - predefined words
phrases - multiple keywords
fields - key-value pairs
wildcards - words with an asterisk (*) that allow you to find something even when you don't really know what you're looking for
booleans - AND, OR, NOT
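
To make the vocabulary concrete, here is a sketch with one search per line, in the same order as the list: a keyword, a phrase, a field, a wildcard, and booleans. Each line is a separate search, not one pipeline. The keyword `crash` and the phrase `"rear end"` are hypothetical; the host and field names are borrowed from the examples later in this article.

```
host="incident-crashes" crash
host="incident-crashes" "rear end"
host="incident-crashes" Direction="North"
host="incident-crashes" Municipality="CHE*"
host="incident-crashes" (Direction="North" OR Direction="South") NOT Municipality="CHE*"
```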

The most common commands that are used are:
chart/timechart - tabular output for charting
rename - renames a specific field
sort - sorts results by specific field
stats - provides statistics
eval - calculates an expression
dedup - removes duplicates
table - builds a table with specific fields
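
The commands above could be combined roughly as follows. These are three separate searches, separated by blank lines, and they are only a sketch: the threshold of 100 and the names `crashes`, `volume`, and `Town` are made up for illustration, and the host and field names are assumed from the examples later in this article.

```
host="incident-crashes" Municipality=*
| stats count AS crashes BY Municipality
| eval volume=if(crashes > 100, "high", "normal")
| rename Municipality AS Town
| sort -crashes
| table Town, crashes, volume

host="incident-crashes"
| timechart span=1d count BY Direction

host="incident-crashes"
| dedup Municipality
| table Municipality
```

The first search counts crashes per municipality, labels each count, renames the field, sorts in descending order, and shows the result as a table. The second charts daily event counts per direction, and the third keeps one event per municipality.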

Now, knowing this, here are a few examples to give you an idea of what on earth I’ve been talking about so far.

I am using a publicly available dataset that contains data from car crashes, which I have imported into Splunk.

Example #1: Let’s take the data from the “incident-crashes” instance and calculate, by Municipality, how many drivers were under the influence of a substance when an incident happened.

Basic Search Example
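
A search matching this description might look like the sketch below. The field name `Driver_Substance` and the value `"Yes"` are hypothetical, since the exact field used in the dataset is not spelled out in the text; substitute whatever your data calls it.

```
host="incident-crashes" Municipality=*
| stats count(eval(Driver_Substance="Yes")) AS drivers_under_substance BY Municipality
```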

Broken down, the parts of the search mean the following:

host="incident-crashes" - the forwarder from which we want to take data is named "incident-crashes"
Municipality=* - I want all Municipalities included
| - this is just the pipe operator, which says: take the output from the left side of the pipe and treat it as input on the right side
stats - this is the command for creating statistics from the data; here we said, give me the number of drivers that were under the influence of a substance for each municipality.

Example #2: Here we search data from the incident-crashes source where the field Direction is either South or North, the Municipality starts with CHE (CHE*), and the driver was At Fault for the incident. After taking that data, we create a table with _time (a default field in Splunk) as the first column, followed by Direction and Municipality, and then format the time a little better using the eval command.
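
A sketch of what such a search could look like is shown below. The field name `At_Fault`, the value `"DRIVER"`, the new field name `formatted_time`, and the time format are all assumptions for illustration; the real field names and values depend on how the dataset was imported.

```
host="incident-crashes" (Direction="South" OR Direction="North") Municipality="CHE*" At_Fault="DRIVER"
| table _time, Direction, Municipality
| eval formatted_time=strftime(_time, "%Y-%m-%d %H:%M:%S")
```

In this version the formatted time is written to a new field, so it shows up as an extra column at the end of the table rather than replacing _time.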

Of course, there is a lot more to searching in Splunk. These are just basic searches, so to take things further, feel free to experiment with different datasets and construct lots of queries to master the search process.

More information about Splunk can be found here.

soc analyst serving tech bites as articles.