Using the Stats Command in Splunk to Bend Data to Your Will
Corero’s DDoS Analytics App for Splunk Enterprise leverages Splunk software for big data analytics and visualization capabilities that transform security event data into sophisticated dashboards. For those who use Splunk, this blog will explain some real-world, everyday uses of the application. As you read through the stats commands shown below, keep in mind that these commands are being done on created example data as actual Corero events are much more detailed.
How to Structure Splunk Data
When using Splunk, the key to showcasing your data or unearthing hidden correlations is understanding the stats command returned results, and molding those results to suit your needs. For example, Figure 1 below is a Splunk dashboard of some packet data. The data consists of 15 events. Depending on the how the stats command is used, different views of the same data can be visualized.
To simply count the events: stats count
This counts the events and gives a one row, one column answer of 15.
The stats command can count occurrences of a field in the events.
To count the events, count the events with a dip (destination IP) field, and count the events with a dprt (destination port) field: stats count count(dip) count(dprt)
Notice that the count(dprt) is one less, this is because one of the events does not have a dprt field (it is an ICMP packet). All the counts appear on the same row, this is important in future operations and when comparing data.
The stats command also allows counting by a field, when this is done a row is created for every distinct value of that field.
To count the number of events per dip: stats count by dip
There are four different IP addresses in the data set so four rows are created. If an event did not have a dip field, it would NOT be listed.
Multiple by fields can be used, each distinct combination will have a row. To count each dip and dprt combination: stats count by dip dprt
Notice that the dip 126.96.36.199 only has two entries, where in the preceding example it had three. This is because one of the events was ICMP and has no dprt. Any event that doesn’t have ALL of the by fields will not be shown.
Both examples on the bottom row of the figure are breakdowns by prot (protocol) and show the same numerical results.
Count the events by protocol using a by field (creating a row for each distinct protocol):
stats count by prot | replace 1 with icmp, 6 with tcp, 17 with udp in prot
The replace command is just to ease comparison and is not needed
Count the events by protocol using conditional counting (creating a column for each distinct protocol listed):
stats count(eval(prot=1)) as icmp count(eval(prot=6)) as tcp count(eval(prot=17)) as udp
While both are “correct”, in some cases data needs to be manipulated with evals and other commands and this can only happen when the data is in the same row.
The second example uses a conditional count; by using an eval in the count, only certain events are counted. This conditional counting must also be accompanied by the “as” command to rename the field created, because all three cannot use the same field name of count. In this case the protocol name was used. While this has some benefits, the downside is that the protocols must be listed by hand, unlike when using the “by” field. By using the correct stats command, preparing your data for further analysis or viewing becomes a lot easier.
For over a decade, Corero has been providing state-of-the-art, highly-effective, automatic DDoS protection solutions for enterprise, hosting and service provider customers around the world. Our SmartWall® DDoS mitigation solutions protect on-premise, cloud, virtual and hybrid environments. If you’d like to learn more, please contact us.