Bin command versus placing _time span in tstats

Corky_
New Member

Hello,

I wish to know the functional difference (if any) between the following:

| tstats count FROM datamodel=Endpoint.Processes where Processes.user=SYSTEM by _time span=1h Processes.dest ...

And

| tstats count FROM datamodel=Endpoint.Processes where Processes.user=SYSTEM by Processes.dest ...
| bin _time span=1h

I understand what bin does, and that "| bin" would always be used for a non-tstats search, but within tstats is there any reason to place the "span" within the "by", or is it just cleaner/slightly faster?

Thanks in advance!

livehybrid
Super Champion

Hi @Corky_ 

Regarding the first option, where the span follows _time and comes before the other fields in the "BY" of your tstats command: I personally prefer to put the span at the end rather than in the middle of the by list, to keep it cleaner and so that it is not confused with a field. The tstats docs also suggest it should be at the end:

[ BY (<field-list> | (PREFIX(<field>))) [span=<timespan>]]
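
Applied to the search from the question, that end-of-list placement might look like the sketch below. Only Processes.dest is shown as a split field, since the rest of the original by list was elided.

```spl
| tstats count FROM datamodel=Endpoint.Processes where Processes.user=SYSTEM by Processes.dest _time span=1h
```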

As for the second query, I'm confused as to how you could bin by _time with tstats if you haven't specified _time in the by clause initially. If you do not split by _time in the initial part of the query, then the _time field won't be available to the bin command.
FWIW, I find the bin command good for doing stats by multiple fields over _time when you cannot do it with timechart.
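
As a rough sketch of that pattern on a regular (non-tstats) search, the following buckets events into hourly slots and then counts by two fields per slot, which timechart cannot do directly because it splits by a single field. It assumes the _internal index is searchable by your role; host and sourcetype are default fields.

```spl
index=_internal
| bin _time span=1h
| stats count by _time host sourcetype
```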

PickleRick
SplunkTrust

If you do tstats by time without binning and then do bin, you'll have to stats again to summarise your data. Bin on its own doesn't aggregate data, just aligns the field into discrete points.
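
A sketch of what that looks like, assuming you start from a finer tstats split (1m here, chosen only for illustration) and want hourly totals. The bin step only rounds each _time value down to the hour, so the final stats is what actually merges the rows:

```spl
| tstats count FROM datamodel=Endpoint.Processes where Processes.user=SYSTEM by _time span=1m Processes.dest
| bin _time span=1h
| stats sum(count) as count by _time Processes.dest
```

Putting span=1h directly in the tstats by clause should give the same hourly counts in a single step.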

richgalloway
SplunkTrust

The first example will produce a count of events for each destination (and any other by fields) for each hour of the search time window. Something like this:

_time   Processes.dest   count
12:00   foo              2
12:00   bar              1
13:00   foo              4
13:00   bar              2

The second example will produce counts by destination, etc.  The counts will not be broken down by time.

Processes.dest   count
foo              6
bar              3

The bin command will have no effect because there is no _time field at that point.

Putting span in the tstats command gives you control over the bin sizes.  Without span, tstats will choose a span it thinks best fits the data.
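
For comparison, a sketch of the same search split by _time without an explicit span, which leaves the bucket size up to tstats. Only Processes.dest is shown because the rest of the by list was elided in the question:

```spl
| tstats count FROM datamodel=Endpoint.Processes where Processes.user=SYSTEM by _time Processes.dest
```

Adding span=1h after _time, as in the first example from the question, pins the buckets to one hour regardless of the search window.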

marycordova
SplunkTrust

I personally like to put _time span=whatever like you have in your first example everywhere it will work (like with "timechart") since it works and it makes it clear what you are spanning. 

For the longest time I was not using timechart and span correctly until I learned you should put the span literally right next to the _time to make sure it is getting applied appropriately, so now I just do that everywhere 😁

But to answer your real question...what is the technical difference...IDK 😋
