Re: Bin command versus placing _time span in tstats

Corky_ · ‎05-02-2025

Hello,

I wish to know the functional difference (if any) between the following:

| tstats count FROM datamodel=Endpoint.Processes where Processes.user=SYSTEM by _time span=1h Processes.dest ...

And

| tstats count FROM datamodel=Endpoint.Processes where Processes.user=SYSTEM by Processes.dest ...
| bin _time span=1h

I understand what bin does, and that "| bin" would always be used for a non-tstats search, but within tstats is there any reason to place the "span" within the "by" clause, or is it just cleaner/slightly faster?

Thanks in advance!

livehybrid · ‎05-02-2025

Hi @Corky_

Regarding the first option, where the span comes right after _time and before the other fields in the "BY" clause of your tstats command: I personally prefer to put the span at the end rather than in the middle of the by list, to keep it cleaner and so it is not confused with a field. The tstats docs also suggest it should be at the end:

[ BY (<field-list> | (PREFIX(<field>))) [span=<timespan>]]

As for the second query, I'm confused as to how you could bin by _time with tstats if you haven't specified _time in the by clause initially. If you do not split by _time in the initial part of the query, then the _time field won't be available to the bin command.
FWIW, I find the bin command good for doing stats by multiple fields over _time, which you cannot do with timechart.
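
A minimal sketch of that last point, for illustration only. The index, sourcetype, and field names below are assumptions and do not come from the original question. timechart accepts a single split-by field, so bin followed by stats is one way to get hourly counts split by two fields:

```
index=_internal sourcetype=splunkd
| bin _time span=1h
| stats count by _time component log_level
```

Here bin only rounds each event's _time down to the hour; the stats command is what actually aggregates.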

PickleRick · ‎05-02-2025

If you do tstats by time without binning and then do bin, you'll have to stats again to summarise your data. Bin on its own doesn't aggregate data, just aligns the field into discrete points.
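
A sketch of the pattern described above, reusing the datamodel and fields from the original question. It assumes the span that tstats picks on its own is no coarser than one hour; if it is coarser, re-binning cannot recover hourly buckets:

```
| tstats count from datamodel=Endpoint.Processes where Processes.user=SYSTEM by _time Processes.dest
| bin _time span=1h
| stats sum(count) as count by _time Processes.dest
```

The final stats is the step that collapses rows sharing the same hour and destination. Writing "by _time span=1h" inside tstats should give the same hourly result in one step.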

richgalloway · ‎05-02-2025

The first example will produce a count of events for each destination, etc., for each hour of the search time window. Something like this:

_time	Processes.dest	count
12:00	foo	2
12:00	bar	1
13:00	foo	4
13:00	bar	2

The second example will produce counts by destination, etc. The counts will not be broken down by time.

Processes.dest	count
foo	6
bar	3

The bin command will have no effect because there is no _time field at that point.

Putting span in the tstats command gives you control over the bin sizes. Without span, a tstats that groups by _time will choose a span it thinks best fits the data.


marycordova · ‎05-02-2025

I personally like to put _time span=whatever like you have in your first example everywhere it will work (like with "timechart") since it works and it makes it clear what you are spanning.

For the longest time I was not using timechart and span correctly until I learned you should put the span literally right next to the _time to make sure it is getting applied appropriately, so now I just do that everywhere 😁

But to answer your real question...what is the technical difference...IDK 😋

@marycordova
