Splunk Observability Cloud

Can't index span tags

devjeff85
Explorer

I want to index the span tag "error" so that I can filter spans by it and create alerts based on it. I tried to add a custom MetricSet, but unfortunately, after starting the analysis, I don't see the check mark action to activate my new MetricSet:

[screenshot: custom MetricSet configuration without the activation check mark]

I have followed the instructions on this page: 
https://docs.splunk.com/observability/en/apm/span-tags/index-span-tags.html#index-a-new-span-tag-or-...


bishida
Splunk Employee

Hi,

"error" is actually a case where you don't need to index a tag to be able to filter on it. Here is a screen shot of filtering spans where error=true.

[screenshot: filtering spans where error=true]

And here is an example of filtering traces that contain errors:

[screenshot: filtering traces that contain errors]


P.S. The reason it won't allow you to index "error" as an APM MetricSet is that "error" isn't actually a span tag, so there is nothing to index.


devjeff85
Explorer

Thanks for the quick reply. Yes, I've seen this filter switch in the Trace Analyzer, but I also want to create an alert to get notified when a trace contains an error span. That doesn't seem possible with the fields currently available.

Actually, I have a dashboard where I use the metric traces.count and the auto-generated filter field sf_error:true. I can see the results there, but when I create an alert based on the same metric and filter, it is not triggered.
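
For reference, I believe the chart signal corresponds roughly to this SignalFlow (just a sketch; the sum() aggregation is my guess at what the chart builder generates):

# Count of traces that contain an error, using the auto-generated
# sf_error dimension (no custom MetricSet involved).
error_traces = data('traces.count', filter=filter('sf_error', 'true')).sum()
error_traces.publish(label='Traces with errors')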

I use a static threshold condition with the following settings:

[screenshot: static threshold detector settings]

P.S. You're right, "error" is not a tag. I also tried to index the tag "otel.status_code", but that wasn't possible either.

bishida
Splunk Employee

Hi,

Can you try using service.request.count as your signal (filtered by sf_error:true and any other relevant filters) and see if that works?

[screenshot: detector signal using service.request.count filtered by sf_error:true]
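
If it helps to compare, the detector I have in mind is roughly this SignalFlow sketch; sf_service and the service name are just example filters, so adjust or drop them:

# Errored requests, using APM's built-in service.request.count metric.
errors = data('service.request.count', filter=filter('sf_error', 'true') and filter('sf_service', 'my-service')).sum()
errors.publish(label='Errored requests')
# Static threshold: fire as soon as any errored request is counted.
detect(when(errors > 0)).publish('Errored requests detected')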



devjeff85
Explorer

This doesn't trigger the alert either. My original alert (with traces.count) was triggered once during my tests, when I had 3 traces with errors in a short time period, but then it wasn't triggered anymore.

Is there maybe a better way to create an alert for such single events in Splunk? I think the "static threshold" condition is better suited to continuous metrics like CPU usage, but I haven't found any other option so far.


bishida
Splunk Employee

Oh, since it triggered for you once but then didn't trigger again, that might be explained by the alert condition never being cleared, which is even more likely in a test environment with little traffic. The alert won't fire again until the previous alert condition has been cleared. There is a setting in the alert to automatically clear after X amount of time if the signal isn't reported; you might want to try that. Or try generating successful traffic with no errors over the period of time you're detecting on (e.g., the past 15 minutes).
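
As a SignalFlow sketch, that would look something like the following; the 15-minute durations are examples, and I believe the auto-clear setting corresponds to auto_resolve_after, but please double-check:

# Errored requests, as in the earlier suggestion.
errors = data('service.request.count', filter=filter('sf_error', 'true')).sum()
# Fire when an errored request is counted; clear once there have been no errors
# for 15 minutes, or auto-resolve if the signal stops reporting for that long.
detect(when(errors > 0), off=when(errors == 0, lasting='15m'), auto_resolve_after='15m').publish('Errored requests detected')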


devjeff85
Explorer

Yes, I've seen the auto-clear setting and activated it. Still, the alert is not triggered. I think this kind of alert (or alert condition) is not suited to one-off events like "an error occurred in a trace", because there is no metric that goes up and down (like CPU usage). Something like this would rather be implemented with log alerts (in Search & Reporting).

Do you know another way to create an alert for single events that occur in Splunk?
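
P.P.S. One thing I still plan to try is a rolling-window count, so the signal drops back to zero between errors instead of only spiking once. A rough sketch, assuming the rolling sum(over=...) transformation behaves the way I expect:

# Errors counted over a rolling 10-minute window; the value should fall
# back to 0 after 10 error-free minutes, letting the alert clear and re-fire.
recent_errors = data('traces.count', filter=filter('sf_error', 'true')).sum().sum(over='10m')
recent_errors.publish(label='Errors in the last 10 minutes')
detect(when(recent_errors > 0)).publish('Errors detected in the last 10 minutes')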
