Getting Data In

Splunk data retention

Karthikeya
Communicator

I was recently assigned to a project and didn't get a proper knowledge transfer (KT) from the people who left. I have questions about my current architecture and configuration, and I am not well versed in advanced admin concepts. Please help me with these queries:

  • We have 6 indexers (hosted on AWS EC2, not Splunk Cloud) with 6.9 TB of disk storage and a 1.5 GB/day license. Is this OK? I am checking the retention period, but frozenTimePeriodInSecs and maxTotalDataSizeMB are not set anywhere in local/; they only exist in default. I am also checking whether an archival location is set.

indexes.conf in Cluster Manager:

[new_index]
homePath   = volume:primary/$_index_name/db
coldPath   = volume:primary/$_index_name/colddb
thawedPath = $SPLUNK_DB/$_index_name/thaweddb
 
volumes indexes.conf:
 
[volume:primary]
path = $SPLUNK_DB
#maxVolumeDataSizeMB = 6000000
 
there is one more app which is pushing to indexers with indexes.conf: (not at all aware of this)
 
[default]
remotePath = volume:aws_s3_vol/$_index_name
maxDataSize = 750

[volume:aws_s3_vol]
storageType = remote
path = s3://conn-splunk-prod-smartstore/
remote.s3.auth_region = eu-west-1
remote.s3.bucket_name = conn-splunk-prod-smartstore
remote.s3.encryption = sse-kms
remote.s3.kms.key_id = XXXX
remote.s3.supports_versioning = false
 
Also, coldToFrozenDir is not set anywhere, and coldToFrozenScript is not mentioned either.
 
So are we storing archival data in the S3 bucket now? But maxDataSize is mentioned there, which relates to hot-to-warm rolling. So apart from hot bucket data, is everything else now stored in the S3 bucket?
 
And how will Splunk fetch data back from the S3 bucket when searches need it?
0 Karma

livehybrid
Super Champion

Hi

You are using Splunk SmartStore, which offloads warm and cold buckets to remote object storage (S3). Hot buckets remain on local indexer storage until they roll to warm, then get uploaded to S3.

Your remotePath and [volume:aws_s3_vol] config confirms SmartStore is enabled, meaning:

    • Hot data and cached copies of warm/cold buckets reside on the indexers
    • Warm and cold buckets are stored in S3
    • There is no need for coldToFrozenDir or coldToFrozenScript unless you want to archive frozen data elsewhere; those settings let data that has passed frozenTimePeriodInSecs be moved somewhere else instead of deleted.

Retention is controlled by frozenTimePeriodInSecs (age-based) or, for SmartStore indexes, maxGlobalDataSizeMB (size-based). If you don’t override these in local/, the defaults apply (frozenTimePeriodInSecs defaults to roughly 6 years).
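If you want retention to be explicit rather than relying on defaults, a minimal sketch would look like the following (the values are illustrative, not recommendations; verify them against your own retention requirements):

```ini
# indexes.conf (on the Cluster Manager, pushed to the peers)
[new_index]
# Age-based retention; 188697600s (~6 years) is the shipped default
frozenTimePeriodInSecs = 188697600
# Size-based retention for SmartStore indexes uses the global setting,
# not maxTotalDataSizeMB; 0 means no size limit
maxGlobalDataSizeMB = 0
```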

You can run the following command on one of your indexers to confirm the settings which have been applied: 

/opt/splunk/bin/splunk btool indexes list --debug | grep -A 10 new_index

Splunk automatically retrieves data from S3 to local cache when searches require it. This is transparent to users but may add latency for cold data which is not already in the cache. When the cache reaches capacity it will "evict" buckets based on the eviction policy which by default is the least-recently used bucket.
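The cache behaviour described above is governed by the cache manager in server.conf on each indexer. A sketch of the relevant stanza (the setting names are real SmartStore cache-manager settings; the values shown are illustrative defaults, so check your own deployment with btool):

```ini
# server.conf on each indexer
[cachemanager]
# lru (least-recently-used) is the default eviction policy
eviction_policy = lru
# MB of free disk space the cache manager tries to maintain
eviction_padding = 5120
# 0 = no explicit cache-size cap (disk/volume limits still apply)
max_cache_size = 0
```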

Some docs relating to SmartStore and index configuration that might be useful:

🌟 Did this answer help you? If so, please consider:

  • Adding karma to show it was useful
  • Marking it as the solution if it resolved your issue
  • Commenting if you need any clarification

Your feedback encourages the volunteers in this community to continue contributing

PickleRick
SplunkTrust
SplunkTrust

One small correction. With SmartStore there is no separate warm/cold storage. A bucket is uploaded to remote storage and cached locally when needed, but it doesn't go through the warm-to-cold lifecycle.

It's also worth noting that with some use cases (especially when your searches often cover a portion of remote storage far larger than your local storage) you might see a significant performance hit, because you're effectively not caching anything locally.

0 Karma

Karthikeya
Communicator

@PickleRick And will data be deleted from S3 if it reaches any limit? I mean, we didn't set frozenTimePeriodInSecs, so by default it is 6 years, meaning the older data stays in S3 for 6 years?

0 Karma

Karthikeya
Communicator

Thanks for this... So my understanding is that my index, with its 500 GB default size, will never fill up at all, because once a bucket reaches 750 MB (maxDataSize) it will roll over to warm, which lives in the S3 bucket? Am I correct?

0 Karma

livehybrid
Super Champion

Hi @Karthikeya 

For reference, the following docs page is useful for SmartStore retention settings: https://docs.splunk.com/Documentation/Splunk/9.4.1/Indexer/SmartStoredataretention

maxDataSize is the maximum size of a single bucket in MB, not the total size of the index.
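To make that distinction concrete, a hedged indexes.conf sketch (the comments are the point; the values are illustrative):

```ini
[new_index]
# Per-bucket cap: a hot bucket rolls to warm when it reaches this size (MB)
maxDataSize = 750
# Total index retention is controlled separately; for SmartStore that is
# frozenTimePeriodInSecs (age) and maxGlobalDataSizeMB (size), not maxDataSize
```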

Data will be "frozen" when either maxGlobalDataSizeMB or frozenTimePeriodInSecs is met (whichever comes first!) - so it is not safe to assume the data will be retained for 6 years if maxGlobalDataSizeMB is not large enough to hold 6 years of data.
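As a quick sanity check on whether 6 years would even fit, here is a rough sizing sketch in Python. The assumptions are the 1.5 GB/day license volume mentioned in the thread and a ~50% raw-to-disk compression ratio, which is a common rule of thumb, not a measurement from your environment:

```python
# Rough retention sizing sketch.
# Assumptions (not measured): ingest equals the 1.5 GB/day license,
# and indexed data occupies ~50% of raw size on disk.
ingest_gb_per_day = 1.5
compression_ratio = 0.5          # assumed stored-size / raw-size
retention_days = 6 * 365         # ~6 years, matching the default frozenTimePeriodInSecs

stored_gb = ingest_gb_per_day * retention_days * compression_ratio
print(f"~{stored_gb:.0f} GB needed to keep 6 years")  # well under the 6.9 TB available
```

Under those assumptions, 6 years of data fits comfortably in the local 6.9 TB, so age-based retention would be the binding limit rather than size.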

To clarify my previous post, as @PickleRick mentioned: in SmartStore indexes, cold buckets are functionally equivalent to warm buckets. They only exist in limited circumstances, and in any case the storage on S3 is the same.

Let me know if you have any further questions or need clarity on any of these points 🙂


 

0 Karma