So I currently have an indexer cluster with RF and SF set to 2. My hot/warm buckets and my cold buckets will sit on different storage disks; the cold storage is a cluster with its own replication features, and I happen to have EC+2:1, meaning the data on my cold storage will already be replicated twice at the storage layer.
As a result, I would like to disable Splunk replication for my cold storage, but there is currently no way to do that in Splunk (or none that I know of). I am thinking of writing a cron job that deletes all replicated buckets on the cold storage disk. For this to happen, all of the indexers would have to point to a single shared file path on the cold storage.
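Roughly, something like this is what I had in mind for the cron job (just a sketch, not tested; the shared mount path /mnt/splunk_cold, the <index>/colddb layout, and relying on the rb_ prefix to spot replicated copies are all assumptions on my part):

```python
#!/usr/bin/env python3
"""Rough sketch of the cron-driven cleanup idea.

Assumptions (not verified against a real deployment):
  * all indexers point their coldPath at the same shared mount, /mnt/splunk_cold
  * the layout under it is <index>/colddb/<bucket_dir>
  * replicated bucket copies keep the "rb_" prefix in cold, originals use "db_"
"""
import shutil
from pathlib import Path

SHARED_COLD_ROOT = Path("/mnt/splunk_cold")  # hypothetical shared cold mount
DRY_RUN = True                               # flip to False to actually delete

def remove_replicated_cold_buckets(root: Path) -> None:
    # Walk every <index>/colddb/ directory and drop anything that looks like
    # a replicated copy (rb_*), leaving the original db_* buckets alone.
    for colddb in root.glob("*/colddb"):
        for bucket in colddb.iterdir():
            if bucket.is_dir() and bucket.name.startswith("rb_"):
                print(f"{'would remove' if DRY_RUN else 'removing'} {bucket}")
                if not DRY_RUN:
                    shutil.rmtree(bucket)

if __name__ == "__main__":
    remove_replicated_cold_buckets(SHARED_COLD_ROOT)
```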
However, this begs the question: will search still work as normal? Let's say the primary bucket is on indexer A and the replicated copy is on indexer B, but indexer A is currently under maintenance. Would it be possible for indexer B to serve searches for that bucket using indexer A's bucket ID? Additionally, will indexer B sense that something is wrong and try to replicate the warm bucket again?
In short, you can't do it. Even if you succeed in removing those "additional" buckets, once Splunk recognizes that the cluster has lost SF or RF it starts rebuilding the missing buckets. This will happen again and again until the retention time of those buckets is reached.
And even if you could do it in some weird way, you will lose all support from Splunk's side when (I don't say if) you have any issues with your environment.
It's much better to configure your storage to avoid that replication, or to use some other storage instead.
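You can see that fixup behaviour for yourself: after copies are deleted, the cluster manager reports that RF/SF are no longer met and starts fixup tasks. A minimal sketch of how you might watch for that, assuming it runs on the cluster manager and Splunk is installed under /opt/splunk (the `splunk show cluster-status` CLI command exists; the path and the output parsing are assumptions):

```python
#!/usr/bin/env python3
"""Minimal sketch: warn when the cluster manager reports RF or SF not met.

Assumes it runs on the cluster manager host and that Splunk lives under
/opt/splunk; adjust SPLUNK_BIN for your environment.
"""
import subprocess

SPLUNK_BIN = "/opt/splunk/bin/splunk"  # assumed install path

def cluster_status() -> str:
    # `splunk show cluster-status` prints, among other things, whether the
    # replication factor and search factor are currently met.
    result = subprocess.run(
        [SPLUNK_BIN, "show", "cluster-status"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout

if __name__ == "__main__":
    status = cluster_status()
    print(status)
    if "not met" in status.lower():
        print("WARNING: cluster is rebuilding missing bucket copies (fixup in progress)")
```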
Thanks all for the insightful discussion. Upon further research I realized I'm not supposed to let my indexers see each other's files either, so that's one more reason why this idea won't work out.
Cheers
Before proceeding with any changes, it's crucial to test your setup in a staging environment to avoid any disruptions in your production environment. Please contact Splunk support or PS.
NOTE:
The official answer from support is to NOT remove any replicated buckets, even with clustering disabled, as they may be marked as the primary bucket. It is best to let them age out.