Splunk incident

Searches impacted | CINC - 55235 ( Degraded Performance )

Splunk experienced a major incident on February 16, 2024 affecting Search and Index, lasting 2d 17h. The incident has been resolved; the full update timeline is below.

Started: Feb 16, 2024, 03:50 AM UTC
Resolved: Feb 18, 2024, 09:45 PM UTC
Duration: 2d 17h
Detected by Pingoru: Feb 16, 2024, 03:50 AM UTC

Affected components

Search, Index

Update timeline

identified Feb 16, 2024, 03:50 AM UTC

We are investigating a potential issue in which Splunk instances are experiencing out-of-memory events, causing searches to fail or take longer to complete on multiple indexers and search heads. This may impact several Splunk Cloud Platform customers. Our teams are working to resolve this issue. Your patience is greatly appreciated, and we will provide more updates upon resolution.

identified Feb 16, 2024, 04:30 AM UTC

Splunk has identified the issue causing memory growth on the indexers. Splunk is taking steps to remediate over the next 3 hours.

identified Feb 16, 2024, 08:14 AM UTC

Our technical resources have identified the issue, and the hotfix has been applied across the fleet. Once the issue is confirmed as resolved, the root cause messaging will be shared.

monitoring Feb 16, 2024, 07:58 PM UTC

We have implemented a fix for this issue. We are monitoring the results to confirm the resolution and will provide additional updates as they become available.

monitoring Feb 17, 2024, 06:50 AM UTC

We have implemented a fix for this issue, and all Cx stacks have been repaired. The remaining Single Instance stacks are expected to be completed on 2/17/24 PDT. We are monitoring the results to confirm the resolution and will provide additional updates as they become available.

resolved Feb 18, 2024, 09:45 PM UTC

We have updated our automation to apply the fix when new instances are created, eliminating the impact of this incident.