Zep Software incident

Elevated error rates

Critical Resolved View vendor source →

Zep Software experienced a critical incident on August 27, 2025 affecting API, lasting 1h 37m. The incident has been resolved; the full update timeline is below.

Started
Aug 27, 2025, 11:28 PM UTC
Resolved
Aug 28, 2025, 01:05 AM UTC
Duration
1h 37m
Detected by Pingoru
Aug 27, 2025, 11:28 PM UTC

Affected components

API

Update timeline

  1. investigating Aug 27, 2025, 11:28 PM UTC

    We are currently investigating this issue.

  2. investigating Aug 27, 2025, 11:44 PM UTC

    We are continuing to investigate this issue.

  3. monitoring Aug 28, 2025, 12:52 AM UTC

    During a graph database migration to improve search latency, nodes in cluster ran out of file descriptors. We quickly identified the issue and implemented a fix. Nodes took some time to recover and rejoin the cluster. We will be running a post-mortem to understand and address shortcomings in our systems and processes around scaling our graph database cluster, including maintenance and development processes.

  4. monitoring Aug 28, 2025, 12:53 AM UTC

    We are continuing to monitor for any further issues.

  5. resolved Aug 28, 2025, 01:05 AM UTC

    This incident has been resolved.