Platform9 incident

[Intermittent] Degraded Access to Management Planes

Major Resolved

Platform9 experienced a major incident on September 5, 2023 affecting Platform9 Cloud, lasting 18h 28m. The incident has been resolved; the full update timeline is below.

Started
Sep 05, 2023, 05:31 PM UTC
Resolved
Sep 06, 2023, 11:59 AM UTC
Duration
18h 28m
Detected by Pingoru
Sep 05, 2023, 05:31 PM UTC

Affected components

Platform9 Cloud

Update timeline

  1. investigating Sep 05, 2023, 05:31 PM UTC

    We are investigating an issue affecting multiple customers' ability to access their management plane instances. Access attempts intermittently return 502 errors. We will update this post as more details become available.

  2. investigating Sep 05, 2023, 05:31 PM UTC

    We are continuing to investigate this issue.

  3. identified Sep 05, 2023, 08:48 PM UTC

    We've identified a related bottleneck: we are observing network congestion on the backend, and we have formulated a plan to mitigate it. The management plane may remain unstable until all mitigation steps have been completed.

  4. monitoring Sep 06, 2023, 06:42 AM UTC

    A fix has been implemented and we are monitoring the results.

  5. resolved Sep 06, 2023, 11:59 AM UTC

    This incident has been resolved.
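
For clients that hit the intermittent 502 errors described in the first update, one common way to ride out a degradation like this is to retry with exponential backoff and jitter. The following is a minimal sketch only, not Platform9's recommended client behavior: the function name `call_with_backoff`, its parameters, and the default delays are all hypothetical, and `request` stands in for whatever call the client makes to its management plane.

```python
import random
import time
from typing import Callable

# Assumption: only 502 is retried, since it is the only status code
# the incident updates mention.
RETRYABLE_STATUSES = {502}


def call_with_backoff(
    request: Callable[[], int],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    max_delay: float = 8.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call `request` until it returns a non-retryable status or attempts run out.

    `request` is a hypothetical callable that performs one HTTP call and
    returns its status code. The last status code seen is returned.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    status = 0
    for attempt in range(max_attempts):
        status = request()
        if status not in RETRYABLE_STATUSES:
            return status
        if attempt < max_attempts - 1:
            # Exponential backoff with full jitter, capped at max_delay,
            # so retrying clients do not add synchronized load to a
            # backend that is already congested.
            delay = random.uniform(0, min(max_delay, base_delay * 2 ** attempt))
            sleep(delay)
    return status
```

The `sleep` parameter is injectable so the loop can be exercised without real waiting. Capping both the attempt count and the delay keeps a client from retrying indefinitely during an outage that, as here, lasted many hours.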