gaiia software incident

Elevated rate of API errors

Minor, Resolved

gaiia software experienced a minor incident on September 18, 2023, affecting the Public GraphQL API and lasting 3h 21m. The incident has been resolved; the full update timeline is below.

Started: Sep 18, 2023, 05:50 PM UTC
Resolved: Sep 18, 2023, 09:11 PM UTC
Duration: 3h 21m
Detected by Pingoru: Sep 18, 2023, 05:50 PM UTC

Affected components

Public GraphQL API

Update timeline

  1. identified Sep 18, 2023, 05:50 PM UTC

    AWS is experiencing elevated network latency, which is causing requests to the gaiia API to time out intermittently. The error rate is around 3% at this time. AWS has not yet posted about this issue, but a spike in outage reports is visible on Downdetector: https://downdetector.com/status/aws-amazon-web-services/.

  2. monitoring Sep 18, 2023, 06:39 PM UTC

    The error rate is now below 1%, and we continue to monitor the situation.

  3. monitoring Sep 18, 2023, 07:00 PM UTC

    AWS finally issued a statement (Sep 18, 11:56 AM PDT): "We can confirm increased networking latencies and errors affecting a single Availability Zone (use1-az1) in the US-EAST-1 Region. We have identified a potential root cause of the errors and are attempting mitigations. Early signs are this mitigation is reducing error rates and latencies. We continue to work towards a full root cause and recovery."

  4. resolved Sep 18, 2023, 09:11 PM UTC

    One hour has passed without API connectivity issues on AWS's side. We are considering the issue resolved.