Phrase incident

Degraded Performance of Phrase Strings (EU) components between October 30, 2024 12:00 PM CET and October 30, 2024 04:30 PM CET

Minor Resolved

Phrase experienced a minor incident on October 30, 2024, affecting six components: Translation center, Repo sync, Email delivery, Ordering, In-context editor, and API. It lasted 1h 7m. The incident has been resolved; the full update timeline is below.

Started
Oct 30, 2024, 03:56 PM UTC
Resolved
Oct 30, 2024, 05:04 PM UTC
Duration
1h 7m
Detected by Pingoru
Oct 30, 2024, 03:56 PM UTC

Affected components

Translation center, Repo sync, Email delivery, Ordering, In-context editor, API

Update timeline

  1. investigating Oct 30, 2024, 03:56 PM UTC

    We are currently experiencing a performance disruption causing delays in the Phrase Strings (EU) components. The engineers are working on identifying the cause of the issue. We apologize for the inconvenience.

  2. investigating Oct 30, 2024, 03:58 PM UTC

    We are continuing to investigate this issue.

  3. monitoring Oct 30, 2024, 04:32 PM UTC

    We have identified an issue with processing background jobs taking longer than expected. The root cause has been resolved and jobs are being processed again. We're monitoring that all jobs are running correctly.

  4. resolved Oct 30, 2024, 05:04 PM UTC

    This incident has been resolved.

  5. postmortem Nov 06, 2024, 12:03 PM UTC

    ### **Introduction**

    We would like to share more details about the events that occurred with Phrase between 12:00 PM CET and 04:30 PM CET on October 30, 2024, which led to the degraded performance of a Phrase Strings (EU) component, and what Phrase engineers are doing to prevent these issues from reoccurring.

    ### **Timeline**

    * October 30, 2024, 12:30 CET: An unusually large number of background jobs was observed.
    * October 30, 2024, 15:20 CET: We identified the cause and commenced working on a patch.
    * October 30, 2024, 15:30 CET: The patch was applied.
    * October 30, 2024, 16:30 CET: The issue was resolved and queued jobs were processed.

    ### **Root Cause**

    * An unusually large number of deletion requests was sent to our system.
    * As a result, background jobs were queuing up and affecting overall processing time.

    ### **Actions to Prevent Recurrence**

    * Deletion requests will be separated and lowered in priority.
    * The select batch size in long-running jobs will be reduced to alleviate the possibility of getting stuck.
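
The two preventive actions in the postmortem can be illustrated with a minimal sketch. It is not Phrase's implementation, which the postmortem does not describe. The class `JobQueue`, the helper `in_batches`, the priority constants, and the job names are all hypothetical, chosen only to show how lower-priority deletion jobs stop delaying other work and how a smaller select batch size splits a long-running job into shorter steps.

```python
import heapq
from itertools import count

# Hypothetical priority levels: a lower number is served first.
PRIORITY_DEFAULT = 0
PRIORITY_DELETION = 10


class JobQueue:
    """In-memory priority queue; deletion jobs sort behind all other jobs."""

    def __init__(self):
        self._heap = []
        self._seq = count()  # tie-breaker keeps FIFO order within a priority

    def enqueue(self, name, kind="default"):
        priority = PRIORITY_DELETION if kind == "deletion" else PRIORITY_DEFAULT
        heapq.heappush(self._heap, (priority, next(self._seq), name))

    def dequeue(self):
        _, _, name = heapq.heappop(self._heap)
        return name

    def __len__(self):
        return len(self._heap)


def in_batches(ids, batch_size):
    """Yield successive slices of at most batch_size ids."""
    for start in range(0, len(ids), batch_size):
        yield ids[start:start + batch_size]


# A burst of deletion jobs arrives before two regular jobs.
queue = JobQueue()
for i in range(3):
    queue.enqueue(f"delete-{i}", kind="deletion")
queue.enqueue("sync-repo")
queue.enqueue("send-email")

# Regular jobs are dequeued first despite arriving later.
processing_order = [queue.dequeue() for _ in range(len(queue))]

# A smaller batch size turns one large select into several short ones.
batches = list(in_batches(list(range(10)), 4))
```

In this sketch the regular jobs run ahead of the deletion burst, and the ten ids are handled in batches of 4, 4, and 2. A production system would more likely use separate named queues with dedicated workers than a single in-process heap.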