Reinvent Telecom incident

Max UC Chat Service

Minor Resolved

Reinvent Telecom experienced a minor incident on October 6, 2022 affecting Max UC, lasting 5d 3h. The incident has been resolved; the full update timeline is below.

Started
Oct 06, 2022, 03:46 PM UTC
Resolved
Oct 11, 2022, 07:32 PM UTC
Duration
5d 3h
Detected by Pingoru
Oct 06, 2022, 03:46 PM UTC

Affected components

Max UC

Update timeline

  1. investigating Oct 06, 2022, 03:46 PM UTC

    At this time we are investigating an issue with the Max UC Chat Service. Calls will still process as normal through Max UC. Only chat and presence are affected.

  2. identified Oct 06, 2022, 04:15 PM UTC

    We have identified an issue with an AMS server and are currently working to recover it. Updates will follow.

  3. identified Oct 06, 2022, 04:53 PM UTC

    We are still working with Metaswitch to resolve an issue with an AMS server that is affecting the Chat Service only. Updates will follow.

  4. identified Oct 06, 2022, 05:29 PM UTC

    Metaswitch Engineering continues work to restore the AMS cluster. Chat and presence in MaX UC are partially degraded; calls will continue to process.

  5. identified Oct 06, 2022, 06:07 PM UTC

    Metaswitch Engineering continues work to restore the AMS cluster. Chat and presence in MaX UC are partially degraded; calls will continue to process.

  6. identified Oct 06, 2022, 06:41 PM UTC

    Metaswitch Engineering continues work to restore the AMS cluster. Chat and presence in MaX UC are partially degraded; calls will continue to process.

  7. identified Oct 06, 2022, 07:27 PM UTC

    Metaswitch Engineers have advised that their efforts to restore the AMS server pool have been unsuccessful, and we have given permission to attempt restoration from AMS backups performed overnight. Some data from the early hours of today, 10/6, may be lost as a result. Metaswitch expects this process to take approximately 30 minutes; we will provide an update as soon as we have initial results.

  8. identified Oct 06, 2022, 08:23 PM UTC

    Metaswitch Engineering continues work to restore the AMS cluster. Chat and presence in MaX UC are partially degraded; calls will continue to process.

  9. identified Oct 06, 2022, 08:58 PM UTC

    We are beginning to see recovery in AMS, with servers now pooled again. Clients may need to log out and log back in. Work continues to restore group chats; we will provide a further update shortly.

  10. monitoring Oct 06, 2022, 09:28 PM UTC

    We continue to see AMS recovery and stabilization and are moving this Incident to Monitoring. Clients should log out and log back in if any anomalous behavior is observed, and we recommend that group chats be recreated. Please open an Incident in ServiceNow if you continue to experience any issues.

  11. resolved Oct 11, 2022, 07:32 PM UTC

    The AMS remains stable and we are moving this Incident to Resolved. No other issues have been reported at this time. Thank you for your patience, and we apologize for any inconvenience this has caused.