Eptura Workplace incident
We are investigating an issue with Eptura Workplace
Eptura Workplace experienced a notice incident on July 10, 2024, lasting 2h 17m. The incident has been resolved; the full update timeline is below.
Update timeline
- investigating Jul 10, 2024, 08:47 PM UTC
We are currently investigating an issue with Eptura Workplace. We will update you when we have more information.
- monitoring Jul 10, 2024, 10:10 PM UTC
A fix has been implemented and we are monitoring the results.
- resolved Jul 10, 2024, 11:05 PM UTC
This incident has been resolved.
- postmortem Jul 23, 2024, 04:49 PM UTC
**Eptura Workplace detailed Root Cause Analysis | July 10, 2024** **S1 – New Calendar Service using Exchange Unavailable** We are truly grateful for your continued support and loyalty. We value your feedback and appreciate your patience as we worked to resolve this incident. **Description:** On the afternoon of July 10, 2024, our internal team at Eptura Workplace, discovered that the new calendar service using Exchange had been inadvertently removed from our production environment. This issue only affected customers using the new calendar service. **Type of Event:** Eptura Workplace had experienced a Severity 1 outage regarding the new calendar service that uses Exchange. **Services/Modules impacted:** This disruption impacted our Production environment for the new calendar service **Timeline: \(Reported in CST\)** 3:34pm: Our Support and Engineering team discovered the disruption and alerted the Eptura-Workplace-Fire-Alarm channel making internal teams aware that the new calendar services are unavailable. 3:38pm: The Cloud Operations team and their leadership were notified. 3:46pm: PagerDuty was triggered to alert other internal team members. All customers were made aware of this incident through the Status Page, and have begun investigating the issue. 5:01pm: The Cloud Operations team confirmed the new calendar service release from the week prior was reinstated and the service is up and running, and for engineering to confirm. 5:06pm: Engineering confirms the new calendar service that uses Exchange is up and running again. 5:11pm: The Status Page was updated from Investigation to a Monitoring phase. 5:08pm: Product requests for engineering to prepare for re: data recovery and restoration of missed events for customers. 5:40pm: Product confirms that engineering has manually run the event sync looper for the list of impacted customer sites. Next steps would be for them to monitor and report any missing activity. Support will reach out to customers. 6:05pm: The Status Page was marked as resolved. **Total Duration of Event:** 2 Hours 31 Minutes **Root Cause:** A manual error occurred when a pull request was inadvertently merged into the production environment by a member of our Cloud Operations team. This action resulted in the removal of a new calendar service that uses Exchange from production. Due to insufficient monitoring, no alert was raised, delaying our awareness and response to the issue. **Remediation:** The new calendar service was restored promptly by rolling back the pull request. We understand the criticality of this service and took immediate action to rectify the situation. **Preventative Action:** To prevent this incident from occurring in the future, we have implemented and successfully tested uptime alerts for the new calendar service. This will ensure any disruptions are promptly detected and addressed, minimizing impact on our customers.