Esper.io experienced a minor incident on March 2, 2024 affecting Esper Systems, lasting 18h 50m. The incident has been resolved; the full update timeline is below.
Affected components
Update timeline
- identified Mar 02, 2024, 04:49 PM UTC
Our team has identified an issue causing degraded performance in the processing of messages from devices. These messages include device status updates, telemetry, and command updates. We’re working to resolve it.
Which services are affected?
API: No
Console: Yes (delayed last seen, incorrect online/offline status, delayed graphs and alerts). All other console operations are working as expected.
Devices: No (device operations continue to work as expected)
- identified Mar 02, 2024, 05:39 PM UTC
We've identified the fix and are working on deploying it for all customers. The problem was with one of our message processing systems, which had slowed down and caused the lag.
- identified Mar 02, 2024, 10:26 PM UTC
We have rolled out the fix for most customers, and the message processing lag is steadily going down. Some customers will continue to see delays in Last Seen, Online/Offline Status, and Command processing until the lag reaches zero. We've also scaled up the infrastructure to speed up the process.
- monitoring Mar 03, 2024, 09:54 AM UTC
We no longer see delayed processing of messages. We're continuing to monitor.
- resolved Mar 03, 2024, 11:39 AM UTC
This incident has been resolved.