Omnivore experienced a major incident on April 28, 2023, affecting API, Brink, and 1 more component, and lasting 1h 49m. The incident has been resolved; the full update timeline is below.
Affected components
Update timeline
- identified Apr 28, 2023, 06:20 PM UTC
Around 18:00 UTC, we noticed that our CloudPOS Scheduler queue had an elevated number of tasks waiting to be run. This backlog would likely cause all CloudPOS data to be stale, including Tickets and Clock Entries, and would also delay webhooks. We are currently scaling up our Scheduler Workers to process the delayed tasks.
- monitoring Apr 28, 2023, 07:02 PM UTC
After scaling up our Scheduler Workers, the queue size has shrunk by ~75%. We will continue to monitor until the queue size is back to baseline.
- monitoring Apr 28, 2023, 07:56 PM UTC
As of 19:41 UTC, the Scheduler queue has returned to baseline. We have confirmed that POS data has been refreshed for all affected POS types (Brink, Toast, Cloud Connect, Lavu, and Lightspeed), and that current-day Tickets are visible. Webhooks have resumed as well. Now that the acute phase of the incident is over, we will check for any other impacts before closing the incident.
- resolved Apr 28, 2023, 08:09 PM UTC
After further investigation, we see no other impacts to address. All systems appear to be fully operational.