Knack incident

Degraded performance - All regions

Minor · Resolved

Knack experienced a minor incident on December 4, 2024 affecting Live App, lasting 12h 54m. The incident has been resolved; the full update timeline is below.

Started
Dec 04, 2024, 12:14 PM UTC
Resolved
Dec 05, 2024, 01:08 AM UTC
Duration
12h 54m
Detected by Pingoru
Dec 04, 2024, 12:14 PM UTC
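
The reported duration can be checked against the start and resolution times listed above. The following is a minimal sketch, not part of the vendor's report; the helper name `format_duration` is hypothetical and was chosen only for illustration.

```python
from datetime import datetime, timezone

# Timestamps copied from the incident summary above (both in UTC).
started = datetime(2024, 12, 4, 12, 14, tzinfo=timezone.utc)
resolved = datetime(2024, 12, 5, 1, 8, tzinfo=timezone.utc)


def format_duration(start: datetime, end: datetime) -> str:
    """Return the elapsed time between two datetimes as 'Xh YYm'.

    Hypothetical helper for illustration only.
    """
    total_minutes = int((end - start).total_seconds() // 60)
    hours, minutes = divmod(total_minutes, 60)
    return f"{hours}h {minutes:02d}m"


print(format_duration(started, resolved))
```

Both timestamps are timezone-aware, so the subtraction stays correct even if one of them is later entered in a different zone, such as the CET times used in the first timeline update.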

Affected components

Live App

Update timeline

  1. identified Dec 04, 2024, 12:14 PM UTC

    15:30 CET - Our engineers are working on a solution at the moment. We will continue to monitor the servers.

    12:00 PM CET - We are still working on identifying the cause and working on a fix. We have applied some patches that should improve the performance in the region.

    8:00 AM CET - We have seen degraded performance in the EU region.

  2. identified Dec 04, 2024, 03:22 PM UTC

    Some customers might experience degraded performance in the US region as well.

  3. monitoring Dec 04, 2024, 09:40 PM UTC

    A fix has been implemented and we're monitoring the results.

  4. resolved Dec 05, 2024, 01:08 AM UTC

    This incident has been resolved.

  5. postmortem Dec 05, 2024, 06:12 PM UTC

    # **Update on Recent Performance Issues**

    We want to take a moment to explain the recent performance issues you may have experienced and share what we’ve done to address them.

    Over the past several weeks, we’ve been rolling out a major platform enhancement: migrating application schemas from the existing NoSQL architecture to be adjacent to your application data in the SQL datastore. This enhancement eliminates the 16 MB app schema size limitation inherent in the NoSQL store so our customers can build larger, more sophisticated apps on the Knack platform. It also opens up technical options to significantly reduce API response time in the coming months.

    The architectural changes involved are complex. Despite thorough code reviews, extensive testing in our QA and Staging environments, and a carefully phased rollout to minimize impact, the changes led to unforeseen challenges. Earlier this week, the increased SQL data load caused spikes in 500 errors and database failovers, which we know disrupted your experience. We take this very seriously.

    Yesterday, we applied critical hotfixes to further optimize app schema processing logic, and confirmed that the fixes have remediated the performance issues. We are now closely monitoring the system to ensure stability and smooth operation moving forward.

    We sincerely apologize for the inconvenience this caused and appreciate your patience. Our team is committed to learning from this experience and implementing measures to prevent similar issues in the future.

    If you have any further questions or concerns, please don’t hesitate to contact [[email protected]](mailto:[email protected]). Thank you for your continued trust in us.