BL.INK incident

Service Outage

Major, Resolved

BL.INK experienced a major incident on June 13, 2023 affecting BL.INK Enterprise: USA and BL.INK Core Platform, lasting 1d 5h. The incident has been resolved; the full update timeline is below.

Started
Jun 13, 2023, 07:05 PM UTC
Resolved
Jun 15, 2023, 12:08 AM UTC
Duration
1d 5h
Detected by Pingoru
Jun 13, 2023, 07:05 PM UTC

Affected components

BL.INK Enterprise: USA
BL.INK Core Platform

Update timeline

  1. investigating Jun 13, 2023, 07:05 PM UTC

    We are currently experiencing a service outage across redirects and management functions. We will post additional information as soon as it becomes available.

  2. resolved Jun 15, 2023, 12:08 AM UTC

    Root Cause Analysis of outage on June 13, 2023

    BLINK experienced an outage on June 13th from approximately 1:52 PM to 2:20 PM Central time.

    Impact Assessment: The impacted functions were primary redirect services, multiple API endpoints, and web console functions. The outage affected both self-service and enterprise customers.

    Timeline (all times Central, GMT-5):

    1:52 PM - Initial alerts reported 502 Bad Gateway responses.
    2:08 PM - AWS confirmed that issues identified in the US-East Region were impacting all Lambda functions (which BLINK uses in numerous places).
    2:17 PM - BLINK modified its services to skip the Lambda layer and process all traffic through another technology. All services started to recover.
    2:20 PM - All alerts cleared and systems were fully operational again.
    4:00 PM - AWS indicated that services were restored, but backlogged. BLINK monitored the situation until we felt confident that Lambda services were operational.
    5:15 PM - BLINK reintroduced Lambda services into production. No issues were encountered.

    Root Cause Identification: The issue stemmed from the Lambda functions used to serve redirects and API calls. BLINK uses Lambda as a primary application layer in our platform. While BLINK does not depend exclusively on this service, the failover that bypasses the Lambda layer is manual and required an engineer to intervene before traffic could be moved.

    Action Plan: Over the last 10 months, BLINK has been actively building a new architecture that provides a fully dispersed footprint across multiple regions with immediate, automatic failover. Customers already on this new platform were not impacted by the outage, which provides real-world validation of the new architecture. This system is active today, and we are working to transition to this new infrastructure in the near future. BLINK will begin sharing more details about this update in the coming months.

    Any immediate questions may be directed to [email protected]
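The root cause above was a failover that had to be triggered by hand. As a rough illustration of the automatic alternative, here is a minimal sketch of a request router that bypasses a failing primary layer on its own. It is not BLINK's implementation: the class `FailoverRouter`, its parameters (`failure_threshold`, `cooldown_seconds`, `clock`), and the idea of modelling each layer as a plain Python callable are all assumptions made for this example.

```python
import time
from typing import Callable, Optional

Handler = Callable[[str], str]


class FailoverRouter:
    """Hypothetical router: serve from a primary layer, bypass it when it fails.

    After `failure_threshold` consecutive primary failures, all traffic goes
    straight to the fallback for `cooldown_seconds`, then the primary is retried.
    """

    def __init__(
        self,
        primary: Handler,
        fallback: Handler,
        failure_threshold: int = 3,
        cooldown_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._primary = primary
        self._fallback = fallback
        self._threshold = failure_threshold
        self._cooldown = cooldown_seconds
        self._clock = clock
        self._failures = 0
        self._bypass_until: Optional[float] = None

    def handle(self, request: str) -> str:
        now = self._clock()

        # While the bypass window is open, do not touch the primary at all.
        if self._bypass_until is not None and now < self._bypass_until:
            return self._fallback(request)

        try:
            response = self._primary(request)
        except Exception:
            # Count the failure and still answer this request from the fallback.
            self._failures += 1
            if self._failures >= self._threshold:
                self._bypass_until = now + self._cooldown
                self._failures = 0
            return self._fallback(request)

        # A healthy primary response clears any failure state.
        self._failures = 0
        self._bypass_until = None
        return response
```

In this sketch every request still gets an answer while the primary is down, and the cooldown keeps the router from hammering a layer that is known to be unhealthy. A production system would more likely do this at the load balancer or DNS level and across regions, as the action plan describes.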