GiveGab incident

Issues with Linode hosted services running custom PHP giving day sites on Kimbia Giving Day Platform

Minor Resolved

GiveGab experienced a minor incident on April 25, 2018, lasting 9h 54m. The incident has been resolved; the full update timeline is below.

Started
Apr 25, 2018, 11:02 AM UTC
Resolved
Apr 25, 2018, 08:56 PM UTC
Duration
9h 54m
Detected by Pingoru
Apr 25, 2018, 11:02 AM UTC

Update timeline

  1. monitoring Apr 25, 2018, 11:02 AM UTC

    At 6:27 this morning, we started to see issues with a number of our custom hosted PHP giving day sites running on Linode. These services were sporadically returning errors that surfaced through our CDN as 502 Bad Gateway responses. Impacted sites were:

    Monitor is DOWN: Purdue ( https://dayofgiving.purdue.edu/ ) - Reason: HTTP 502 - Bad Gateway
    Monitor is DOWN: KLRU ( https://supportklru.org/ ) - Reason: HTTP 502 - Bad Gateway
    Monitor is DOWN: Giving Common ( https://givingcommon.org/ ) - Reason: HTTP 502 - Bad Gateway

    After resetting the PHP service, we are seeing service return to normal, but we are continuing to monitor.

  2. monitoring Apr 25, 2018, 11:33 AM UTC

    We are seeing the issues return and are currently troubleshooting what is happening.

  3. identified Apr 25, 2018, 02:34 PM UTC

    We are currently investigating issues across all of our custom sites running on our Linode servers. Some users will continue to see sporadic 502 Bad Gateway errors while we repeatedly reset the PHP services on the servers.

  4. identified Apr 25, 2018, 04:19 PM UTC

    We need to perform emergency maintenance on some of our Linode servers in order to rectify an issue we have been seeing. This will result in an outage of roughly 60 minutes for several of our sites, including:

    1. https://supportklru.org/
    2. https://givingcommon.org/

    We apologize for this inconvenience and will continue to provide updates as we make progress on the emergency maintenance.

  5. identified Apr 25, 2018, 04:19 PM UTC

    We are continuing to work on a fix for this issue.

  6. monitoring Apr 25, 2018, 04:32 PM UTC

    The underlying Linode server has been upgraded and emergency maintenance is complete. All sites should be resolving properly and we'll continue to monitor for issues.

  7. monitoring Apr 25, 2018, 04:33 PM UTC

    We are continuing to monitor for any further issues.

  8. monitoring Apr 25, 2018, 06:12 PM UTC

    We are continuing to monitor for any further issues.

  9. resolved Apr 25, 2018, 08:56 PM UTC

    After further monitoring and a few more updates to the underlying server, the issues appear to be resolved.
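
The monitor lines quoted in the first update suggest a simple HTTP check per site. The following is a minimal sketch of such a check, not GiveGab's or Pingoru's actual monitoring: the site names and URLs come from the timeline, while the function names, the 10-second timeout, and the "Monitor is UP" wording are assumptions made for illustration.

```python
from http import HTTPStatus
import urllib.error
import urllib.request

# Site names and URLs as listed in the first timeline update.
SITES = {
    "Purdue": "https://dayofgiving.purdue.edu/",
    "KLRU": "https://supportklru.org/",
    "Giving Common": "https://givingcommon.org/",
}


def describe(name, url, status):
    """Format one monitor line in the style quoted in the timeline.

    The "Monitor is UP" wording is hypothetical; only the DOWN format
    appears in the incident report.
    """
    if 200 <= status < 400:
        return f"Monitor is UP: {name} ( {url} )"
    try:
        phrase = HTTPStatus(status).phrase
    except ValueError:
        phrase = "Unknown"  # non-standard status code
    return f"Monitor is DOWN: {name} ( {url} ) - Reason: HTTP {status} - {phrase}"


def fetch_status(url, timeout=10.0):
    """Return the HTTP status code for a GET request to url.

    Error statuses such as 502 arrive as HTTPError and are returned as
    codes. Network-level failures (DNS, refused connections) raise
    URLError and are left to the caller in this sketch.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code


def check_all(sites=SITES, fetch=fetch_status):
    """Check every site and return one formatted line per site."""
    return [describe(name, url, fetch(url)) for name, url in sites.items()]


if __name__ == "__main__":
    for line in check_all():
        print(line)
```

Passing a stub as `fetch` (for example `lambda url: 502`) exercises the formatting without touching the network, which is useful because the errors in this incident were sporadic and a single live request may not reproduce them.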