Jobvite incident

Talemetry Careersites Unavailable

Critical Resolved

Jobvite experienced a critical incident on March 12, 2024 affecting Career Sites, lasting 6h 45m. The incident has been resolved; the full update timeline is below.

Started
Mar 12, 2024, 05:11 PM UTC
Resolved
Mar 12, 2024, 11:57 PM UTC
Duration
6h 45m
Detected by Pingoru
Mar 12, 2024, 05:11 PM UTC

Affected components

Career Sites

Update timeline

  1. investigating Mar 12, 2024, 05:11 PM UTC

    We're currently investigating an issue with Talemetry Careersites availability.

  2. monitoring Mar 12, 2024, 05:31 PM UTC

    The sites are back online; we are monitoring traffic to ensure they are stable.

  3. monitoring Mar 12, 2024, 05:32 PM UTC

    We are continuing to monitor for any further issues.

  4. resolved Mar 12, 2024, 11:57 PM UTC

    This incident has been resolved.

  5. postmortem May 30, 2024, 06:06 PM UTC

    Starting on March 12, 2024, our customers may have experienced several outages of their Talemetry Career Sites. Customers would have been impacted briefly, for periods of 20 to 50 minutes per outage, on March 12, March 18, April 17, April 18, April 25, April 30, and May 2. Our engineers were engaged to investigate upon being alerted to the incidents.

    Through extensive investigation, our findings pointed to unoptimized areas of our infrastructure, including a rise in bot traffic, inefficient queries, load balancing, and some performance limitations in the search infrastructure and toolset version. As a result, our team took the site down to resolve each occurrence during this time frame, expediting the short-term resolution while our teams worked in parallel on long-term optimizations.

    To prevent this situation from recurring, we deployed various optimizations, including:

    * An upgrade of our search infrastructure and search toolset versions
    * Improved query performance
    * Appropriately balanced infrastructure for our customers
    * Additional monitoring layers to help proactively manage and prevent future incidents

    The above work was concluded on May 2, 2024. We are also in the process of evaluating bot mitigation technologies to further stabilize and enhance our performance.