Jobvite experienced a critical incident on April 17, 2024 affecting Career Sites, lasting 6h 55m. The incident has been resolved; the full update timeline is below.
Affected components
Update timeline
- investigating Apr 17, 2024, 01:23 PM UTC
We are currently investigating this issue.
- monitoring Apr 17, 2024, 01:43 PM UTC
A fix has been implemented and we are monitoring the results.
- resolved Apr 17, 2024, 08:18 PM UTC
This incident has been resolved.
- postmortem May 30, 2024, 07:19 PM UTC
Starting on March 12, 2024, our customers may have experienced several outages of their Talemetry Career Sites. Customers would have been impacted for 20 to 50 minutes per outage on the following dates: March 12, March 18, April 17, April 18, April 25, April 30 and May 2.
Our engineers began investigating as soon as they were alerted to each incident. Through extensive investigation, our findings pointed to several contributing factors in our infrastructure: a rise in bot traffic, inefficient queries, load balancing, and performance limitations in the search infrastructure and toolset version. During this time frame, our team took the site down to resolve each occurrence and expedite the short-term resolution, while our teams worked in parallel on long-term optimizations.
To prevent this situation from recurring, we deployed various optimizations, including:
* An upgrade of our search infrastructure and search toolset versions
* Improved query performance
* Appropriately balanced infrastructure for our customers
* Additional monitoring layers to help proactively manage and prevent future incidents
The above work was concluded on May 2, 2024. We are also in the process of evaluating bot mitigation technologies to further stabilize and enhance our performance.