StarRez incident

THD - FTP Service Outage

Major Resolved

StarRez experienced a major incident on May 30, 2024 affecting The Housing Director Cloud Services, lasting 8h 38m. The incident has been resolved; the full update timeline is below.

Started
May 30, 2024, 02:24 PM UTC
Resolved
May 30, 2024, 11:02 PM UTC
Duration
8h 38m
Detected by Pingoru
May 30, 2024, 02:24 PM UTC

Affected components

The Housing Director Cloud Services

Update timeline

  1. investigating May 30, 2024, 02:24 PM UTC

    Starting at approximately 04:00 EDT this morning, The Housing Director's FTP service became unavailable. Engineers were able to recover services starting at 09:17 EDT. A select group of customers is still experiencing authentication issues, which we are actively investigating. We will have more information within the next hour.

  2. investigating May 30, 2024, 04:26 PM UTC

    We have identified and resolved a secondary issue that caused a subset of customers to receive a 'permission denied' error when attempting to connect to their FTP server. We are actively monitoring login attempts to confirm whether any further issues are preventing access to THD FTP services.

  3. monitoring May 30, 2024, 05:27 PM UTC

    We have confirmed no FTP login issues have occurred since 12:23 EDT. We will continue to monitor to confirm no further issues arise. Next update will be provided by 5pm EDT. A Root Cause Analysis will be posted within the next week.

  4. resolved May 30, 2024, 11:02 PM UTC

    All THD FTP services have been confirmed stable. An RCA will be posted shortly.

  5. postmortem Jun 04, 2024, 02:12 PM UTC

    **THD SFTP Services Outage May 30th, 2024**

    THD SFTP services became unavailable, impacting file import and export services.

    **Root Cause**

    At 04:00 EDT, ongoing security patching and maintenance work resulted in SFTP services becoming unavailable.

    **Resolution**

    From 09:15 EDT, SFTP access for a subset of customers was restored. Further configuration work was required to restore services for the remaining customer base, with full restoration occurring at 12:23 EDT.

    **Next Steps**

    We are currently conducting a thorough review of the security patching and maintenance change management processes to determine if any updates can be made to prevent similar issues in the future.
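
The updates quote times in EDT while the incident header uses UTC, so the two are easy to misread side by side. The following is a minimal sketch, not part of the vendor's report, that converts the three EDT times given in the postmortem to UTC and computes the elapsed time to partial and full restoration. It assumes EDT is a fixed UTC-4 offset on May 30, 2024; the names `edt` and `fmt` are illustrative helpers.

```python
from datetime import datetime, timedelta, timezone

# Assumption: EDT is UTC-4; a fixed offset is sufficient for a single day.
EDT = timezone(timedelta(hours=-4), "EDT")


def edt(hour, minute):
    """Return a timezone-aware datetime on May 30, 2024 in EDT."""
    return datetime(2024, 5, 30, hour, minute, tzinfo=EDT)


def fmt(delta):
    """Format a timedelta as 'Xh YYm'."""
    total_minutes = int(delta.total_seconds()) // 60
    return f"{total_minutes // 60}h {total_minutes % 60:02d}m"


# Times taken from the postmortem text.
outage_start = edt(4, 0)       # SFTP services became unavailable
partial_restore = edt(9, 15)   # access restored for a subset of customers
full_restore = edt(12, 23)     # full restoration

for label, moment in [
    ("outage start", outage_start),
    ("partial restore", partial_restore),
    ("full restore", full_restore),
]:
    print(f"{label}: {moment.astimezone(timezone.utc):%H:%M} UTC")

print("time to partial restore:", fmt(partial_restore - outage_start))
print("time to full restore:", fmt(full_restore - outage_start))
```

Under these assumptions the service impact ran from 08:00 UTC to 16:23 UTC, about 8h 23m. The 8h 38m duration in the incident header instead spans the first status update (02:24 PM UTC) to the resolved update (11:02 PM UTC).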