Incident Summary
During a scheduled Azure update, the Forms service was automatically restarted. Upon restarting, the service failed to establish a connection with the database. As a result, the Forms service remained in a failed state until it was manually restarted and successfully reconnected to the database.
Root Cause
The Forms service was unable to connect to the database immediately after the Azure-driven restart. Because it lacked sufficient database connection resilience mechanisms, it stayed in a failed state instead of recovering automatically.
Resolution
The service was restored by a second, manual restart, after which it re-established the database connection and resumed normal operations.
Preventative Actions
To improve resilience and prevent recurrence, the following work items have been defined:
Enhanced database connection resilience: Implement improvements to ensure the Forms service can recover from temporary database connectivity issues.
Health check enhancements: Update the Forms service health check to fail explicitly when the database connection is not healthy, enabling faster detection and automated recovery.
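Illustrative sketch of the two work items above. This is a minimal Python example under stated assumptions, not the Forms service's actual implementation; the report does not say which language or database driver the service uses. The names connect_with_retry, health_status, connect and ping_database, and all parameter values, are hypothetical. The first function retries a failed connection with exponential backoff and jitter. The second returns an explicit failure status when the database cannot be reached.

```python
import random
import time


def connect_with_retry(connect, max_attempts=5, base_delay=0.5,
                       max_delay=30.0, sleep=time.sleep):
    """Call connect() until it succeeds, backing off between attempts.

    `connect` is a hypothetical zero-argument callable that returns a
    connection or raises ConnectionError on a transient failure.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return connect()
        except ConnectionError:
            if attempt == max_attempts:
                # Out of attempts: surface the error to the caller.
                raise
            # Exponential backoff (0.5s, 1s, 2s, ...) capped at max_delay,
            # plus random jitter so restarted instances do not retry in lockstep.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(delay + random.uniform(0, delay / 2))


def health_status(ping_database):
    """Return (http_status, body), failing explicitly if the database is down.

    `ping_database` is a hypothetical callable that raises when the
    database cannot be reached (for example, by running a trivial query).
    """
    try:
        ping_database()
    except Exception as exc:
        # 503 tells the platform the instance is not ready to serve traffic.
        return 503, {"status": "unhealthy", "database": str(exc)}
    return 200, {"status": "healthy", "database": "ok"}
```

With a health check of this shape, an instance that cannot reach the database reports an unhealthy status instead of appearing up, so monitoring or the hosting platform can detect the failure and act on it without waiting for a manual restart.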