Published: Jun 22, 2017   -   Updated: Jun 22, 2018
Root Cause Analysis:
Extended Read-Only and Elevated Error Rates
June 19, 2017
Problem description & Impact

On Monday, June 19, 2017, at 8:34 pm PDT, Okta experienced a minor service disruption in US Cell 2, during which a subset of admins in that cell may have experienced slightly elevated error rates.  Administrators, as well as integrations making API update calls, would also have experienced an extended Read-Only mode until the issue was fully resolved at 9:08 pm PDT.

Root Cause

The issue occurred during planned US Cell 2 database maintenance (a planned read-only period with an expected maximum duration of 15 minutes).  During the planned maintenance, a delay in processing a database migration step triggered monitoring alerts.  Because the threshold for the planned read-only mode had been exceeded, Okta reverted the change and returned the service to normal at 9:08 pm PDT.

Mitigation steps and future preventative measures
Okta has identified and corrected the script/process error which triggered the extended Read-Only mode and has implemented changes to prevent this issue from recurring.