This FAQ explains the details of Enhanced Disaster Recovery and how it differs from the Standard Disaster Recovery service.
Helpful definitions for this FAQ:
RTO (Recovery Time Objective): The maximum tolerable length of time that an application or service can be down after a failure or disaster occurs.
RPO (Recovery Point Objective): The maximum acceptable amount of data loss after an unplanned data-loss incident, expressed as an amount of time. This is generally considered the point in time before the event at which data can be successfully recovered.
Enhanced Disaster Recovery FAQ:
|
Question |
Answer | |
|
1 |
What is Enhanced Disaster Recovery? |
The Enhanced Disaster Recovery product provides customers with an option for shorter recovery times in the event of a regional infrastructure-related outage. |
|
2 |
When is Enhanced Disaster Recovery utilized for customers? |
Enhanced Disaster Recovery is used in the event of a regional infrastructure-related outage, meaning that Okta operations has confirmed the primary region is severely degraded or down due to an infrastructure-related issue. |
|
3 |
What are some examples of scenarios where Standard and Enhanced Disaster Recovery are applicable for service continuity? |
Standard and Enhanced Disaster Recovery are used to mitigate issues related to regional infrastructure-related outages. Example scenarios where Standard and Enhanced Disaster Recovery are applicable are:
|
|
4 |
What are some examples of scenarios where Standard and Enhanced Disaster Recovery are NOT applicable for service continuity? |
Standard and Enhanced Disaster Recovery are used to mitigate issues related to regional infrastructure-related outages. Example scenarios where Standard and Enhanced Disaster Recovery are NOT applicable are:
|
|
5 |
Does Enhanced Disaster Recovery support data backup and restoration for my Okta configuration? |
No. This falls outside the shared responsibility model Okta provides to its customers. Okta has recommended partners to help specifically address this use case. More information is available at Okta.com/integrations/solutions/ |
|
6 |
What is the difference between Enhanced Disaster Recovery and Standard Disaster Recovery? |
Enhanced Disaster Recovery reduces the time it takes to fail over from a primary region to a disaster recovery region in the event of a regional infrastructure-related outage. While in the disaster recovery region, customers have read-only access to core Okta services.
RTO (read-only) with Standard Disaster Recovery: up to 1 hour.
RTO (read-only) with Enhanced Disaster Recovery: up to 5 minutes.
RTO (read-write) for both Standard and Enhanced Disaster Recovery: 24 hours.
RPO for both Standard and Enhanced Disaster Recovery: up to 1 hour.
Additionally, customers with Enhanced Disaster Recovery can fail over individual Okta production organizations without requiring all customers in the primary region to fail over. With Standard Disaster Recovery, during an outage requiring a failover, the Okta operations team fails over all customers within the primary region. |
|
7 |
How does a failover happen? |
Standard Disaster Recovery: Only Okta can invoke a failover and failback. The failover can take up to 1 hour to complete once invoked successfully.
Enhanced Disaster Recovery: Okta or a customer can invoke a failover and failback using the self-service portal. The failover can take up to 5 minutes to complete once invoked successfully. Mechanisms to fail over:
NOTE: The use of Self-Service is optional; however, customers who invoke a failover via Self-Service are responsible for the failback, as Okta may not always know the reason for the customer-initiated failover.
|
|
8 |
Will Okta still fail over my organizations now that Self-Service is available? |
Yes. Using the self-service portal is optional, and Okta has extensive monitoring in place to detect and respond to infrastructure-related outages. However, if a failover has already been invoked using self-service, the customer is responsible for failing back, since Okta may not know the reason for the failover. |
|
9 |
How does the 5-minute failover time apply? |
The 5-minute failover time applies after the Okta operations team has identified and confirmed the infrastructure-related outage. Example: If the Okta operations team takes 10 minutes to identify the outage and confirm that a failover is needed, the customer's total outage time to service recovery is up to 15 minutes (10 minutes to confirm the outage plus up to 5 minutes to fail over).
For more information, see the Enhanced Disaster Recovery terms and conditions on Okta.com/agreements. |
|
10 |
What type of errors are expected during a regional infrastructure-related outage? |
Expected symptoms include elevated authentication failure rates, increased latency, and HTTP error codes (e.g., 404). |
|
11 |
What does read-only mode mean? |
While in read-only mode, Okta services do not accept any new write or change operations, such as adding and removing users, changing user permissions, and any automation completed by Workflows. Existing users can still authenticate, ensuring uninterrupted business operations. Syslog data continues to be collected during a failover but remains inaccessible until the system is returned to the primary region, so reviewing syslog data is not possible while in read-only mode. During a failover, end users are required to re-authenticate to re-establish their authentication session. The same re-authentication is necessary when the system is returned to the primary region. This requirement applies to both Standard and Enhanced Disaster Recovery. |
|
12 |
Is Enhanced Disaster Recovery available for all customers in all regions? |
Enhanced Disaster Recovery is available to customers in all commercial regions. Enhanced Disaster Recovery is NOT supported on FedRAMP or Department of Defense deployments. NOTE: All unsupported deployment locations support Standard Disaster Recovery. |
|
13 |
What if a customer has more than one Okta org? Does Enhanced Disaster Recovery apply to all orgs within supported regions? |
Yes. Enhanced Disaster Recovery will be enabled for eligible Okta orgs in supported production cells. This should not include any development or testing orgs hosted within production cells. |
|
14 |
Which Okta Workforce Identity Cloud services are NOT supported by Enhanced Disaster Recovery? |
The services below are not considered core and are NOT supported by Enhanced Disaster Recovery.
|
|
15 |
Is there downtime when customers are onboarded to Enhanced Disaster Recovery? |
No. There is no downtime for customers during the onboarding process. For customers using IP allowlists, please add the Okta IP range to your allowlist. During a failover, traffic is routed through the DR region, which has its own unique IPs. |
|
16 |
Are regional outages common? |
No. However, these outages are hard to predict because many variables can contribute to a regional infrastructure-related outage. |
|
17 |
Can customers with Enhanced Disaster Recovery request failovers in the event of an outage? |
Yes. Customers can use Enhanced DR Self-Service to fail over and fail back their organization(s). See this article for more details. |
|
18 |
Can Enhanced Disaster Recovery customers test failovers from primary regions to disaster recovery regions? |
Yes. Customers can use Enhanced DR Self-Service to fail over and fail back their organization(s). See this article for more details. |
|
19 |
Will end users need to re-authenticate during a failover? |
Yes. End users must re-authenticate during a failover to re-establish their authentication session. They must re-authenticate again once the organization has failed back to the primary region. |
|
20 |
Will the end-user experience be affected during the failback to the primary region? |
Yes. End users must re-authenticate after failback to re-establish the user’s authentication session. Once failback is completed, full read-write access is restored. |
|
21 |
Can customers schedule their failback after a failover has occurred? |
If a customer used Self-Service to fail over, they can fail back their organization to the primary region at their discretion. Okta will not fail back a customer’s organization if Self-Service was used, since Okta may not know the reason for the customer-initiated failover. |
|
22 |
Can customers leverage rate limit multipliers or dynamic scale while running in the failover region? |
Please reach out to your account team for more information. |
|
23 |
What happens to internal data like syslog events during a Disaster Recovery event? |
Syslog data continues to be collected during a failover but remains inaccessible until the system is returned to the primary region. Reviewing syslog data is not possible while in read-only mode. |
|
24 |
What end-user or admin behaviors are affected during a Disaster Recovery event? |
Any functionality that writes to the database will NOT be available during a failover. This includes but is not limited to adding/removing users, changing user permissions, and any automation completed by Workflows. User authentication is still supported. |
|
25 |
I received an email that my AD Agent is disconnected. Will this prevent users from authenticating? |
This is expected behavior. Once failover is complete, please refer to the status provided on the Directory Integrations page of the Admin UI. Once failback to the primary region has completed, you will receive an email that the AD agent is connected again. |
|
26 |
Where can I find the terms and conditions for the Enhanced Disaster Recovery service? |
See the Okta.com/agreements page for the terms and conditions of the service. |
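
The recovery objectives in question 6 and the timing example in question 9 can be sketched as simple arithmetic. The Python snippet below is only an illustration: the `DrTier` class, the constants, and the `worst_case_read_only_outage` function are hypothetical names, not part of any Okta API. It assumes the time to service recovery is the time Okta operations takes to confirm the outage plus the tier's read-only RTO, and that the same reasoning can be applied to Standard Disaster Recovery.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class DrTier:
    """Recovery objectives for one disaster recovery tier (values from question 6)."""
    name: str
    rto_read_only: timedelta   # time to reach read-only service once failover is invoked
    rto_read_write: timedelta  # time to regain full read-write service
    rpo: timedelta             # maximum acceptable data loss, expressed as time


# Hypothetical constants that mirror the figures quoted in this FAQ.
STANDARD = DrTier("Standard", timedelta(hours=1), timedelta(hours=24), timedelta(hours=1))
ENHANCED = DrTier("Enhanced", timedelta(minutes=5), timedelta(hours=24), timedelta(hours=1))


def worst_case_read_only_outage(tier: DrTier, confirmation_time: timedelta) -> timedelta:
    """Assumed model: outage confirmation time plus the tier's read-only RTO."""
    return confirmation_time + tier.rto_read_only


# Question 9's example: Okta operations takes 10 minutes to confirm the outage.
confirmation = timedelta(minutes=10)
for tier in (STANDARD, ENHANCED):
    total = worst_case_read_only_outage(tier, confirmation)
    print(f"{tier.name}: up to {int(total.total_seconds() // 60)} minutes")
```

With a 10-minute confirmation time, the Enhanced tier gives the 15-minute total quoted in question 9. The Standard figure follows only from the assumption stated above and is not a number given in this FAQ.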
