Outage and incident data over the last 30 days for Confluence.
OutLogger tracks the status of these components for Confluence:
| Component | Status |
|---|---|
| Administration | Active |
| Authentication and User Management | Active |
| Cloud to Cloud Migrations - Copy Product Data | Active |
| Comments | Active |
| Confluence Automations | Active |
| Create and Edit | Active |
| Marketplace Apps | Active |
| Notifications | Active |
| Purchasing & Licensing | Active |
| Search | Active |
| Server to Cloud Migrations - Copy Product Data | Active |
| Signup | Active |
| View Content | Active |
| Mobile | Active |
| Android App | Active |
| iOS App | Active |
View the latest incidents for Confluence and check for official updates:
Description:

### Summary

On February 28, 2024, between 12:17 UTC and 15:23 UTC, Jira and Confluence apps built on the Connect platform were unable to perform any actions on behalf of users. Some apps may have retried these actions and later succeeded, while others failed the request. The incident was detected within four minutes by automated monitoring of service reliability and mitigated by manually scaling the service, which returned Atlassian systems to a known good state. The total time to resolution was about three hours and six minutes.

### **Technical Summary**

On February 28, 2024, between 12:17 UTC and 15:23 UTC, Jira and Confluence apps built on the Connect platform were unable to perform token exchanges for user impersonation requests initiated by the app. The event was triggered by the failure of the oauth-2-authorization-server service to scale as load increased. The unavailability of this service, combined with apps retrying failing requests, created a feedback loop that compounded the impact of the service not scaling. The problem affected customers in all regions. The incident was detected within four minutes by automated monitoring of service reliability and mitigated by manually scaling the service, which returned Atlassian systems to a known good state. The total time to resolution was about three hours and six minutes.

### **IMPACT**

The impact window was February 28, 2024, between 12:17 UTC and 15:23 UTC, and affected Connect apps for Jira and Confluence products that relied on the user impersonation feature. The incident caused service disruption to customers in all regions. Apps making requests on behalf of users would have seen some of those requests fail throughout the incident. Where apps had retry mechanisms in place, these requests may have eventually succeeded once the service was back in a good state. Impacted apps received HTTP 502 and 503 errors, as well as request timeouts, when calling the oauth-2-authorization-server service.

Product functionality such as automation rules in Automation for Jira is partially built on the Connect platform, and some of these rules were impacted. During the impact window, automation rules executing on behalf of a user (instead of Automation for Jira) failed to authenticate. Rules that failed were recorded in the Automation Audit Log. Additionally, manually triggered rules would have failed to trigger; these do not appear in the Automation Audit Log. Overall, this impacted approximately 2% of all rules run in the impact window. Automation for Confluence was not impacted.

### **ROOT CAUSE**

The issue was caused by an increase in traffic to the oauth-2-authorization-server service in the US-East region and the service not autoscaling in response to the increased load. As the service began to fail requests, apps retried them, which further increased the load. Adding processing resources (scaling the nodes) allowed the service to handle the increased load and restored availability.

While we have a number of testing and preventative processes in place, this specific issue wasn't identified because these load conditions had not been encountered previously. The service has operated for many years in its current configuration and had never experienced this particular failure mode, where traffic ramped faster than our ability to scale. As a result, the scaling controls were not exercised, and when required they did not proactively scale the oauth-2-authorization-server service because the CPU scaling threshold was never reached.

### **REMEDIAL ACTIONS PLAN & NEXT STEPS**

We know that outages impact your productivity. We are prioritizing the following improvement actions to avoid repeating this type of incident:

* The CPU threshold for scaling the service has been lowered significantly so that scaling begins much earlier as service load increases in each region.
* We are updating our scaling policy to step scaling in order to add capacity more rapidly when there are significant load increases.
* We have increased the minimum number of nodes for the service and will monitor service behaviour to determine the optimal minimum scaling value.
* We will further analyse when rate limiting is triggered to determine whether apps respond to rate limiting appropriately. The service's rate limiting is described in [https://developer.atlassian.com/cloud/confluence/user-impersonation-for-connect-apps/#rate-limiting](https://developer.atlassian.com/cloud/confluence/user-impersonation-for-connect-apps/#rate-limiting).
* Longer term, we will explore network-based rate limiting to prevent a misbehaving app from overloading the service.

We apologize to customers, partners, and developers whose services were impacted during this incident; we are taking immediate steps to improve the platform's performance and availability.

Thanks,
Atlassian Customer Support
Status: Postmortem
Impact: Critical | Started At: Feb. 28, 2024, 1:41 p.m.
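The root cause above describes a retry feedback loop, and the remediation plan mentions checking whether apps respond to rate limiting appropriately. As an illustration only, here is a minimal sketch of a client-side retry policy that backs off exponentially and honours `Retry-After` instead of hammering a failing token endpoint. The endpoint URL and payload are hypothetical placeholders, not Atlassian's Connect user-impersonation API, which is documented at the developer.atlassian.com link above.

```python
import random
import time

import requests

# Hypothetical token endpoint; the real Connect user-impersonation exchange
# differs in detail and is described in Atlassian's developer documentation.
TOKEN_URL = "https://oauth-server.example.com/oauth2/token"


def fetch_token_with_backoff(payload, max_attempts=5, base_delay=1.0):
    """Request a token, backing off on 429/5xx instead of retrying immediately.

    Honouring Retry-After and capping attempts avoids the kind of retry
    feedback loop described in the postmortem above.
    """
    for attempt in range(1, max_attempts + 1):
        resp = requests.post(TOKEN_URL, data=payload, timeout=10)
        if resp.status_code == 200:
            return resp.json()
        if resp.status_code in (429, 502, 503) and attempt < max_attempts:
            # Prefer the server's Retry-After hint (assumed to be seconds);
            # otherwise use jittered exponential backoff.
            retry_after = resp.headers.get("Retry-After")
            delay = float(retry_after) if retry_after else base_delay * (2 ** (attempt - 1))
            time.sleep(delay + random.uniform(0, 0.5))
            continue
        resp.raise_for_status()
    raise RuntimeError("token exchange failed after retries")
```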
Description:

### Summary

On February 21, 2024, between 2:30am and 4:15am UTC, Atlassian customers using Jira Software, Jira Service Management, Jira Work Management, and Confluence Cloud products were unable to view issues or pages. The event was triggered by a change to Atlassian's network (Edge) infrastructure in which an incorrect security credential was deployed. This impacted requests to Atlassian's Cloud originating from the Europe and South Asia regions. The incident was detected within 21 minutes by monitoring and mitigated by a failover to other Edge regions and a rollback of the failed deployment, which returned Atlassian systems to a known good state. The total time to resolution was about 1 hour and 45 minutes.

### **IMPACT**

The failed change impacted 3 out of the 14 Atlassian Cloud regions (Europe/Frankfurt, Europe/Dublin, and India/Mumbai). Between 2:30am and 4:15am UTC on February 21, 2024, end users may have experienced intermittent errors or complete service disruption across multiple Cloud products. Because traffic is directed to Atlassian Cloud using DNS latency-based records, only traffic originating from locations close to Europe and India was impacted.

### **ROOT CAUSE**

A change to our network infrastructure used faulty credentials. As a result, customer authentication requests could not be validated, and requests were returned with 500 or 503 errors. After investigation, it was found that the health checks and tests that should have prevented the faulty credentials from reaching the production environment contained a bug and never indicated a fault.

### **REMEDIAL ACTIONS PLAN & NEXT STEPS**

We know that outages impact your productivity. While we have a number of testing and preventative processes in place, this specific issue wasn't identified in our dev and staging environments because the new credentials were only valid for production. We are prioritizing the following improvement actions to avoid repeating this type of incident:

* Improving end-to-end health checks
* Faster rollback of our infrastructure deployments
* Improved monitoring

Furthermore, we deploy our changes progressively (by cloud region) to avoid broad impact, but in this case our detection and health checks did not work as expected. To minimise the impact of breaking changes to our environments, we will implement additional preventative measures such as:

* Canary and shakedown deployments with automated rollback

We apologize to customers whose services were impacted during this incident; we are taking immediate steps to improve the platform's performance and availability.

Thanks,
Atlassian Customer Support
Status: Postmortem
Impact: Major | Started At: Feb. 21, 2024, 3:41 a.m.
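The root cause above notes that the existing health checks "never indicated a fault", and the remediation calls for end-to-end health checks. The following is a minimal, hypothetical sketch (not Atlassian's tooling) of a check that exercises an authenticated request along the same path end users take and fails on server errors, rather than probing only a shallow health endpoint. The URL and token handling are placeholder assumptions.

```python
import sys

import requests

# Hypothetical endpoint; a real end-to-end check would exercise the same
# authentication path that end users hit, not just a /healthcheck route.
AUTHENTICATED_URL = "https://example.atlassian.net/wiki/rest/api/space"


def end_to_end_check(session_token: str) -> bool:
    """Return True only if an authenticated request completes without a 5xx.

    A check like this would surface a faulty credential deployment: requests
    failing with 500/503 make the check fail instead of silently passing.
    """
    try:
        resp = requests.get(
            AUTHENTICATED_URL,
            headers={"Authorization": f"Bearer {session_token}"},
            timeout=5,
        )
    except requests.RequestException:
        return False
    # Treat any server-side error as a failed check, so a bad deployment is
    # detected and can be rolled back automatically.
    return resp.status_code < 500


if __name__ == "__main__":
    token = sys.argv[1] if len(sys.argv) > 1 else "placeholder-token"
    sys.exit(0 if end_to_end_check(token) else 1)
```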
Description:

### **Summary**

On February 14, 2024, between 20:05 UTC and 23:03 UTC, Atlassian customers on the following cloud products encountered a service disruption: Access, Atlas, Atlassian Analytics, Bitbucket, Compass, Confluence, Ecosystem apps, Jira Service Management, Jira Software, Jira Work Management, Jira Product Discovery, Opsgenie, StatusPage, and Trello. As part of a security and compliance uplift, we had scheduled the deletion of unused and legacy domain names used for internal service-to-service connections. Active domain names were incorrectly deleted during this event, which impacted all cloud customers across all regions. The issue was identified and resolved by rolling back the faulty deployment, restoring the domain names and returning Atlassian systems to a stable state. The time to resolution was two hours and 58 minutes.

### **IMPACT**

External customers started reporting issues with Atlassian cloud products at 20:52 UTC. The failed change led to performance degradation or, in some cases, complete service disruption. Symptoms experienced by end users were unsuccessful page loads and/or failed interactions with our cloud products.

### **ROOT CAUSE**

As part of a security and compliance uplift, we had scheduled the deletion of unused and legacy domain names that were being used for internal service-to-service connections. Active domain names were incorrectly deleted during this operation.

### **REMEDIAL ACTIONS PLAN & NEXT STEPS**

We know that outages impact your productivity. Detection was delayed because existing testing and monitoring focused on individual service health rather than the availability of the system as a whole. To prevent a recurrence of this type of incident, we are implementing the following improvement measures:

* Canary checks that monitor the availability of the entire system.
* Faster rollback procedures for this type of service impact.
* Stricter change control procedures for infrastructure modifications.
* Migration of all DNS records to centralised management, with stricter access controls on modifications to DNS records.

We apologize to customers whose services were impacted during this incident; we are taking immediate steps to improve the platform's performance and availability.

Thanks,
Atlassian Customer Support
Status: Postmortem
Impact: None | Started At: Feb. 14, 2024, 9:57 p.m.
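The root cause above was the deletion of domain names that were still in active use during a cleanup of legacy records. As an illustration only, and not a description of Atlassian's actual tooling, the sketch below shows a pre-deletion guard that refuses to remove any candidate name that still resolves in DNS; a production version would also need to consult traffic metrics, and the candidate names here are hypothetical.

```python
import socket

# Hypothetical cleanup candidates; in the incident above, active names were
# deleted along with genuinely unused legacy ones.
CANDIDATES = ["legacy-internal.example.com", "old-service.example.com"]


def still_resolves(name: str) -> bool:
    """Return True if the name still resolves in DNS."""
    try:
        socket.getaddrinfo(name, None)
        return True
    except socket.gaierror:
        return False


def partition_candidates(names):
    """Split candidates into deletable names and names that are still live.

    A guard like this (ideally combined with traffic data) would block a bulk
    deletion of records that internal services still depend on.
    """
    deletable, still_live = [], []
    for name in names:
        (still_live if still_resolves(name) else deletable).append(name)
    return deletable, still_live


if __name__ == "__main__":
    deletable, still_live = partition_candidates(CANDIDATES)
    print("safe to delete:", deletable)
    print("blocked (still resolving):", still_live)
```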