Outage and incident data over the last 30 days for Guard.
Outlogger tracks the status of these components for Guard:
| Component | Status |
|---|---|
| Account Management | Active |
| API tokens | Active |
| Audit Logs | Active |
| Domain Claims | Active |
| SAML-based SSO | Active |
| Signup | Active |
| User Provisioning | Active |
| Guard Premium | Active |
| Data Classification | Active |
| Data Security Policies | Active |
| Guard Detect | Active |
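The table above is a point-in-time snapshot. For a programmatic check, Statuspage-style status pages typically expose a public JSON endpoint (e.g. `/api/v2/components.json`) listing each component and its status. The sketch below assumes a payload of that shape; the sample data and the `non_operational` helper are illustrative, not taken from any live API.

```python
import json

# Illustrative payload mimicking the Statuspage v2 components format.
# A real check would fetch this JSON from the status page instead.
SAMPLE = json.loads("""
{
  "components": [
    {"name": "SAML-based SSO", "status": "operational"},
    {"name": "User Provisioning", "status": "operational"},
    {"name": "Guard Detect", "status": "degraded_performance"}
  ]
}
""")

def non_operational(payload):
    """Return the names of components whose status is not 'operational'."""
    return [c["name"] for c in payload["components"]
            if c["status"] != "operational"]

print(non_operational(SAMPLE))  # → ['Guard Detect']
```

Filtering on `status != "operational"` surfaces every degraded state (degraded performance, partial outage, major outage) with one comparison, which is usually what an alerting script wants.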
View the latest incidents for Guard and check for official updates:
Description: ### **SUMMARY**

On December 7, 2021, between 15:54 UTC and December 8, 2021, 01:55 UTC, Atlassian Cloud services that depend on AWS in the US-EAST-1 region experienced a failure. This affected customers using Atlassian Access, Bitbucket Cloud, Compass, Confluence Cloud, the Jira family of products, and Trello. Products were unable to operate as expected, resulting in partial or complete degradation of service. The event was triggered by an AWS networking outage in US-EAST-1 that affected multiple AWS services and prevented access to AWS APIs and the AWS management console. The incident was first reported by Atlassian Access, whose monitoring detected faults accessing DynamoDB in the region. Affected Atlassian services recovered on a service-by-service basis from 2021-12-07 21:50 UTC, as the underlying AWS services began to recover. Full recovery of Atlassian Cloud services was announced at 2021-12-08 01:55 UTC.

### **IMPACT**

The overall impact occurred between December 7, 2021, 15:54 UTC and December 8, 2021, 01:55 UTC. The incident caused partial to complete disruption of Atlassian Cloud services in the US-EAST-1 region. Product-specific impacts are listed below.

The primary impact for customers of Jira Software, Jira Service Management, and Jira Work Management hosted in US-EAST-1 was an inability to scale up, which caused slow response times for web requests and delays in background job processing, including webhooks in the AP region. Customers accessing Jira saw significant latency, and some experienced service unavailability while the incident was ongoing. Jira Align experienced an email outage for US customers because the AWS outage affected many AWS services, including Simple Email Service; a small percentage of Jira Align emails were not sent. Bitbucket Pipelines was unavailable and steps failed to execute.

For Jira Automation, tenants' rule executions were delayed because CloudWatch was affected. Confluence experienced minor impact from upstream services affecting user management, search, notifications, and media; it was also impacted by elevated error rates related to the inability to scale up, and GraphQL had higher latencies. Trello's email-to-board and dashcards features saw degraded performance.

Atlassian Access reported that product transfers from one organization failed intermittently. Admins could not update features such as IP Allowlist, Audit Logs, Data Residency, Custom Domain Email Notification, and Mobile Application Management, although users could still access and view these features. Emails to admins were delayed during the incident, and there was a degraded experience when creating and deleting API tokens.

Statuspage was largely unaffected; however, notification workers could not scale up, so communications to customers were delayed, though they could be replayed later. The incident also affected users trying to sign in to manage portals and private pages. Compass experienced a minor impact on its ability to write to its primary database store; no core features were affected. Atlassian's customers could have seen stale data in production in US-EAST-1 for ~30s, against an expected 5s at p99, because of delayed token resolution. Provisioning of new cloud tenants was also impacted until the services recovered.

### **ROOT CAUSE**

The issue was caused by a problem with several network devices within AWS's internal network. These devices received more traffic than they could process, leading to elevated latency and packet loss. This affected multiple AWS services on which Atlassian's platform relies, causing the service degradation and disruption described above.

For more information on the root cause, see [Summary of the AWS Service Event in the Northern Virginia (US-EAST-1) Region](https://aws.amazon.com/message/12721). No relevant Atlassian-driven events in the lead-up were identified as causing or contributing to this incident.

### **REMEDIAL ACTIONS PLAN & NEXT STEPS**

We know that outages impact your productivity. We are taking immediate steps to improve the Atlassian platform's resiliency and availability to reduce the impact of such events in the future. While Atlassian's Cloud services run in several regions (US EAST and WEST, AP, EU CENTRAL and WEST, among others) and data is replicated across regions to increase resilience against outages of this magnitude, we have identified and are taking actions that include improvements to our region failover process. These will minimize the impact of future outages on Atlassian's Cloud services and provide better support for our customers. We are prioritizing the following actions to avoid repeating this type of incident:

* Enhance and strengthen our cross-region resiliency and disaster recovery plans, including continuing to practice region failover in production and investigating and implementing better resilience strategies for services (Active/Active or Active/Passive).
* Improve and adopt multi-region architecture for the services that do require it.
* Exercise wargaming scenarios that simulate this outage to assess the customer view of the incident, allowing us to create further action items to improve our region failover process.

We apologize to customers whose services were impacted during this incident.

Thanks,
Atlassian Customer Support
Status: Postmortem
Impact: None | Started At: Dec. 7, 2021, 5:39 p.m.
Description: Between 22:00 UTC and 00:00 UTC, we experienced delays in processing some activations and provisioning of new instances for Confluence, Jira Work Management, Jira Service Management, Jira Software, and Atlassian Access. The issue has been resolved and the service is operating normally. The small number of delayed requests will be processed shortly.
Status: Resolved
Impact: Major | Started At: Sept. 7, 2021, 10:59 p.m.
Description: Between June 24, 2021, 13:09 UTC and June 29, 2021, 18:16 UTC, we experienced issues with relay states configured in Atlassian Access. The issue has been resolved and the service is operating normally.
Status: Resolved
Impact: Minor | Started At: June 29, 2021, 4:38 p.m.