Last checked: 6 minutes ago
Get notified about any outages, downtime or incidents for Trello and 1800+ other cloud vendors. Monitor 10 companies, for free.
Outage and incident data over the last 30 days for Trello.
OutLogger tracks the status of these components for Trello:
Component | Status |
---|---|
API | Active |
Atlassian Support Knowledge Base | Active |
Atlassian Support - Support Portal | Active |
Atlassian Support Ticketing | Active |
Trello.com | Active |
View the latest incidents for Trello and check for official updates:
Description:

### **SUMMARY**

On August 30, 2023, between 4:07 and 5:30 UTC, some customers were unable to log in to Atlassian's Cloud products using [id.atlassian.com](http://id.atlassian.com). Logged-in users were also unable to switch accounts, change passwords, or log out. Users with existing sessions were not impacted. Between 5:32 and 6:00 UTC, traffic was incrementally restored to a previous build, mitigating the impact for users. The total time to resolution was one hour and 53 minutes.

### **IMPACT**

Users were not able to log in using Atlassian's shared account management system ([id.atlassian.com](http://id.atlassian.com)). This affected users who were trying to log in to the following products: Jira, Confluence, Trello, Opsgenie, mobile apps, and ecosystem apps. Aside from the inability to log in, there was no impact on other Atlassian products or features.

### **ROOT CAUSE**

Multiple Set-Cookie headers were unintentionally modified so that only the last Set-Cookie header remained in the response to users' browsers. The issue was caused by a change to Network Extensions within the Edge Network. As a result, users who needed a new session could not log in. Upon login, they were redirected to log in again and no session was created for them.

### **REMEDIAL ACTIONS PLAN & NEXT STEPS**

We know that outages impact your productivity. While we have a number of testing and preventative processes in place, this specific issue was not detected in Atlassian's staging environment. End-to-end tests did not cover the use case of multiple Set-Cookie headers in a single response, and therefore this bug went unnoticed.

We are prioritizing the following improvement actions to avoid repeating this type of incident:

* Automated tests to be put in place to validate that cookies are not being removed from responses.
* Configuration of network extensions will be guaranteed to be identical in staging and production to ensure errors are picked up earlier.

Furthermore, we typically deploy our changes progressively by cloud region to avoid broad impact, but in this case, the change was not deemed risky and was deployed to all regions. To minimize the impact of breaking changes to our environments, we will implement additional preventative measures:

* Changes to network extensions in the future will use progressive rollouts.
* With staging being properly utilized, errors similar to this one will not be deployed to any production environments.

We apologize to customers whose services were impacted during this incident; we are taking immediate steps to improve the platform’s performance and availability.

Thanks,
Atlassian Customer Support
Status: Postmortem
Impact: None | Started At: Aug. 30, 2023, 5:19 a.m.
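The root cause described in this postmortem (several Set-Cookie headers reduced to only the last one) can be illustrated with a small sketch. This is not Atlassian's code: the function names, header values, and the idea of storing headers in a dictionary are all assumptions made for illustration. It shows how a transform that keeps one value per header name drops cookies, and what an automated check for "cookies are not being removed from responses" might look like.

```python
from typing import Dict, List, Tuple

Header = Tuple[str, str]


def collapse_headers(headers: List[Header]) -> List[Header]:
    """Hypothetical buggy transform: one slot per header name, so the last value wins."""
    merged: Dict[str, Header] = {}
    for name, value in headers:
        merged[name.lower()] = (name, value)  # overwrites any earlier Set-Cookie
    return list(merged.values())


def passthrough_headers(headers: List[Header]) -> List[Header]:
    """Hypothetical safe transform: keeps every header, including repeated names."""
    return list(headers)


def set_cookie_values(headers: List[Header]) -> List[str]:
    """Return all Set-Cookie values in the order they appear."""
    return [value for name, value in headers if name.lower() == "set-cookie"]


def cookies_preserved(before: List[Header], after: List[Header]) -> bool:
    """Check that a transform did not remove or reorder any Set-Cookie header."""
    return set_cookie_values(before) == set_cookie_values(after)


# Illustrative response with two cookies (values are made up).
origin_response: List[Header] = [
    ("Content-Type", "text/html"),
    ("Set-Cookie", "session=abc123; Path=/; HttpOnly"),
    ("Set-Cookie", "csrf=xyz789; Path=/"),
]

print(cookies_preserved(origin_response, collapse_headers(origin_response)))
print(cookies_preserved(origin_response, passthrough_headers(origin_response)))
```

In this sketch the collapsing transform loses the session cookie, which matches the reported symptom of users being redirected to log in again without a session being created. A check like `cookies_preserved` could run in an end-to-end test against staging, under the assumption that the test can observe the response both before and after the edge layer.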
Description:

### **SUMMARY**

On Aug 4, 2023, Trello users encountered issues accessing their workspaces. This was caused by a processing error during user deletion events involving two users who shared a workspace. The error resulted in unintended workspaces being marked as deleted. The issue was identified, the deletion process was halted, and data restoration was initiated. The solution involved marking workspaces as undeleted and implementing a code fix to prevent similar issues in the future.

### **IMPACT**

The impact occurred on August 4, 2023, from the afternoon to the early evening (UTC). All Trello workspaces created before July 2021 were inaccessible during the incident, which amounted to 39% of active workspaces.

### **ROOT CAUSE**

The event was triggered by a race condition that occurred during the response to user deletion events. When the last user in a workspace is deleted, the system automatically marks the workspace as deleted. In this case, two users sharing a workspace were deleted simultaneously, causing a race condition. The race condition triggered a code path that generated a query that was not targeted to an individual workspace, but instead marked all workspaces (including unrelated ones) as deleted in our database in a systematic way.

### **REMEDIAL ACTIONS PLAN & NEXT STEPS**

We know that outages affect your productivity, and we are committed to preventing incidents like these from occurring. We have already implemented code changes to prevent the specific condition that caused the incident.

We are prioritizing the following improvement actions to avoid repeating this type of incident:

* Implement a monitoring system for the following metrics in order to improve anomaly detection: CPU usage, inbound and outbound network traffic, memory usage, and disk usage.
* Add anomaly detection to monitor the number of soft deletes and set up alerting for it.

We apologize to customers whose services were impacted during this incident; we are taking immediate steps to improve the platform’s performance and availability.

Thanks,
Atlassian Customer Support
Status: Postmortem
Impact: Major | Started At: Aug. 4, 2023, 4:03 p.m.
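The failure mode in this postmortem (a soft-delete query that was not scoped to a single workspace) and the planned soft-delete alerting can be sketched as follows. This is a minimal illustration under stated assumptions, not Trello's implementation: the `workspaces` table, its columns, the function names, and the alert threshold are all hypothetical, and SQLite stands in for the real database.

```python
import sqlite3
from typing import Optional


def soft_delete_workspace(conn: sqlite3.Connection, workspace_id: Optional[int]) -> int:
    """Mark one workspace as deleted; refuse to run without a concrete id."""
    if workspace_id is None:
        # Guard (assumption): a missing id must never become an unscoped UPDATE.
        raise ValueError("refusing to soft-delete without a workspace id")
    cur = conn.execute(
        "UPDATE workspaces SET deleted = 1 WHERE id = ? AND deleted = 0",
        (workspace_id,),
    )
    if cur.rowcount > 1:
        # Defensive check: a single-workspace delete should touch at most one row.
        conn.rollback()
        raise RuntimeError("soft delete touched more than one workspace")
    conn.commit()
    return cur.rowcount


def soft_delete_spike(deletes_in_window: int, baseline: float, factor: float = 10.0) -> bool:
    """Hypothetical alert rule: flag a window whose deletes far exceed the baseline."""
    return deletes_in_window > baseline * factor


# Illustrative in-memory database with three workspaces.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE workspaces (id INTEGER PRIMARY KEY, deleted INTEGER NOT NULL DEFAULT 0)"
)
conn.executemany("INSERT INTO workspaces (id) VALUES (?)", [(1,), (2,), (3,)])
conn.commit()

first = soft_delete_workspace(conn, 2)   # first deletion event marks the workspace
second = soft_delete_workspace(conn, 2)  # a concurrent duplicate event changes nothing
print(first, second)
```

Because the `WHERE` clause includes `deleted = 0`, a second deletion event for the same workspace is a no-op, so two simultaneous user deletions cannot escalate into a wider update in this sketch. The `soft_delete_spike` rule is deliberately simple; a real anomaly detector would likely use a rolling baseline rather than a fixed factor.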
Description: We mitigated the issue with Sign-ups, Product Activation, and Billing. The systems are back to business as usual (BAU), and all functionality is restored.
Status: Resolved
Impact: None | Started At: Aug. 2, 2023, 1:41 p.m.