Get notified about any outages, downtime or incidents for Trello and 1800+ other cloud vendors. Monitor 10 companies, for free.
Outage and incident data over the last 30 days for Trello.
OutLogger tracks the status of these components for Trello:
Component | Status |
---|---|
API | Active |
Atlassian Support Knowledge Base | Active |
Atlassian Support - Support Portal | Active |
Atlassian Support Ticketing | Active |
Trello.com | Active |
View the latest incidents for Trello and check for official updates:
Description:

### **Summary**

On February 14, 2024, between 20:05 UTC and 23:03 UTC, Atlassian customers on the following cloud products encountered a service disruption: Access, Atlas, Atlassian Analytics, Bitbucket, Compass, Confluence, Ecosystem apps, Jira Service Management, Jira Software, Jira Work Management, Jira Product Discovery, Opsgenie, StatusPage, and Trello. As part of a security and compliance uplift, we had scheduled the deletion of unused and legacy domain names used for internal service-to-service connections. Active domain names were incorrectly deleted during this event. This impacted all cloud customers across all regions. The issue was identified and resolved through the rollback of the faulty deployment to restore the domain names and Atlassian systems to a stable state. The time to resolution was two hours and 58 minutes.

### **IMPACT**

External customers started reporting issues with Atlassian cloud products at 20:52 UTC. The impact of the failed change led to performance degradation or, in some cases, complete service disruption. Symptoms experienced by end-users were unsuccessful page loads and/or failed interactions with our cloud products.

### **ROOT CAUSE**

As part of a security and compliance uplift, we had scheduled the deletion of unused and legacy domain names that were being used for internal service-to-service connections. Active domain names were incorrectly deleted during this operation.

### **REMEDIAL ACTIONS PLAN & NEXT STEPS**

We know that outages impact your productivity. The detection was delayed because existing testing & monitoring focused on service health rather than the entire system’s availability. To prevent a recurrence of this type of incident, we are implementing the following improvement measures:

* Canary checks to monitor the entire system availability.
* Faster rollback procedures for this type of service impact.
* Stricter change control procedures for infrastructure modifications.
* Migration of all DNS records to centralised management and stricter access controls on modification to DNS records.

We apologize to customers whose services were impacted during this incident; we are taking immediate steps to improve the platform’s performance and availability.

Thanks,
Atlassian Customer Support
Status: Postmortem
Impact: None | Started At: Feb. 14, 2024, 9:57 p.m.
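The February 14 postmortem attributes the delayed detection to monitoring that looked at service health rather than whole-system availability, and lists canary checks as a remediation. The sketch below illustrates the general idea only; it is not Atlassian's implementation. The names `ProbeResult`, `http_probe`, `run_canary`, and `system_available` are hypothetical, and each probe is assumed to exercise one end-to-end user path, so a deleted internal domain name would surface as a failed probe even when individual services report healthy.

```python
import urllib.request
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class ProbeResult:
    name: str
    ok: bool
    detail: str


def http_probe(url: str, timeout: float = 5.0) -> Callable[[], bool]:
    """Build a probe that succeeds only if the URL answers with a 2xx status.

    DNS failures, timeouts, and HTTP error statuses raise exceptions in
    urllib, which run_canary records as a failed probe.
    """
    def probe() -> bool:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    return probe


def run_canary(probes: Dict[str, Callable[[], bool]]) -> List[ProbeResult]:
    """Run every probe and record the outcome; one failure never stops the rest."""
    results = []
    for name, probe in probes.items():
        try:
            ok = bool(probe())
            detail = "ok" if ok else "probe returned a failure"
        except Exception as exc:  # any error counts as "unavailable"
            ok, detail = False, f"{type(exc).__name__}: {exc}"
        results.append(ProbeResult(name, ok, detail))
    return results


def system_available(results: List[ProbeResult]) -> bool:
    """The system counts as available only if every end-to-end probe passed."""
    return all(r.ok for r in results)
```

In use, a scheduler would call `run_canary` with one `http_probe` per user-facing path (the URLs are for the operator to supply) and alert whenever `system_available` returns `False`.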
Description:

### Summary

On Feb. 13, 2024, between 8:00 AM and 11:34 AM UTC, Trello experienced severely degraded performance appearing as a full or partial outage to Atlassian customers. The event was triggered by a buildup of long-running queries against our database, leading to slowed API response times and causing Trello to be degraded or unavailable for users. The root cause of the incident was identified as a compression change in our database deployed approximately 11 hours earlier during a low-traffic period. As European customers came online, traffic started increasing, resulting in a buildup of queries and the subsequent incident. The incident was detected by our monitoring system at 8:07 AM UTC and was mitigated by reverting the compression change and restarting components of our database system. The total time to resolution was 3 hours and 34 minutes.

### **IMPACT**

The overall impact was between 8:00 AM and 11:34 AM UTC on Feb. 13, 2024. The incident caused Trello to be fully or partially unavailable for customers using or attempting to access the site during this period.

### **ROOT CAUSE**

The issue was caused by a compression change in our database, which resulted in the buildup of queries in the system. This buildup then caused API response times to increase to critical levels. During the incident many users received HTTP 429 errors as the system began rate-limiting in an attempt to recover. Users that did not receive errors experienced API response times 10-100x slower than our standard response times.

### **REMEDIAL ACTIONS PLAN & NEXT STEPS**

We know that outages impact your productivity. We are prioritizing the following actions to avoid repeating this incident and reduce time to resolution:

* Improve our process for releasing incremental configuration changes which would have allowed the team to identify the root cause before a peak load period and prevent similar incidents.
* Adjust priority level of alerts related to this class of incident to improve signal to noise ratio and drive faster time to resolution.

We apologize to customers whose services were impacted during this incident; we are taking immediate steps to improve Trello’s performance and availability.

Thanks,
Atlassian Customer Support
Status: Postmortem
Impact: Critical | Started At: Feb. 13, 2024, 8:57 a.m.
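The February 13 postmortem notes that many users received HTTP 429 errors while the system rate-limited itself to recover. For API integrations, one common client-side response to a 429 is to retry with exponential backoff. The sketch below is a minimal illustration under stated assumptions: `call_with_backoff` and its parameters are hypothetical names, `send` stands in for whatever function performs the request, and the optional `Retry-After` value is treated as something the server may or may not provide (the postmortem does not say whether Trello sends one).

```python
import random
import time
from typing import Callable, Optional, Tuple

# (HTTP status code, Retry-After in seconds if the server supplied one)
Response = Tuple[int, Optional[float]]


def call_with_backoff(
    send: Callable[[], Response],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
    jitter: Callable[[], float] = random.random,
) -> int:
    """Call send() until it stops returning 429 or attempts run out.

    Returns the final status code. The wait before each retry is the
    server's Retry-After value when present; otherwise it is an
    exponentially growing delay scaled by jitter to between 50% and 100%,
    so that many clients do not retry at the same instant.
    """
    for attempt in range(max_attempts):
        status, retry_after = send()
        if status != 429:
            return status
        if attempt == max_attempts - 1:
            break  # no point sleeping after the last attempt
        backoff = min(max_delay, base_delay * 2 ** attempt)
        if retry_after is not None:
            delay = retry_after
        else:
            delay = backoff * (0.5 + 0.5 * jitter())
        sleep(min(delay, max_delay))
    return 429
```

The `sleep` and `jitter` parameters are injectable so the retry schedule can be checked without real waiting; production callers would leave them at their defaults.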
Description: This incident has been resolved.
Status: Resolved
Impact: Minor | Started At: Dec. 6, 2023, 8:45 p.m.
Description: This incident has been resolved.
Status: Resolved
Impact: Minor | Started At: Dec. 6, 2023, 8:45 p.m.
Description: The systems have remained stable since the fix was applied and were monitored for a specified duration.
Status: Resolved
Impact: None | Started At: Dec. 4, 2023, 4:33 p.m.