Get notified about any outages, downtime or incidents for Atlassian Bitbucket and 1800+ other cloud vendors. Monitor 10 companies, for free.
Outage and incident data over the last 30 days for Atlassian Bitbucket.
Join OutLogger to be notified when any of your vendors or the components you use experience an outage. It's completely free and takes less than 2 minutes!
OutLogger tracks the status of these components for Atlassian Bitbucket:
| Component | Status |
|---|---|
| API | Active |
| Authentication and user management | Active |
| Email delivery | Active |
| Git LFS | Active |
| Git via HTTPS | Active |
| Git via SSH | Active |
| Pipelines | Active |
| Purchasing & Licensing | Active |
| Signup | Active |
| Source downloads | Active |
| Webhooks | Active |
| Website | Active |
View the latest incidents for Atlassian Bitbucket and check for official updates:
Description:

### **Summary**

On February 14, 2024, between 20:05 UTC and 23:03 UTC, Atlassian customers on the following cloud products encountered a service disruption: Access, Atlas, Atlassian Analytics, Bitbucket, Compass, Confluence, Ecosystem apps, Jira Service Management, Jira Software, Jira Work Management, Jira Product Discovery, Opsgenie, StatusPage, and Trello.

As part of a security and compliance uplift, we had scheduled the deletion of unused and legacy domain names used for internal service-to-service connections. Active domain names were incorrectly deleted during this event. This impacted all cloud customers across all regions. The issue was identified and resolved through the rollback of the faulty deployment to restore the domain names and Atlassian systems to a stable state. The time to resolution was two hours and 58 minutes.

### **IMPACT**

External customers started reporting issues with Atlassian cloud products at 20:52 UTC. The impact of the failed change led to performance degradation or, in some cases, complete service disruption. Symptoms experienced by end users were unsuccessful page loads and/or failed interactions with our cloud products.

### **ROOT CAUSE**

As part of a security and compliance uplift, we had scheduled the deletion of unused and legacy domain names that were being used for internal service-to-service connections. Active domain names were incorrectly deleted during this operation.

### **REMEDIAL ACTIONS PLAN & NEXT STEPS**

We know that outages impact your productivity. Detection was delayed because existing testing and monitoring focused on individual service health rather than the entire system's availability. To prevent a recurrence of this type of incident, we are implementing the following improvement measures:

* Canary checks to monitor the entire system's availability.
* Faster rollback procedures for this type of service impact.
* Stricter change control procedures for infrastructure modifications.
* Migration of all DNS records to centralised management, with stricter access controls on modification of DNS records.

We apologize to customers whose services were impacted during this incident; we are taking immediate steps to improve the platform's performance and availability.

Thanks, Atlassian Customer Support
Status: Postmortem
Impact: None | Started At: Feb. 14, 2024, 9:57 p.m.
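The remediation plan in this postmortem mentions canary checks that watch end-to-end availability rather than individual service health, with the root cause being deleted DNS names for internal service-to-service connections. As a rough illustration only, the sketch below shows what such a canary could look like: it resolves each internal domain name first (the failure mode in this incident) and then exercises a health endpoint. All hostnames, URLs, and function names here are hypothetical; this is not Atlassian's tooling.

```python
"""Minimal sketch of an end-to-end canary check, loosely modelled on the
remediation described above. Hostnames, endpoints, and thresholds are
invented examples, not Atlassian's actual configuration."""
import socket
import urllib.error
import urllib.request

# Hypothetical internal service-to-service domain names a canary might watch.
CANARY_TARGETS = [
    ("internal-auth.example.internal", "https://internal-auth.example.internal/health"),
    ("internal-api.example.internal", "https://internal-api.example.internal/health"),
]


def check_target(hostname: str, health_url: str, timeout: float = 5.0) -> list[str]:
    """Return a list of failure descriptions for one target (empty means healthy)."""
    failures = []
    # 1. DNS check: a deleted domain name fails here before any HTTP request is made.
    try:
        socket.getaddrinfo(hostname, 443)
    except socket.gaierror as exc:
        failures.append(f"DNS resolution failed for {hostname}: {exc}")
        return failures  # No point probing HTTP if the name does not resolve.
    # 2. End-to-end check: exercise the whole request path, not just one service's health.
    try:
        with urllib.request.urlopen(health_url, timeout=timeout) as resp:
            if resp.status != 200:
                failures.append(f"{health_url} returned HTTP {resp.status}")
    except (urllib.error.URLError, OSError) as exc:
        failures.append(f"Request to {health_url} failed: {exc}")
    return failures


def run_canary() -> int:
    """Probe every target; a non-zero exit code would page the on-call rotation."""
    all_failures = []
    for hostname, health_url in CANARY_TARGETS:
        all_failures.extend(check_target(hostname, health_url))
    for failure in all_failures:
        print(f"CANARY FAILURE: {failure}")
    return 1 if all_failures else 0


if __name__ == "__main__":
    raise SystemExit(run_canary())
```

Run on a schedule, a probe like this fails on the DNS step the moment an active domain name is deleted, rather than waiting for individual service health checks (which may keep passing) to reflect the outage.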
Description: Between 15:40 UTC and 15:57 UTC, customers experienced intermittent failures when searching for users in Atlassian cloud services: Confluence, Jira Work Management, Jira Service Management, Jira Software, Atlassian Bitbucket, Jira Product Discovery, and Compass. The issue has been resolved and the service is operating normally.
Status: Resolved
Impact: None | Started At: Feb. 7, 2024, 4:40 p.m.
Description: We identified a problem with Forge hosted storage API calls, which resulted in a drop in invocation success rates in the developer console. The impact of this incident has been mitigated, and our monitoring tools confirm that the success rate is back to pre-incident levels. According to our logs, 16 apps were impacted; these apps saw a reduced success rate for storage.get API calls, as listed in https://developer.atlassian.com/platform/forge/runtime-reference/storage-api-basic.

As part of Forge's preparation to support Data Residency, Forge hosted storage has been undergoing a platform and data migration for storing app data. As part of this migration, we run comparison checks for data consistency between the old and new platforms. An earlier incident, https://developer.status.atlassian.com/incidents/9q71ytpjhbtl, had put the data on the new platform out of sync, so comparisons of data from the old and new platforms started failing, and the migration logic retries on failure to test for consistency issues. This retry behaviour increased request latency, which led to 16 apps receiving an increased number of 504 timeout errors. The team identified the synchronous comparison checking as a bug; it should have been asynchronous.

Once the root cause was identified, we moved our backing platform rollout back to a previous stage. The rollout is split into several stages. The issues occurred in our blocking stage, where we make calls to both the old and new platforms and wait for both to complete so we can test for any performance issues in the new platform before using it as our source of truth. It was in this blocking stage that the bug caused us to wait on comparisons that should have been asynchronous. To recover, we reverted to our shadow mode stage. In this stage, all operations to the new platform are asynchronous, including the comparisons that were blocking in the other stage and resulted in timeout issues and 504 errors being sent to apps. This is the state Forge hosted storage had been in for several months without any problems.

Timeline of the impact:
- 2024-02-05, 06:42 PM UTC: impact started as comparisons began running on out-of-sync data in blocking mode.
- 2024-02-05, 08:57 PM UTC: API impact was detected by our monitoring systems.
- 2024-02-05, 11:34 PM UTC: the rollout to the new platform was reverted to a known stable state and the impact ended.

We will release a public incident review (PIR) here in the upcoming weeks for this incident and the earlier one, https://developer.status.atlassian.com/incidents/9q71ytpjhbtl. We will detail all that we can about what caused the issue and what we are doing to prevent it from happening again. We apologise for any inconvenience this may have caused our customers and the developer community, and we are committed to preventing further issues with our hosted storage capability.
Status: Resolved
Impact: None | Started At: Feb. 6, 2024, 2:40 a.m.
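As a loose illustration of the rollout stages this incident describes, the sketch below contrasts a "blocking" read path, where the caller waits on the new-platform read and the consistency comparison, with a "shadow" read path, where those run in the background. The class and function names, and the simulated latency, are invented for the example and do not reflect Forge's actual implementation.

```python
"""Minimal sketch of the 'blocking' vs 'shadow' comparison stages described
above. The platform classes and timings are hypothetical; this is not
Forge's actual migration code."""
import concurrent.futures
import time


class OldPlatform:
    """Stands in for the current source of truth."""
    def get(self, key: str) -> str:
        return f"value-for-{key}"


class NewPlatform:
    """Stands in for the platform being migrated to; comparisons read from it."""
    def get(self, key: str) -> str:
        time.sleep(0.2)  # Simulate retries on out-of-sync data inflating latency.
        return f"value-for-{key}"


_executor = concurrent.futures.ThreadPoolExecutor(max_workers=4)


def compare(old_value: str, new_value: str, key: str) -> None:
    """Consistency check between the two platforms."""
    if old_value != new_value:
        print(f"consistency mismatch for {key!r}")


def get_blocking_stage(old: OldPlatform, new: NewPlatform, key: str) -> str:
    """Blocking stage: the caller waits for the new-platform read and the
    comparison, so any slowness there surfaces to apps as timeouts (504s)."""
    old_value = old.get(key)
    new_value = new.get(key)  # Caller pays for this latency.
    compare(old_value, new_value, key)
    return old_value


def get_shadow_stage(old: OldPlatform, new: NewPlatform, key: str) -> str:
    """Shadow stage: the new-platform read and comparison run in the background,
    so the caller only ever waits on the old platform."""
    old_value = old.get(key)
    _executor.submit(lambda: compare(old_value, new.get(key), key))
    return old_value


if __name__ == "__main__":
    old, new = OldPlatform(), NewPlatform()
    for getter in (get_blocking_stage, get_shadow_stage):
        start = time.perf_counter()
        getter(old, new, "app-123/config")
        print(f"{getter.__name__}: {time.perf_counter() - start:.3f}s")
    _executor.shutdown(wait=True)
```

In the blocking variant, any retry-driven slowness in the new platform adds directly to caller latency, which matches the 504 behaviour described above; in shadow mode the same slowness only shows up in background comparison results, which is why reverting to that stage ended the impact.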