Get notified about any outages, downtime or incidents for Atlassian Partners and 1800+ other cloud vendors. Monitor 10 companies, for free.
Outage and incident data over the last 30 days for Atlassian Partners.
OutLogger tracks the status of these components for Atlassian Partners:
Component | Status |
---|---|
Enablement Academy | Active |
Partner Business Center | Active |
Partner Directory | Active |
Partner Portal | Active |
Partner Portal dashboard | Active |
Partner Purchasing Center | Active |
Partner Support Portal | Active |
View the latest incidents for Atlassian Partners and check for official updates:
Description: Between Oct 01, 2024 - 17:49 UTC and Oct 01, 2024 - 19:42 UTC we identified a temporary outage with Atlassian Partner products. All affected products are now back online and no further impact has been observed.
Status: Resolved
Impact: Critical | Started At: Oct. 1, 2024, 5:49 p.m.
Description: We mistakenly believed there was impact to these services. The service is operating normally.
Status: Resolved
Impact: Critical | Started At: July 3, 2024, 8:51 p.m.
Description: We have identified that this behavior has existed for a long time and is due to a bug triggered in an edge scenario when the following sequence of events happens:

1. A paid app gets suspended due to lack of payment.
2. The administrator decides to uninstall the app.
3. The administrator decides to (re)install the app.
4. Payment is issued for the app.
5. The app fails to reactivate (as a result of this bug).

We are advising customers to raise a support ticket with Atlassian to restore app functionality when faced with such a scenario. The app reactivates correctly when no re-installation was attempted after suspension. We haven't found any recent changes that could cause this scenario to happen more frequently. In the future, we will work on improving the reinstallation flow for suspended apps, which will reduce the possibility of leaving apps in a state where they can't be reactivated.
Status: Resolved
Impact: None | Started At: June 12, 2024, 6:45 a.m.
Description:

### **SUMMARY**

On Sep 13, 2023, between 12:00 PM UTC and 03:30 PM UTC, some Atlassian users were unable to sign in to their accounts and use multiple Atlassian cloud products. The event was triggered by a misconfiguration of rate limits in an internal service, which caused a cascading failure in sign-in and signup-related APIs. The incident was quickly detected by multiple automated monitoring systems. The incident was mitigated on Sep 13, 2023, 03:30 PM UTC by the rollback of a feature and additional scaling of services, which put Atlassian systems into a known good state. The total time to resolution was about 3 hours and 30 minutes.

### **IMPACT**

The overall impact was between Sep 13, 2023, 12:00 PM UTC and Sep 13, 2023, 03:30 PM UTC on multiple products. The incident caused intermittent service disruption across all regions. Some users were unable to sign in for sessions. Other scenarios that temporarily failed were new user signups, profile retrieval, and password reset. During the incident, we had a peak of 90% of requests failing across authentication, user profile retrieval, and password reset use cases.

### **ROOT CAUSE**

The issue was caused by a misconfiguration of a rate limit in an internal core service. As a result, some sign-in requests over the limit received HTTP 429 errors. However, retry behavior for requests caused a multiplication of load, which led to higher service degradation. As many internal services depend on each other, the call graph complexity led to a longer time to detect the actual faulty service.

### **REMEDIAL ACTIONS PLAN & NEXT STEPS**

We are continuously improving our system's resiliency. We are prioritizing the following improvement actions to avoid repeating this type of incident:

* Audit and improve service rate limits and client retry and backoff behavior.
* Improve scale and load test automation for complex service interactions.
* Audit cross-service dependencies and minimize them where possible related to sign-in flows.

Due to the unavailability of sign-in, some customers were unable to create support tickets. We are making additional process improvements to:

* Enable our unauthenticated support contact form and notify users that it should be used when standard channels are not available.
* Create status page notifications more quickly and ensure that for severe incidents, notifications to all subscribers are enabled.

We apologize to users who were impacted during this incident; we are taking immediate steps to improve the platform’s reliability and availability.

Thanks,
Atlassian Customer Support
Status: Postmortem
Impact: None | Started At: Sept. 13, 2023, 2:08 p.m.
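The root cause above notes that client retries on HTTP 429 multiplied the load, and the remediation list mentions auditing client retry and backoff behavior. As an illustration only, here is a minimal Python sketch of one common client-side mitigation: capped exponential backoff with full jitter that also honors a server-supplied `Retry-After` value. It is not Atlassian's implementation. The `send` callable, the `(status, retry_after)` response shape, and all parameter names and defaults are assumptions made for this example.

```python
import random
import time
from typing import Callable, Optional, Tuple

# Hypothetical response shape for this sketch:
# (HTTP status code, Retry-After in seconds or None).
Response = Tuple[int, Optional[float]]


def backoff_delay(attempt: int, base: float, cap: float, rng: random.Random) -> float:
    """Full-jitter delay: uniform in [0, min(cap, base * 2**attempt)]."""
    return rng.uniform(0.0, min(cap, base * (2 ** attempt)))


def call_with_backoff(
    send: Callable[[], Response],
    max_attempts: int = 5,
    base: float = 0.5,
    cap: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
    rng: Optional[random.Random] = None,
) -> Response:
    """Call send() until it returns a non-429 status or attempts run out."""
    rng = rng or random.Random()
    response = send()
    for attempt in range(max_attempts - 1):
        status, retry_after = response
        if status != 429:
            return response
        delay = backoff_delay(attempt, base, cap, rng)
        if retry_after is not None:
            # Never retry sooner than the server asked.
            delay = max(delay, retry_after)
        sleep(delay)
        response = send()
    # Either a success, or the last 429 after exhausting all attempts.
    return response
```

Bounding the number of attempts and randomizing the wait are the two choices that matter here: the bound limits how much extra load a single failing request can generate, and the jitter keeps many clients from retrying in lockstep after a shared failure.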
Description:

### **SUMMARY**

We understand the importance of providing reliable and consistent service to our valued customers. On July 6, 2023, from 03:52 to 15:11 UTC, we experienced an issue with an upgraded version of a third-party tool that functions as our internal artifact management system. Despite our monitoring system identifying the incident within two minutes, this issue led to the degradation of the scaling capabilities of our internal hosting platform, resulting in service degradation or outages for customers of Atlassian cloud. In response to this situation, we are taking immediate measures to enhance the stability of our system and prevent similar issues from re-occurring.

### **IMPACT**

This incident affected multiple regions and products due to the diminished scaling capabilities of our internal hosting platform. In most products and offerings, customers faced reduced functionality, slower response times, and limited access to specific features.

### **ROOT CAUSE**

The root cause of the incident was the introduction of new functionality in a third-party tool that functions as our internal artifact management system. It led to an unexpected increase in the load on the primary database of the artifact system. Upon identifying and localizing the problem, we promptly adjusted the system configuration to regain stability.

### **REMEDIAL ACTIONS PLAN & NEXT STEPS**

Over the next months, we will enact a temporary freeze on non-critical upgrades of the artifact management system, and we will focus our efforts on three high-priority initiatives:

1. **Enhancing system scaling:** We prioritized work ensuring that downtime in a critical infrastructure component does not affect the scaling of other components. We expect to complete this initiative within the next two months.
2. **Reducing interdependencies:** We are working to mitigate the risk of potential cascading failures by ensuring that significant system components are able to operate independently in the case of issues. Initiatives 1 and 2 are already in progress but have been given priority to be completed as soon as possible.
3. **Strengthening testing procedures:** Alongside these initiatives, we are addressing the need for even more stringent testing procedures than we already have in place to prevent potential issues in future updates.

We are committed to collaborating closely with our technology partners to ensure the most optimal experience for our customers. We apologize for any inconvenience caused by this incident and appreciate your understanding. Our team is dedicated to continually improving our systems and processes to provide you with the exceptional service you deserve. Thank you for your continued support and trust in us.

Sincerely,
Atlassian Customer Support
Status: Postmortem
Impact: None | Started At: July 6, 2023, 11:18 a.m.