Get notified about any outages, downtime or incidents for Atlassian Analytics and 1800+ other cloud vendors. Monitor 10 companies, for free.
Outage and incident data over the last 30 days for Atlassian Analytics.
Join OutLogger to be notified when any of your vendors or the components you use experience an outage. It's completely free and takes less than 2 minutes!
OutLogger tracks the status of these components for Atlassian Analytics:
| Component | Status |
|---|---|
| Atlassian Data Lake | Active |
| Dashboards | Active |
| Third party data connections | Active |
View the latest incidents for Atlassian Analytics and check for official updates:
Description: On July 3, 2024, between 20:08 and 20:31 UTC, we experienced downtime for Atlassian Analytics. The issue has been resolved and the service is operating normally.
Status: Resolved
Impact: None | Started At: July 3, 2024, 8:51 p.m.
Description:

### Summary

On June 3rd, between 09:43 pm and 10:58 pm UTC, Atlassian customers using multiple products were unable to access their services. The event was triggered by a change to the infrastructure API Gateway, which is responsible for routing traffic to the correct application backends. The incident was detected by the automated monitoring system within five minutes and mitigated by correcting a faulty release feature flag, which returned Atlassian systems to a known good state. The first communications were published on the Statuspage at 11:11 pm UTC. The total time to resolution was about 75 minutes.

### **IMPACT**

From 09:43 pm to 10:17 pm UTC the system was in a degraded state, followed by a total outage between 10:17 pm and 10:58 pm UTC. _The incident caused service disruption to customers in all regions and affected the following products:_

* Jira Software
* Jira Service Management
* Jira Work Management
* Jira Product Discovery
* Jira Align
* Confluence
* Trello
* Bitbucket
* Opsgenie
* Compass

### **ROOT CAUSE**

A policy used in the infrastructure API Gateway was being updated in production via a feature flag. The combination of an erroneous value entered in the feature flag and a bug in the code resulted in the API Gateway not processing any traffic. This created a total outage, where all users started receiving 5XX errors for most Atlassian products. Once the problem was identified and the feature flag was updated to the correct values, all services began recovering immediately.

### **REMEDIAL ACTIONS PLAN & NEXT STEPS**

We know that outages impact your productivity. While we have several testing and preventative processes in place, this specific issue wasn't identified because the change did not go through our regular release process and was instead incorrectly applied through a feature flag. We are prioritizing the following improvement actions to avoid repeating this type of incident (a hypothetical sketch of the safe-default idea follows this incident entry):

* Prevent high-risk feature flags from being used in production
* Improve testing of policy changes
* Enforce longer soak times for policy changes
* Roll out all feature flags progressively to minimize broad impact
* Review the infrastructure feature flags to ensure they all have appropriate defaults
* Improve our processes and internal tooling to provide faster communications to our customers

We apologize to customers whose services were affected by this incident and are taking immediate steps to address the above gaps.

Thanks,
Atlassian Customer Support
Status: Postmortem
Impact: None | Started At: June 3, 2024, 11:11 p.m.
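The remediation items above center on validating feature-flag values and keeping safe defaults so that a bad flag value cannot take the gateway down. Below is a minimal sketch of that idea, not Atlassian's actual code: the flag format and the names `KNOWN_GOOD_POLICY`, `parse_policy_flag`, and `resolve_gateway_policy` are all hypothetical, invented for illustration.

```python
import json

# Known-good fallback used when a flag value fails validation (hypothetical values).
KNOWN_GOOD_POLICY = {"route_mode": "path_prefix", "timeout_ms": 3000}

def parse_policy_flag(raw_value: str) -> dict:
    """Validate a routing-policy feature flag value; raise on anything unexpected."""
    policy = json.loads(raw_value)
    if not isinstance(policy, dict):
        raise ValueError("policy flag must be a JSON object")
    if policy.get("route_mode") not in {"path_prefix", "host_header"}:
        raise ValueError(f"unsupported route_mode: {policy.get('route_mode')!r}")
    timeout = policy.get("timeout_ms")
    if not isinstance(timeout, int) or timeout <= 0:
        raise ValueError("timeout_ms must be a positive integer")
    return policy

def resolve_gateway_policy(raw_flag_value: str) -> dict:
    """Return a validated policy, or fall back to the known-good default."""
    try:
        return parse_policy_flag(raw_flag_value)
    except ValueError:
        # An erroneous flag value should degrade to the last known-good state
        # instead of stopping traffic, which was the failure mode in this incident.
        return KNOWN_GOOD_POLICY

# Example: a malformed flag value falls back rather than breaking routing.
print(resolve_gateway_policy('{"route_mode": "bogus"}'))                        # known-good policy
print(resolve_gateway_policy('{"route_mode": "host_header", "timeout_ms": 500}'))  # validated value
```

The design choice being illustrated: an invalid flag value degrades to the last known-good policy rather than propagating into the routing layer.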
Description: The missing data issue with the team and group tables has been fixed.
Status: Resolved
Impact: Major | Started At: May 30, 2024, 8:02 p.m.
Description: This incident has been resolved.
Status: Resolved
Impact: Major | Started At: May 13, 2024, 7:21 p.m.
Description:

### Summary

On April 15, 2024, between 2:00 and 3:00 AM UTC, Atlassian customers using Atlassian Analytics in the us-east-1 region encountered inconsistencies when querying jira_issue data. Subsequent investigation revealed that a bug in our workflow during an internal data migration on the Data Lake caused a backlog of migrated data to accumulate, leading to incomplete data being presented to customers. Our Data Lake monitoring system can usually identify such issues within 30 minutes, but the high volume of migrations in a short timeframe prolonged the processing time. This led to an incident escalation. The issue was resolved by scaling up infrastructure to process the backlog of data. Subsequently, additional inconsistencies were discovered across the Account, DevOps, and Jira tables. The impacted data was reinstated from the source and reprocessed into the Data Lake. Despite our team's diligent efforts, the extensive scale of the affected data meant it took us longer to resolve the issue. We are taking remedial actions to enhance data quality checks, along with improving tooling for quicker recovery and progressive deployments to minimize widespread impact.

### **IMPACT**

Between April 15, 2024, at 02:00 UTC and April 20 at 21:00 UTC, some Atlassian Analytics customers experienced service degradation. This led to inconsistent and incomplete data for the Account, DevOps, and Jira tables in their Atlassian Analytics dashboards.

### **ROOT CAUSE**

The problem arose from a change made to enhance the internal data partitioning structure, aimed at improving performance for upcoming features. The fundamental issue stemmed from a bug in the workflows, causing data to be presented to customers before the accumulated backlog had been processed. Consequently, users of Atlassian Analytics encountered incomplete or missing data in their dashboards. Although comprehensive testing was conducted prior to deployment in the production environment, this problem arose as an exceptional scenario when dealing with significantly larger volumes of migrated data, resulting in customers seeing the migrated data before data processing had completed.

Subsequent validation revealed that further inconsistencies were caused by another bug in the compaction process of raw data, resulting in the selection of incorrect versions of records. This erroneous data was used in the aforementioned partition update process, contributing to the inconsistencies. The solution involved replicating all affected data from the source and reprocessing it within the Data Lake.

### **REMEDIAL ACTIONS PLAN & NEXT STEPS**

We understand that data inconsistencies can significantly affect your productivity. To prevent this kind of incident from happening again, we plan to focus on the following measures (a hypothetical sketch of the data-quality gate and per-region rollout appears after this incident entry):

* Enhance our existing data quality tests and expand their scope to identify these issues earlier.
* Enhance our tools to ensure quicker recovery in similar future incidents.
* Progressively deploy our changes (by cloud region) to minimize widespread impact.

We apologize to customers whose services were impacted during this incident; we are taking immediate steps to improve the platform's reliability and data accuracy.

Thanks,
Atlassian Customer Support
Status: Postmortem
Impact: Major | Started At: April 15, 2024, 5:24 p.m.
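Two of the remediation measures above, earlier data-quality checks and progressive per-region deployment, can be pictured with a short sketch. This is a hypothetical illustration only, not Atlassian's pipeline; `REGIONS`, `is_backlog_processed`, and `rollout_partition_change` are invented names, and the row-count check stands in for whatever completeness criteria a real Data Lake would use.

```python
from typing import Callable

# Illustrative region list; a real deployment would derive this from config.
REGIONS = ["us-east-1", "us-west-2", "eu-central-1"]

def is_backlog_processed(source_row_count: int, migrated_row_count: int) -> bool:
    """Data-quality gate: migrated data becomes visible to dashboards only
    once its row count has caught up with the source."""
    return migrated_row_count >= source_row_count

def rollout_partition_change(apply_change: Callable[[str], None],
                             verify_region: Callable[[str], bool]) -> None:
    """Progressive, per-region rollout: stop at the first region that fails
    verification so a bad change never reaches every region at once."""
    for region in REGIONS:
        apply_change(region)
        if not verify_region(region):
            raise RuntimeError(f"verification failed in {region}; halting rollout")

# Example: the gate blocks publication while a migration backlog remains.
print(is_backlog_processed(source_row_count=1_000_000, migrated_row_count=950_000))  # False
```

The point of the sketch is the ordering: data is published only after the backlog check passes, and changes advance to the next region only after the previous one verifies cleanly.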