Outage and incident data over the last 30 days for UiPath.
OutLogger tracks the status of these components for UiPath:
Component | Status |
---|---|
Action Center | Active |
AI Center | Active |
Apps | Active |
Automation Cloud | Active |
Automation Hub | Active |
Automation Ops | Active |
Autopilot for Everyone | Active |
Cloud Robots - VM | Active |
Communications Mining | Active |
Computer Vision | Active |
Context Grounding | Active |
Customer Portal | Active |
Data Service | Active |
Documentation Portal | Active |
Document Understanding | Active |
Insights | Active |
Integration Service | Active |
Marketplace | Active |
Orchestrator | Active |
Process Mining | Active |
Serverless Robots | Active |
Solutions Management | Active |
Studio Web | Active |
Task Mining | Active |
Test Manager | Active |
View the latest incidents for UiPath and check for official updates:
Description:
## Customer impact
From 2024-04-03 23:45 UTC to 2024-04-04 02:35 UTC, our customers experienced errors when accessing some of the services located in the US region of Automation Cloud. Impacted products include Automation Cloud, Orchestrator, Automation Hub, Automation Ops, Document Understanding, Serverless Robots, Cloud Robots - VM, Solutions Management, and Insights.

## Root cause
UiPath makes extensive use of Azure SQL. At the beginning of the outage, Microsoft performed routine SQL maintenance in the East US region. Typically this is done without any visible impact to our customers, but this time the maintenance caused the SQL databases in the region to become unavailable. We are still waiting for a root cause from Microsoft and will update this document once we receive it.

## Detection
Automated alerts immediately detected the issue and notified UiPath on-call engineers. They confirmed the scope of the outage and updated [status.uipath.com](http://status.uipath.com/).

## Response
After a brief investigation, we determined that the problem was with Azure SQL and reached out to Microsoft Support for assistance. For the US region of all UiPath products, we place the primary database in Azure's East US region and a failover database in Azure's West US region. By default, Azure will fail over from the primary to the secondary after the primary has been unavailable for 60 minutes (see the configuration sketch after this entry). During this incident, most databases automatically failed over to the secondary region; unfortunately, the Orchestrator, Automation Hub and Insights databases did not. UiPath engineers investigated the databases and began to trigger a manual failover, but by that time Microsoft had resolved the underlying issue in the East US region.

## Follow up
* Work with Microsoft to get a root cause for the underlying Azure SQL outage.
* Determine why Orchestrator, Automation Hub and Insights did not fail over to the secondary region. Perform a failover drill to confirm the problem has been fixed.
* Investigate whether the automatic failover period can be reduced from 60 minutes.
Status: Postmortem
Impact: Major | Started At: April 4, 2024, 12:02 a.m.
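The 60-minute window discussed above corresponds to the grace period on an Azure SQL failover group. As a minimal sketch (not UiPath's actual configuration) of how that grace period could be set and how a manual failover could be triggered, the snippet below uses the `azure-mgmt-sql` Python SDK; the subscription, resource group, server and failover group names are placeholders.

```python
# Sketch: configure an Azure SQL failover group's automatic-failover grace
# period and trigger a manual failover. All resource names are placeholders.
from azure.identity import DefaultAzureCredential
from azure.mgmt.sql import SqlManagementClient
from azure.mgmt.sql.models import (
    FailoverGroup,
    FailoverGroupReadWriteEndpoint,
    PartnerInfo,
)

SUBSCRIPTION_ID = "<subscription-id>"
RESOURCE_GROUP = "rg-example"
PRIMARY_SERVER = "sql-primary-eastus"   # hypothetical primary in East US
SECONDARY_SERVER_ID = (
    "/subscriptions/<subscription-id>/resourceGroups/rg-example"
    "/providers/Microsoft.Sql/servers/sql-secondary-westus"
)

client = SqlManagementClient(DefaultAzureCredential(), SUBSCRIPTION_ID)

# Create or update the failover group with automatic failover after 60 minutes
# of primary unavailability (the default window mentioned in the postmortem).
client.failover_groups.begin_create_or_update(
    resource_group_name=RESOURCE_GROUP,
    server_name=PRIMARY_SERVER,
    failover_group_name="fg-example",
    parameters=FailoverGroup(
        partner_servers=[PartnerInfo(id=SECONDARY_SERVER_ID)],
        read_write_endpoint=FailoverGroupReadWriteEndpoint(
            failover_policy="Automatic",
            failover_with_data_loss_grace_period_minutes=60,
        ),
        databases=[],  # resource IDs of the databases to protect
    ),
).result()

# Manual (planned) failover, issued against the secondary server, as the
# engineers began to do during the incident.
client.failover_groups.begin_failover(
    resource_group_name=RESOURCE_GROUP,
    server_name="sql-secondary-westus",
    failover_group_name="fg-example",
).result()
```

Lowering `failover_with_data_loss_grace_period_minutes` is the knob the last follow-up item refers to; whether Azure accepts a value below 60 minutes should be verified against current Azure documentation.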
Description: This incident has been resolved.
Status: Resolved
Impact: Critical | Started At: March 28, 2024, 5:16 a.m.
Description: The fix has been applied on all geos and all communications were restored now.
Status: Resolved
Impact: Minor | Started At: March 25, 2024, 11:47 a.m.
Description:
# Background
UiPath Communications Mining is deployed globally across multiple regions. Each region is independent of all others, with independent deployments of databases and stateless services. Multiple distributed database solutions are deployed in each region for different purposes. Historically we used a strongly consistent, horizontally scalable document store for most ground-truth data storage, but for a variety of reasons, including operational concerns relevant to this outage, over the last year we have been migrating away from this store to a distributed SQL database. Today, however, much of our data (~1B rows, ~5 TiB) is still stored in this legacy document store.

# Customer Impact
* Performance degradation and elevated error rates (HTTP 500) for tenants in the EU region, starting on Saturday, Mar 16 at 10:02 UTC and continuing through Mar 18.
* From Monday, Mar 18, 11:37 UTC, analytics and the UI were fully back up, but training, ingestion and streams continued to experience issues.
* All functionality was fully restored on Wednesday, Mar 20 at 10:20 UTC.
* 35 tenants in the EU were affected; no tenants in other regions were impacted.

# Root Cause
The outage was caused by an interaction of multiple issues. At its core, the incident was triggered by a manual scaling operation, started on Saturday, Mar 16, that exposed fundamental problems in our legacy document store:
1. Explicit table re-sharding causes a temporary reduction in fault tolerance.
2. Unexpected exhaustion of the memory mapped page count caused multiple DB nodes to crash simultaneously.
3. Kubernetes security controls (read-only filesystems with unprivileged containers) prevented in-place updates to sysctls, requiring further DB restarts to increase the memory mapped page limit (`vm.max_map_count`).
4. The crashes exposed flaws in our document store's failover mechanism, causing nodes to enter a "viral" state in which failover nodes also entered a backfilling state.
5. The eventual solution was to manually re-create a subset of the database tables and repopulate them with data from the old, now read-only tables.
6. The new tables suffered from very slow secondary index reconstruction in our document store.

# Detection
Due to increased usage in the EU region, we started scaling up our document store cluster on Jan 30. We added two new nodes and, over the next month and a half, re-sharded and moved tables to the new nodes during weekends to avoid customer impact. Until the weekend of Mar 16, these operations all completed without a hitch. As soon as we started re-sharding one of the only two remaining tables at 10:02 UTC on Mar 16, two database nodes crashed simultaneously due to exhaustion of memory mapped pages (`vm.max_map_count`). An on-call engineer was actively monitoring the process at the time, and the issue was also picked up within minutes by our automated alerts (a monitoring sketch for this limit follows this description).

# Response
Since all our workloads run in read-only, unprivileged containers, increasing this limit is impossible without restarting all the nodes. The focus was therefore on bringing the cluster into a fully replicated state so we could run a controlled restart to increase `vm.max_map_count` on all the nodes. Because of the hard crash during a re-sharding operation, the database exhibited unexpected behaviour and entered a degraded state: it would sporadically become read-only and would not accept writes before a full integrity check. Furthermore, the recovery process never seemed to fully complete.

By Sunday evening a sufficient number of replicas had become available. Our automatic nightly backup process started at 23:00 UTC on Sunday, Mar 17, adding enough load that the database experienced another four node crashes between 01:00 and 06:00 UTC on Monday, Mar 18, again due to `vm.max_map_count` exhaustion. The DB reverted to the same degraded state as above, with very lengthy automated "backfilling" processes that never completed and during which the DB entered read-only mode. Due to the risk of further crashes before recovery completed, at 08:42 UTC on Monday, Mar 18 we decided to go ahead with the controlled restart to increase `vm.max_map_count`, even though the database was not fully recovered. This resulted in many additional hours of downtime, but in exchange gave us confidence that the recovery would complete without further unexpected crashes.

By 11:37 UTC on Monday, Mar 18, all but two tables were fully available, allowing us to restore most functionality. The remaining two (very large) tables repeatedly failed to recover through the automated process. We rapidly built and, after significant testing and iteration, deployed an emergency batch job at 04:20 UTC on Tuesday, Mar 19. It created fresh tables and copied all rows into them while maintaining availability of the rest of the product (see the copy-and-reindex sketch after this entry). The copy completed at 07:30 UTC on Tuesday, after which we could start rebuilding the secondary indexes in the new tables. Reindexing ~200M rows took over 24 hours, finally completing at 10:20 UTC on Wednesday, Mar 20, restoring all functionality.

# Follow-ups
This is the most significant outage UiPath Communications Mining has ever experienced, and it was caused by one of our core data stores. We had been aware of issues with this document store and have been migrating away from it slowly over the last year. The next steps are:
1. Halt further scaling of the document store. The number of replicas today can handle current and forecasted load for at least another year, and we know the database is resilient in its steady state.
2. Reduce the amount of data stored in the database by more aggressively garbage collecting old data and by moving larger objects into blob storage, referenced from the database instead.
3. Reprioritise the migration away from this legacy store as critical, aiming to complete it in the next six months, starting with the database tables that caused the most problems during this incident.
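For context on the `vm.max_map_count` exhaustion described above: both the kernel limit and a process's current usage are exposed through standard Linux `/proc` interfaces, so headroom can be watched without privileged access. The sketch below is a minimal monitoring idea, not part of the postmortem; the 80% warning threshold is an assumption.

```python
# Sketch: compare a process's memory-mapped region count against the
# kernel-wide vm.max_map_count limit (Linux only).
import sys
from pathlib import Path


def max_map_count() -> int:
    # Kernel limit on memory-mapped areas per process (vm.max_map_count).
    return int(Path("/proc/sys/vm/max_map_count").read_text())


def map_count(pid: int) -> int:
    # Each line in /proc/<pid>/maps describes one mapped region.
    with open(f"/proc/{pid}/maps") as maps:
        return sum(1 for _ in maps)


def check(pid: int, warn_ratio: float = 0.8) -> None:
    limit = max_map_count()
    used = map_count(pid)
    ratio = used / limit
    status = "WARN" if ratio >= warn_ratio else "ok"
    print(f"pid={pid} maps={used}/{limit} ({ratio:.0%}) {status}")


if __name__ == "__main__":
    check(int(sys.argv[1]))  # usage: python map_count_check.py <db-pid>
```

Note that `vm.max_map_count` is a node-level, non-namespaced sysctl, which is consistent with the postmortem's point that raising it under read-only, unprivileged containers required node restarts rather than an in-place change.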
Status: Postmortem
Impact: Critical | Started At: March 18, 2024, 9:18 a.m.
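The emergency recovery in the incident above amounted to a copy-into-fresh-tables-then-reindex job. The postmortem does not name the document store, so the sketch below illustrates the same pattern with Python's built-in `sqlite3` purely for concreteness; the table names, the `tenant_id` column and the batch size are hypothetical.

```python
# Sketch: copy a table into a fresh table in bounded batches, then rebuild
# secondary indexes afterwards (illustrative only; not the store used above).
import sqlite3

BATCH = 10_000  # rows copied per transaction; tune to keep load bounded


def copy_table(conn: sqlite3.Connection, src: str, dst: str) -> None:
    """Copy all rows from src into a freshly created dst table in batches,
    deferring secondary index creation until the copy has finished."""
    # Create an empty table with the same columns as src.
    conn.execute(f"CREATE TABLE IF NOT EXISTS {dst} AS SELECT * FROM {src} WHERE 0")
    last_id = 0
    while True:
        rows = conn.execute(
            f"SELECT rowid, * FROM {src} WHERE rowid > ? ORDER BY rowid LIMIT ?",
            (last_id, BATCH),
        ).fetchall()
        if not rows:
            break
        placeholders = ",".join("?" * (len(rows[0]) - 1))
        conn.executemany(
            f"INSERT INTO {dst} VALUES ({placeholders})",
            [row[1:] for row in rows],  # drop the leading rowid
        )
        conn.commit()
        last_id = rows[-1][0]


def rebuild_indexes(conn: sqlite3.Connection, dst: str) -> None:
    # Secondary indexes are rebuilt only after the bulk copy, mirroring the
    # "copy first, reindex after" sequence described in the postmortem.
    conn.execute(f"CREATE INDEX IF NOT EXISTS idx_{dst}_tenant ON {dst}(tenant_id)")
    conn.commit()


if __name__ == "__main__":
    conn = sqlite3.connect("example.db")
    copy_table(conn, src="messages_old", dst="messages_new")
    rebuild_indexes(conn, "messages_new")
```

Keeping index creation out of the copy loop mirrors the sequence in the postmortem: bulk-load first, then rebuild secondary indexes, which is why the reindexing step dominated the recovery time.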
Description: This incident has been resolved.
Status: Resolved
Impact: Major | Started At: March 13, 2024, 3:31 p.m.