
Is there a Harness outage?

Harness status: Systems Active

Last checked: 2 minutes ago

Get notified about any outages, downtime or incidents for Harness and 1800+ other cloud vendors. Monitor 10 companies, for free.


Harness outages and incidents

Outage and incident data over the last 30 days for Harness.

There have been 3 outages or incidents for Harness in the last 30 days.


Tired of searching for status updates?

Join OutLogger to be notified when any of your vendors or the components you use experience an outage. It's completely free and takes less than 2 minutes!


Components and Services Monitored for Harness

OutLogger tracks the status of these components for Harness:

Service Reliability Management - Error Tracking FirstGen (fka OverOps) Active
Software Engineering Insights FirstGen (fka Propelo) Active
Chaos Engineering Active
Cloud Cost Management (CCM) Active
Continuous Delivery (CD) - FirstGen - EOS Active
Continuous Delivery - Next Generation (CDNG) Active
Continuous Error Tracking (CET) Active
Continuous Integration Enterprise(CIE) - Cloud Builds Active
Continuous Integration Enterprise(CIE) - Linux Cloud Builds Active
Continuous Integration Enterprise(CIE) - Self Hosted Runners Active
Continuous Integration Enterprise(CIE) - Windows Cloud Builds Active
Custom Dashboards Active
Feature Flags (FF) Active
Infrastructure as Code Management (IaCM) Active
Internal Developer Portal (IDP) Active
Security Testing Orchestration (STO) Active
Service Reliability Management (SRM) Active
Software Engineering Insights (SEI) Active
Software Supply Chain Assurance (SSCA) Active
Chaos Engineering Active
Cloud Cost Management (CCM) Active
Continuous Delivery (CD) - FirstGen - EOS Active
Continuous Delivery - Next Generation (CDNG) Active
Continuous Error Tracking (CET) Active
Continuous Integration Enterprise(CIE) - Cloud Builds Active
Continuous Integration Enterprise(CIE) - Linux Cloud Builds Active
Continuous Integration Enterprise(CIE) - Self Hosted Runners Active
Continuous Integration Enterprise(CIE) - Windows Cloud Builds Active
Custom Dashboards Active
Feature Flags (FF) Active
Infrastructure as Code Management (IaCM) Active
Internal Developer Portal (IDP) Active
Security Testing Orchestration (STO) Active
Service Reliability Management (SRM) Active
Software Engineering Insights (SEI) Active
Software Supply Chain Assurance (SSCA) Active
Chaos Engineering Active
Cloud Cost Management (CCM) Active
Continuous Delivery (CD) - FirstGen - EOS Active
Continuous Delivery - Next Generation (CDNG) Active
Continuous Error Tracking (CET) Active
Continuous Integration Enterprise(CIE) - Cloud Builds Active
Continuous Integration Enterprise(CIE) - Linux Cloud Builds Active
Continuous Integration Enterprise(CIE) - Self Hosted Runners Active
Continuous Integration Enterprise(CIE) - Windows Cloud Builds Active
Custom Dashboards Active
Feature Flags (FF) Active
Infrastructure as Code Management (IaCM) Active
Internal Developer Portal (IDP) Active
Security Testing Orchestration (STO) Active
Service Reliability Management (SRM) Active
Software Supply Chain Assurance (SSCA) Active
Chaos Engineering Active
Cloud Cost Management (CCM) Active
Continuous Delivery - Next Generation (CDNG) Active
Continuous Error Tracking (CET) Active
Continuous Integration Enterprise(CIE) - Cloud Builds Active
Continuous Integration Enterprise(CIE) - Linux Cloud Builds Active
Continuous Integration Enterprise(CIE) - Self Hosted Runners Active
Continuous Integration Enterprise(CIE) - Windows Cloud Builds Active
Custom Dashboards Active
Feature Flags (FF) Active
Infrastructure as Code Management (IaCM) Active
Internal Developer Portal (IDP) Active
Security Testing Orchestration (STO) Active
Service Reliability Management (SRM) Active
Chaos Engineering Active
Cloud Cost Management (CCM) Active
Continuous Delivery - Next Generation (CDNG) Active
Continuous Error Tracking (CET) Active
Continuous Integration Enterprise(CIE) - Cloud Builds Active
Continuous Integration Enterprise(CIE) - Linux Cloud Builds Active
Continuous Integration Enterprise(CIE) - Self Hosted Runners Active
Continuous Integration Enterprise(CIE) - Windows Cloud Builds Active
Custom Dashboards Active
Feature Flags (FF) Active
Infrastructure as Code Management (IaCM) Active
Internal Developer Portal (IDP) Active
Security Testing Orchestration (STO) Active
Service Reliability Management (SRM) Active

Latest Harness outages and incidents.

View the latest incidents for Harness and check for official updates:

Updates:

  • Time: Oct. 17, 2023, 5:40 p.m.
    Status: Postmortem
    Update:
      ## Overview
      Mac builds started failing during the initialize step due to a timeout while trying to connect to the Mac cloud provider.

      ## Timeline (PST)
      | **Time** | **Event** |
      | --- | --- |
      | 12:08 PM IST OCT 16 2023 | Engineering got a message from a customer about Mac builds failing via Zendesk. |
      | 12:35 PM IST OCT 16 2023 | Internal FireHydrant incident created to check the failed pipelines for customers. |
      | 12:40 PM IST OCT 16 2023 | Bounced dlite deployment with config changes to remove mac pool in prod2 environment. |

      ## Resolution
      Bounced dlite deployment with config changes to remove mac pool in the impacted environment.

      ## Affected Users
      This impacted the customers in the prod2 environment using the hosted mac-builds.

      ## RCA
      As part of a prior incident we added a set of additional NAT IPs. These NAT IPs were not whitelisted in another project which is used to initialize the Mac VMs. The whitelisting was only required for the project which is used to initialize Mac builds because we use a different cloud provider for Mac than for Linux/Windows.
  • Time: Oct. 16, 2023, 4:50 p.m.
    Status: Resolved
    Update: The issue was resolved after dlite pod bounce and the hosted mac pipelines are running smoothly.
  • Time: Oct. 16, 2023, 4:48 p.m.
    Status: Investigating
    Update: Hosted Mac builds on Prod2 failed between 11:00 PM PT and 12:00 AM PT on October 15, 2023. This is due to a recent code change that has since been reverted. We will provide a Post Mortem with additional details.
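
The RCA above describes NAT IPs that were added to one project's allowlist but not to another's. A check for that kind of gap can be sketched as follows. This is only an illustration: the project names, IP addresses (taken from documentation-only ranges) and the `audit` helper are assumptions, not Harness configuration or tooling.

```python
import ipaddress


def missing_from_allowlist(egress_ips, allowlist_cidrs):
    """Return the egress IPs that no CIDR in the allowlist covers."""
    networks = [ipaddress.ip_network(cidr) for cidr in allowlist_cidrs]
    return [
        ip for ip in egress_ips
        if not any(ipaddress.ip_address(ip) in net for net in networks)
    ]


# Hypothetical data: documentation-only address ranges, not real Harness IPs.
EGRESS_NAT_IPS = ["203.0.113.10", "203.0.113.11", "198.51.100.7"]
PROJECT_ALLOWLISTS = {
    "linux-windows-builds": ["203.0.113.0/24", "198.51.100.0/24"],
    "mac-vm-init": ["203.0.113.0/24"],  # the newer NAT range is missing here
}


def audit(egress_ips, allowlists):
    """Map each project name to the egress IPs its allowlist does not cover."""
    return {
        project: missing_from_allowlist(egress_ips, cidrs)
        for project, cidrs in allowlists.items()
    }


if __name__ == "__main__":
    for project, missing in audit(EGRESS_NAT_IPS, PROJECT_ALLOWLISTS).items():
        print(project, "missing:", missing)
```

Running a check like this whenever egress IPs change would flag any project whose allowlist was not updated along with the others.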

Updates:

  • Time: Oct. 16, 2023, 8:26 p.m.
    Status: Postmortem
    Update:
      # Overview
      A few customers using the production environment (Prod2) reported encountering a "401 - Failed to fetch error" when attempting to access the Harness User Interface (UI). Notably, these customers observed that they could successfully log in and access the Harness platform using an incognito window.

      ## Timeline (PST)
      | **Time** | **Event** |
      | --- | --- |
      | 7:02 AM | Incident reported by customers |
      | 7:10 AM | The team executed a rollback of the recent deployment in the Prod2 environment, resulting in the successful resolution of the incident. |
      | 7:11 AM | Monitoring |
      | 7:41 AM | The issue has been confirmed as resolved |

      ## Resolution
      We initiated a rollback procedure, reverting the deployment from 810xx to 809xx within the Prod2 environment.

      ## Affected Users
      Users in Prod2 whose tokens had expired over the weekend.

      ## RCA
      Users encountered the "Failed to fetch: 401" error due to their session tokens expiring, leading to a 401 Unauthorized response from the Gateway. While typically, this should have redirected users to the login page, they remained on the same page because the 401 response was not handled by the UI with the recent deployment. We mitigated the incident by rolling back to the previously deployed version in the Prod2 environment.

      ## Action Items
      * We will ensure that any reverts are isolated and not combined with additional changes in the same Pull Request (PR) to prevent similar issues.
      * We will enhance our UI Automation by incorporating a critical test case to confirm that all 401 errors consistently redirect users to the login page.
  • Time: Oct. 16, 2023, 2:41 p.m.
    Status: Resolved
    Update: The issue has been confirmed as resolved. We are working to deliver the root cause analysis and will post it as soon as it is available.
  • Time: Oct. 16, 2023, 2:11 p.m.
    Status: Monitoring
    Update: We are continuing to monitor for any further issues.
  • Time: Oct. 16, 2023, 2:10 p.m.
    Status: Monitoring
    Update: The rollback of the deployment is complete. We are now actively working to determine the root cause of the issue and monitoring closely for any additional issues.
  • Time: Oct. 16, 2023, 2:02 p.m.
    Status: Investigating
    Update: Some of our users are unable to log in to Harness and are observing the error "401 Failed to retrieve license information". We are performing a rollback now to mitigate the issue and will update as soon as this is completed.
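
The postmortem above says a 401 response should send the user to the login page, and that a regression test will confirm this. The behaviour and the test can be sketched in a few lines. This is a minimal illustration only: the `/login` route, the `returnUrl` parameter and the `next_location` function are hypothetical names, and the real Harness UI is not written this way.

```python
from urllib.parse import quote

LOGIN_PATH = "/login"  # hypothetical route name


def next_location(status_code, current_path):
    """Decide where the UI should navigate after an API response.

    Returns a login URL for 401 responses and None when the user
    should stay on the current page.
    """
    if status_code == 401:
        # Preserve the page the user was on so login can return them to it.
        return f"{LOGIN_PATH}?returnUrl={quote(current_path, safe='')}"
    return None


def check_all_401s_redirect(paths):
    """Regression-style check: every path must redirect to login on a 401."""
    return all(
        (next_location(401, path) or "").startswith(LOGIN_PATH)
        for path in paths
    )
```

Keeping the decision in one small function means a single test can cover every page, which is the kind of check the action item describes.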

Updates:

  • Time: Oct. 19, 2023, 1 p.m.
    Status: Postmortem
    Update:
      ## Overview
      TI service received burst cleanup calls at two different times, and the cleanup procedure created a huge load on the timescale DB used by TI service. The CPU of timescale was running at 100%, due to which the ping to timescale was failing (TI service readiness probe). Since the TI service readiness probe was failing, Kubernetes marked the service pods unavailable.

      ## Impact
      CI pipeline executions which upload test reports and were started during the time windows below may have been impacted. Times: (2:15 AM - 2:50 AM and 4:45 AM to 5:15 AM) PT

      ## Resolution
      Cleanup requests that were queued completed and the system returned to operational status.

      ## Timeline
      | **Time** | **Event** | **Notes** |
      | --- | --- | --- |
      | 10/16/2023 02:16 AM | TI service received burst cleanup requests | |
      | 4:44 AM | TI service received burst cleanup requests | |
      | 6:28 AM | TI service 503 error reported | |
      | 7:45 AM | Discovered that timescale was running at 100% | |
      | 8:00 AM | Suspected that cleanup jobs were hogging the timescale CPU. Platform team confirmed that data deletion job was run due to a bug | Still going through TI service logs to find any other suspects |
      | 8:30 AM | Correlated and confirmed that the cleanup led to high timescale CPU | Both the readiness probe failure times coincided with the cleanup job requests |
      | 9:00 AM | Incident was resolved | Action items were discussed for Engg |

      ## RCA
      Timescale was not responding to pings, which is the readiness probe for TI service. Since the TI service readiness probe was unresponsive, Kubernetes marked the service pods unavailable. Timescale DB was running at 100% CPU due to the high number of procedure calls made from TI service for periodic cleanup. The DB was already under high utilization due to production workload, since all the test reports are stored in this DB.

      The CG Manager code which runs multiple threads in parallel for periodic cleanup ran on a weekday (supposed to run on weekends) due to a bug which sends cleanup events to all services. CI Manager picked up these events and sent burst cleanup API calls to TI service.
  • Time: Oct. 19, 2023, 12:57 p.m.
    Status: Resolved
    Update: This incident is resolved. Adding details in post mortem.
  • Time: Oct. 19, 2023, 12:57 p.m.
    Status: Investigating
    Update: CI pipeline executions that upload test reports are impacted due to DB load.
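
The RCA above attributes the database load to a burst of cleanup calls arriving at once. One general way to soften such bursts is to pace the calls on the sending side. The sketch below shows that idea only; it is not a fix the postmortem describes, and `Throttle`, `run_cleanups` and the interval value are all hypothetical.

```python
import time


class Throttle:
    """Allow at most one call every `min_interval` seconds."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = None  # no call has been made yet

    def wait(self):
        """Block until the next call is allowed, then reserve the slot."""
        now = self._clock()
        if self._next_allowed is not None and now < self._next_allowed:
            self._sleep(self._next_allowed - now)
            now = self._next_allowed
        self._next_allowed = now + self.min_interval


def run_cleanups(account_ids, cleanup, throttle):
    """Call `cleanup(account_id)` for each account, paced by `throttle`."""
    for account_id in account_ids:
        throttle.wait()
        cleanup(account_id)
```

The clock and sleep functions are injectable so the pacing can be tested without real delays. In production the defaults (`time.monotonic` and `time.sleep`) would apply.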

Updates:

  • Time: Oct. 24, 2023, 5:25 p.m.
    Status: Postmortem
    Update: This incident is the second instance of this incident [https://status.harness.io/incidents/jbvd0pd0qd2m](https://status.harness.io/incidents/jbvd0pd0qd2m). Please find the RCA here.
  • Time: Oct. 12, 2023, 7:15 p.m.
    Status: Resolved
    Update: This incident has been resolved. We have cleared the failed/timed-out delegate task queue. All customers should be unblocked at this time.
  • Time: Oct. 12, 2023, 7:11 p.m.
    Status: Investigating
    Update: Harness pipeline executions are running into a "rate limit exceeded" issue. We are currently investigating this issue.

Updates:

  • Time: Oct. 12, 2023, 12:06 a.m.
    Status: Postmortem
    Update:
      # Overview
      We received a report from one of our customers about issues with their pipeline executions in our Prod-2 cluster. Details about the incident are below.

      ## Timeline (PST)
      | **Time** | **Event** |
      | --- | --- |
      | 7:49 AM | One customer reported issue with their pipeline executions |
      | 8:45 AM | Engineering got engaged in the investigation |
      | 9:40 AM | Issue identified and mitigated (details below) |

      ## Resolution
      We deleted the expired Delegate tasks from the database to unblock the customers.

      ## Affected accounts
      A total of seven customers were impacted because of this incident. We will take 60 mins of partial downtime for our CD, CDNG, STO and CIE - Self-Hosted Runners.

      ## RCA
      We identified the problem with a background job that periodically cleans up expired tasks. The background job hit a latent bug where it was iterating over the same set of tasks and could not delete them. The issue surfaced due to increased database query latency due to a scheduled database upgrade. We mitigated the incident by manually cleaning up the expired tasks from the database.

      ## Action Items
      * Enhance our alerting for this scenario so we can catch such issues early.
      * Improve the background cleanup job to be resilient to db latencies.
  • Time: Oct. 11, 2023, 5:27 p.m.
    Status: Resolved
    Update: This incident has been resolved.
  • Time: Oct. 11, 2023, 4:48 p.m.
    Status: Monitoring
    Update: We have cleared the failed/timed-out delegate task queue and restored the iterator that processes the queue to operational status. We are monitoring the situation closely and will post any additional updates here. All customers should be unblocked at this time.
  • Time: Oct. 11, 2023, 4:35 p.m.
    Status: Identified
    Update: Harness has identified an issue with an iterator service that cleans up failed and/or timed-out delegate tasks, which is causing the queue to be backed up for some accounts at the threshold of 5000 triggering a rate limit message. Currently, we are clearing the queue of these failed tasks to restore operational capabilities and unblock any affected customers. We have identified the root cause and are working to unblock customers now.
  • Time: Oct. 11, 2023, 4:04 p.m.
    Status: Investigating
    Update: We are continuing to investigate this issue.
  • Time: Oct. 11, 2023, 3:57 p.m.
    Status: Investigating
    Update: Harness pipeline executions are running into a "rate limit exceeded" issue. We are currently investigating this issue.
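
The RCA for this incident describes a cleanup job that kept iterating over the same set of expired tasks without deleting them. One common way to guarantee forward progress is to page through tasks with a cursor that always advances, even when deletes fail. The sketch below uses an in-memory dict in place of a database. The function name, batch size and data are assumptions for illustration, not the actual Harness job.

```python
def cleanup_expired(tasks, now, delete, batch_size=2):
    """Delete expired tasks in batches, advancing a cursor past every batch.

    `tasks` maps task id -> expiry timestamp; `delete(task_id)` may raise.
    Returns (deleted_ids, failed_ids).
    """
    deleted, failed = [], []
    cursor = None  # id of the last task examined
    while True:
        # Keyset pagination: only look at ids strictly after the cursor.
        batch = sorted(
            tid for tid, expiry in tasks.items()
            if expiry <= now and (cursor is None or tid > cursor)
        )[:batch_size]
        if not batch:
            return deleted, failed
        for tid in batch:
            try:
                delete(tid)
                deleted.append(tid)
            except Exception:
                failed.append(tid)  # record and move on; retry on a later run
        cursor = batch[-1]  # always advance, even if every delete failed
```

Because the cursor moves past each batch regardless of the outcome, a slow or failing delete cannot trap the loop on the same tasks. Failed ids are returned so they can be retried or raise an alert.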

Check the status of similar companies and alternatives to Harness

UiPath

Systems Active

Scale AI

Systems Active

Notion

Systems Active

Brandwatch

Systems Active

Olive AI

Systems Active

Sisense

Systems Active

HeyJobs

Systems Active

Joveo

Systems Active

Seamless AI

Systems Active

hireEZ

Systems Active

Alchemy

Systems Active

Frequently Asked Questions - Harness

Is there a Harness outage?
The current status of Harness is: Systems Active
Where can I find the official status page of Harness?
The official status page for Harness is here
How can I get notified if Harness is down or experiencing an outage?
To get notified of any status changes to Harness, simply sign up to OutLogger's free monitoring service. OutLogger checks the official status of Harness every few minutes and will notify you of any changes. You can view the status of all your cloud vendors in one dashboard. Sign up here
What does Harness do?
Harness is a software delivery platform that enables engineers and DevOps to build, test, deploy, and verify software as needed.