
Is there a Harness outage?

Harness status: Systems Active

Last checked: 2 minutes ago

Get notified about any outages, downtime or incidents for Harness and 1800+ other cloud vendors. Monitor 10 companies, for free.

Subscribe for updates

Harness outages and incidents

Outage and incident data over the last 30 days for Harness.

There have been 3 outages or incidents for Harness in the last 30 days.


Tired of searching for status updates?

Join OutLogger to be notified when any of your vendors or the components you use experience an outage. It's completely free and takes less than 2 minutes!

Sign Up Now

Components and Services Monitored for Harness

OutLogger tracks the status of these components for Harness:

Service Reliability Management - Error Tracking FirstGen (fka OverOps) Active
Software Engineering Insights FirstGen (fka Propelo) Active
Chaos Engineering Active
Cloud Cost Management (CCM) Active
Continuous Delivery (CD) - FirstGen - EOS Active
Continuous Delivery - Next Generation (CDNG) Active
Continuous Error Tracking (CET) Active
Continuous Integration Enterprise(CIE) - Cloud Builds Active
Continuous Integration Enterprise(CIE) - Linux Cloud Builds Active
Continuous Integration Enterprise(CIE) - Self Hosted Runners Active
Continuous Integration Enterprise(CIE) - Windows Cloud Builds Active
Custom Dashboards Active
Feature Flags (FF) Active
Infrastructure as Code Management (IaCM) Active
Internal Developer Portal (IDP) Active
Security Testing Orchestration (STO) Active
Service Reliability Management (SRM) Active
Software Engineering Insights (SEI) Active
Software Supply Chain Assurance (SSCA) Active
Chaos Engineering Active
Cloud Cost Management (CCM) Active
Continuous Delivery (CD) - FirstGen - EOS Active
Continuous Delivery - Next Generation (CDNG) Active
Continuous Error Tracking (CET) Active
Continuous Integration Enterprise(CIE) - Cloud Builds Active
Continuous Integration Enterprise(CIE) - Linux Cloud Builds Active
Continuous Integration Enterprise(CIE) - Self Hosted Runners Active
Continuous Integration Enterprise(CIE) - Windows Cloud Builds Active
Custom Dashboards Active
Feature Flags (FF) Active
Infrastructure as Code Management (IaCM) Active
Internal Developer Portal (IDP) Active
Security Testing Orchestration (STO) Active
Service Reliability Management (SRM) Active
Software Engineering Insights (SEI) Active
Software Supply Chain Assurance (SSCA) Active
Chaos Engineering Active
Cloud Cost Management (CCM) Active
Continuous Delivery (CD) - FirstGen - EOS Active
Continuous Delivery - Next Generation (CDNG) Active
Continuous Error Tracking (CET) Active
Continuous Integration Enterprise(CIE) - Cloud Builds Active
Continuous Integration Enterprise(CIE) - Linux Cloud Builds Active
Continuous Integration Enterprise(CIE) - Self Hosted Runners Active
Continuous Integration Enterprise(CIE) - Windows Cloud Builds Active
Custom Dashboards Active
Feature Flags (FF) Active
Infrastructure as Code Management (IaCM) Active
Internal Developer Portal (IDP) Active
Security Testing Orchestration (STO) Active
Service Reliability Management (SRM) Active
Software Supply Chain Assurance (SSCA) Active
Chaos Engineering Active
Cloud Cost Management (CCM) Active
Continuous Delivery - Next Generation (CDNG) Active
Continuous Error Tracking (CET) Active
Continuous Integration Enterprise(CIE) - Cloud Builds Active
Continuous Integration Enterprise(CIE) - Linux Cloud Builds Active
Continuous Integration Enterprise(CIE) - Self Hosted Runners Active
Continuous Integration Enterprise(CIE) - Windows Cloud Builds Active
Custom Dashboards Active
Feature Flags (FF) Active
Infrastructure as Code Management (IaCM) Active
Internal Developer Portal (IDP) Active
Security Testing Orchestration (STO) Active
Service Reliability Management (SRM) Active
Chaos Engineering Active
Cloud Cost Management (CCM) Active
Continuous Delivery - Next Generation (CDNG) Active
Continuous Error Tracking (CET) Active
Continuous Integration Enterprise(CIE) - Cloud Builds Active
Continuous Integration Enterprise(CIE) - Linux Cloud Builds Active
Continuous Integration Enterprise(CIE) - Self Hosted Runners Active
Continuous Integration Enterprise(CIE) - Windows Cloud Builds Active
Custom Dashboards Active
Feature Flags (FF) Active
Infrastructure as Code Management (IaCM) Active
Internal Developer Portal (IDP) Active
Security Testing Orchestration (STO) Active
Service Reliability Management (SRM) Active

Latest Harness outages and incidents.

View the latest incidents for Harness and check for official updates:

Updates:

  • Time: March 7, 2024, 4:13 a.m.
    Status: Postmortem
    Update:
    #### Overview
    We experienced a service disruption in our production environment, specifically impacting Redis memory usage in our freemium offering.

    #### What was the issue?
    The core of the issue was the Redis memory in prod2 (freemium) reaching near full capacity. This led to operational failures in dependent services, primarily due to Redis running out of memory (OOM). The root cause analysis identified a significant increase in memory consumption by one of the Redis streams (named `freemium:streams:DEBEZIUM_idpMongo.idp-harness.backstageCatalog`), which started consuming an unusually high amount of memory (~6 GB) following the latest release of the idp-service (version 1.6.0). Furthermore, pipeline service-related caches were also found to be consuming more memory than anticipated.

    #### Timeline
    | **Time (IST)** | **Event** |
    | --- | --- |
    | 1st March 11.45 PM | STO uptime monitoring failed with Redis OOM |
    | 1st March 11.53 PM | FH Triggered |
    | 1st March 11.54 PM | Pipeline failures reported due to Redis OOM |
    | 2nd March 2.02 AM | Redis Events framework database memory was increased by 25% |
    | 2nd March 2.03 AM | Issue resolved after memory increase |
    | 3rd March 12.36 AM | Debezium service was bounced with an updated config that disabled IDP Mongo collections streaming |
    | 3rd March 1.11 AM | Stream `freemium:streams:DEBEZIUM_idpMongo.idp-harness.backstageCatalog` was trimmed in prod2 to reclaim the memory |

    #### Resolution
    The immediate resolution involved increasing the memory allocated to the Redis events framework database by 25% and disabling the stream flow that was consuming excessive memory. This action effectively resolved the incident within two hours.

    #### Action Items
    Following this incident, we are taking several steps to prevent recurrence:
    * Implement rigorous validation of changes with respect to Redis memory usage in both QA and PROD environments with each release.
    * Investigate and rectify the increased message size issue in the `backstageCatalog` stream when published to Redis.
    * Establish alerts for individual streams to promptly notify the relevant teams.
    * The Pipeline team will conduct a thorough review of streams related to their services, including the `webhook_events_stream` and `git_push_event_stream`.
  • Time: March 1, 2024, 8:31 p.m.
    Status: Resolved
    Update: We can confirm normal operation. We will continue to monitor and ensure stability.
  • Time: March 1, 2024, 8:19 p.m.
    Status: Monitoring
    Update: Service issues have been addressed and normal operations have resumed. We are monitoring the service to ensure normal performance continues.
  • Time: March 1, 2024, 8:15 p.m.
    Status: Identified
    Update: We are continuing to work on a fix for this issue.
  • Time: March 1, 2024, 7:14 p.m.
    Status: Identified
    Update: The resource constraint has been identified and we are working to mitigate the situation.
  • Time: March 1, 2024, 6:40 p.m.
    Status: Investigating
    Update: We are debugging an incident that is potentially impacting pipelines due to a core DB component. The issue started at 10:05 AM PT and the team is currently working to identify the root cause.
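
The postmortem for this incident lists per-stream alerting as an action item. A minimal sketch of such a check follows, assuming a client object that exposes `scan_iter` and `memory_usage` the way redis-py's `Redis` class does. The function name `oversized_streams`, the 1 GiB threshold, the `FakeClient` stand-in, and the size given for `webhook_events_stream` are illustrative assumptions, not Harness's actual tooling or data.

```python
def oversized_streams(client, pattern="*", limit_bytes=1024**3):
    """Return {stream_name: bytes_used} for every stream above limit_bytes.

    `client` is assumed to behave like redis-py's Redis object:
    scan_iter(match=..., _type="stream") yields keys and
    memory_usage(key) returns the key's size in bytes (or None).
    """
    flagged = {}
    for key in client.scan_iter(match=pattern, _type="stream"):
        used = client.memory_usage(key)
        if used is not None and used > limit_bytes:
            name = key.decode() if isinstance(key, bytes) else key
            flagged[name] = used
    return flagged


class FakeClient:
    """In-memory stand-in so the sketch runs without a Redis server."""

    def __init__(self, sizes):
        self._sizes = sizes

    def scan_iter(self, match="*", _type=None):
        return iter(self._sizes)

    def memory_usage(self, key):
        return self._sizes.get(key)


# Sizes are illustrative: ~6 GB matches the postmortem, 50 MiB is made up.
client = FakeClient({
    "freemium:streams:DEBEZIUM_idpMongo.idp-harness.backstageCatalog": 6 * 1024**3,
    "webhook_events_stream": 50 * 1024**2,
})
flagged = oversized_streams(client)
for name, used in flagged.items():
    print(f"ALERT: {name} uses {used / 1024**3:.1f} GiB")
```

Against a real server, the same function could be given a redis-py client, and a flagged stream could then be trimmed with `XTRIM`, as was done manually during the incident.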

Updates:

  • Time: Feb. 28, 2024, 1:39 p.m.
    Status: Postmortem
    Update:
    **Overview:** CI pipelines failed after the rollout of the new version, 1.15.3.

    **What was the issue:** CI Manager build 1.15.3 included a change related to a code rewrite. This change made the two builds backward incompatible: pipelines whose plan was created on the earlier version and executed on the newer version failed.

    **Timeline:**
    | **Time** | **Event** |
    | --- | --- |
    | Feb 27 2024 3:09:33 PM IST | CI Manager new version was deployed to prod |
    | Feb 27 2024 3:12 PM IST | CI internal sanity checks failed; internal incident created |
    | Feb 27 2024 3:22 PM IST | CI Manager reverted to older version |

    **Resolution:** We rolled back the release immediately after the internal sanity checks failed.

    **RCA & Action Items:** We are adding automated/manual checks for pipelines and executions that span a deployment transition, so that we catch these issues ahead of time.
  • Time: Feb. 27, 2024, 10:18 a.m.
    Status: Resolved
    Update: The issue was with the latest release; we reverted to the previous release.
  • Time: Feb. 27, 2024, 10:17 a.m.
    Status: Investigating
    Update: We are currently investigating this issue.
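
The failure mode described in this postmortem (a plan created by the earlier build and executed by the newer one) is the kind of case a deployment-transition check could exercise. The sketch below is purely illustrative: the plan formats, field names, and functions (`create_plan_old`, `create_plan_new`, `load_plan`, `check_transition_compatibility`) are hypothetical and do not reflect CI Manager's real data model.

```python
import json


def create_plan_old(steps):
    """Hypothetical plan format written by the earlier build."""
    return json.dumps({"steps": steps})


def create_plan_new(steps):
    """Hypothetical plan format written by the newer build."""
    return json.dumps({"schema": 2, "stages": [{"steps": steps}]})


def load_plan(raw):
    """Newer-build loader that still accepts plans created by the earlier build."""
    data = json.loads(raw)
    if "stages" in data:
        return data["stages"]
    if "steps" in data:
        # Earlier format: wrap the flat step list in a single stage.
        return [{"steps": data["steps"]}]
    raise ValueError("unrecognised plan format")


def check_transition_compatibility(steps):
    """Return True when old-format and new-format plans load identically."""
    return load_plan(create_plan_old(steps)) == load_plan(create_plan_new(steps))


compatible = check_transition_compatibility(["clone", "build", "test"])
print("transition check passed" if compatible else "transition check FAILED")
```

Run as a pre-deployment gate, a check of this shape would fail before rollout if the newer loader stopped accepting plans in the earlier format.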

Updates:

  • Time: Feb. 26, 2024, 2:37 p.m.
    Status: Postmortem
    Update:
    **Overview:** Hosted CI Builds on macOS are failing to initialise in all environments.

    **What was the issue?**
    * To fix the registry issue that recurred on 10 Feb 2024 ([https://manage.statuspage.io/pages/zmnt6tkys0q0/incidents/sd82v3lmcqqd#postmortem](https://manage.statuspage.io/pages/zmnt6tkys0q0/incidents/sd82v3lmcqqd#postmortem)), we attempted a release of the Anka controller.
    * The controller is required to initialise the VMs and manage their state; with it down, the Mac pipelines were not functional.
    * After debugging, we identified network configuration issues that had to be corrected to make the controller accessible.

    **Timeline:**
    | **Time** | **Event** |
    | --- | --- |
    | 22nd Feb 2024 8:13 AM IST | We started the Controller deployment to fix the issue |
    | 22nd Feb 2024 8:24 AM IST | We noticed the Controller was not coming up and started a revert of the release |
    | 22nd Feb 2024 8:30 AM IST | The revert did not work, so an internal incident was created and an investigation started |
    | 22nd Feb 2024 12:11 PM IST | Dlite deployment: bounce done with new network changes on all clusters |

    **Resolution:** We resolved the network configuration issues for the controller, after which it was accessible.

    **RCA & Action Items:** As part of the improvements, we will be moving this to a high availability setup. We will also be updating the alerting and monitoring around this workflow to capture such issues immediately.
  • Time: Feb. 22, 2024, 6:44 a.m.
    Status: Resolved
    Update: This incident has been resolved.
  • Time: Feb. 22, 2024, 6:39 a.m.
    Status: Monitoring
    Update: A fix has been implemented and we are monitoring the environment.
  • Time: Feb. 22, 2024, 3:12 a.m.
    Status: Identified
    Update: We have identified the issue and are working on a fix now.
  • Time: Feb. 22, 2024, 3:03 a.m.
    Status: Investigating
    Update: We are currently investigating this issue.

Updates:

  • Time: Feb. 12, 2024, 7 p.m.
    Status: Postmortem
    Update:
    ## Overview
    Hosted CI Builds on macOS are failing to initialize in all environments.

    ## Timeline

    ## Resolution
    A server used for macOS build farm orchestration caused the image repository to be unavailable. The server was made operational and the system restored.

    ## RCA & Action Items
    As part of the improvements, we will be moving this to a high availability setup. We will also be updating the alerting and monitoring around this workflow to capture such issues immediately.
  • Time: Feb. 12, 2024, 6:59 p.m.
    Status: Resolved
    Update: Systems are back to operational. Postmortem will contain details and identified action items.
  • Time: Feb. 12, 2024, 6:58 p.m.
    Status: Investigating
    Update: We are currently investigating this issue.

Updates:

  • Time: Feb. 7, 2024, 4:57 p.m.
    Status: Postmortem
    Update:
    **Incident Summary:** Due to an increase in traffic, the Feature Flag metrics service experienced a period of high latency. The service was unable to scale up quickly enough to handle the additional load automatically, so it became slow and returned errors. Once the team identified this, the cloud engineer manually scaled up the service and it was restored.

    **Timeline**
    | **Time (UTC)** | **Event** |
    | --- | --- |
    | 18:11 | Large number of requests seen coming through the network |
    | 18:14 | Service gets into a degraded state, returning an increase in errors and latency |
    | 18:14 | On-call engineer is alerted and begins investigation |
    | 18:24 | Service is manually scaled up to handle the load |
    | 18:24 | Development team begins RCA |
    | 18:41 | All requests return to normal operational behaviour |
    | 18:41 | Incident resolved |

    **Root Cause Analysis:** The incident originated from an increased rate of requests on the Prod 1 environment, causing the Feature Flag metrics service to get into a degraded state. While the service has auto-scaling capabilities in place, the suddenness and size of the increase left the automated scaling unable to keep up, and manual intervention was required.

    **Immediate Resolution:** To address the incident promptly, the team increased the resource capacity of the affected service until it was able to resume normal operations.

    **Preventive Measures:** To prevent similar incidents while the team works on improvements, resources have been adjusted in the affected cluster to better handle sudden traffic spikes.

    **Action Items:** We have identified a number of bottlenecks that resulted in the incident, and the development team is actively working on improvements.
  • Time: Feb. 5, 2024, 7 p.m.
    Status: Resolved
    Update: The incident has been resolved. We will provide a postmortem once we have gathered all the details.
  • Time: Feb. 5, 2024, 6:55 p.m.
    Status: Investigating
    Update: We are currently investigating unusually high latency on the Feature Flag metrics service, due to a high volume of traffic
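
The timeline in this postmortem supports a few simple response-time figures. The sketch below derives them from the reported UTC timestamps; the helper name `minutes_between` and the choice of which events mark "degraded", "mitigated", and "resolved" are assumptions made for illustration.

```python
from datetime import datetime


def minutes_between(start, end, fmt="%H:%M"):
    """Whole minutes elapsed between two same-day HH:MM timestamps."""
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds() // 60)


# Timestamps (UTC) taken from the postmortem timeline.
traffic_spike = "18:11"
degraded = "18:14"     # service degraded, on-call engineer alerted
scaled_up = "18:24"    # service manually scaled up
resolved = "18:41"     # requests back to normal, incident resolved

spike_to_alert = minutes_between(traffic_spike, degraded)
alert_to_mitigation = minutes_between(degraded, scaled_up)
degraded_to_resolved = minutes_between(degraded, resolved)

print(f"Spike to alert: {spike_to_alert} min")
print(f"Alert to manual scale-up: {alert_to_mitigation} min")
print(f"Degraded to resolved: {degraded_to_resolved} min")
```

On these timestamps, the service was degraded for 27 minutes, 10 of which passed before the manual scale-up.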

Check the status of similar companies and alternatives to Harness

UiPath

Systems Active

Scale AI

Systems Active

Notion

Systems Active

Brandwatch

Systems Active

Olive AI

Systems Active

Sisense

Systems Active

HeyJobs

Systems Active

Joveo

Systems Active

Seamless AI

Systems Active

hireEZ

Systems Active

Alchemy

Systems Active

Frequently Asked Questions - Harness

Is there a Harness outage?
The current status of Harness is: Systems Active
Where can I find the official status page of Harness?
The official status page for Harness is here
How can I get notified if Harness is down or experiencing an outage?
To get notified of any status changes to Harness, simply sign up to OutLogger's free monitoring service. OutLogger checks the official status of Harness every few minutes and will notify you of any changes. You can view the status of all your cloud vendors in one dashboard. Sign up here
What does Harness do?
Harness is a software delivery platform that enables engineers and DevOps to build, test, deploy, and verify software as needed.