
Is there a CircleCI outage?

CircleCI status: Systems Active

Last checked: a minute ago

Get notified about any outages, downtime or incidents for CircleCI and 1800+ other cloud vendors. Monitor 10 companies for free.

Subscribe for updates

CircleCI outages and incidents

Outage and incident data over the last 30 days for CircleCI.

There have been 6 outages or incidents for CircleCI in the last 30 days.

Severity Breakdown:

Tired of searching for status updates?

Join OutLogger to be notified when any of your vendors or the components you use experience an outage. It's completely free and takes less than 2 minutes!

Sign Up Now

Components and Services Monitored for CircleCI

OutLogger tracks the status of these components for CircleCI:

Artifacts Active
Billing & Account Active
CircleCI Insights Active
CircleCI Releases Active
CircleCI UI Active
CircleCI Webhooks Active
Docker Jobs Active
Machine Jobs Active
macOS Jobs Active
Notifications & Status Updates Active
Pipelines & Workflows Active
Runner Active
Windows Jobs Active
AWS Active
Google Cloud Platform Google Cloud DNS Active
Google Cloud Platform Google Cloud Networking Active
Google Cloud Platform Google Cloud Storage Active
Google Cloud Platform Google Compute Engine Active
mailgun API Active
mailgun Outbound Delivery Active
mailgun SMTP Active
OpenAI Active
Atlassian Bitbucket API Active
Atlassian Bitbucket Source downloads Active
Atlassian Bitbucket SSH Active
Atlassian Bitbucket Webhooks Active
Docker Authentication Active
Docker Hub Active
Docker Registry Active
GitHub API Requests Active
GitHub Git Operations Active
GitHub Packages Active
GitHub Pull Requests Active
GitHub Webhooks Active
GitLab Active

Latest CircleCI outages and incidents.

View the latest incidents for CircleCI and check for official updates:

Updates:

  • Time: Oct. 28, 2024, 3:50 p.m.
    Status: Resolved
    Update: The incident has been resolved. Thanks for your patience.
  • Time: Oct. 28, 2024, 3:26 p.m.
    Status: Monitoring
Update: Jobs are working again. If you had any jobs showing failures, you will need to re-run them. We will continue monitoring.
  • Time: Oct. 28, 2024, 2:58 p.m.
    Status: Investigating
    Update: Some jobs are failing to start, and some jobs are having infrastructure failures. We are looking into it.

Updates:

  • Time: Nov. 15, 2024, 1:23 p.m.
    Status: Postmortem
Update:

    ## Summary:

    On October 22, 2024, from 14:45 to 15:52 and again from 17:41 to 18:22 UTC, CircleCI customers experienced failures on new job submissions as well as failures on jobs that were in progress. A sudden increase in the number of tasks completing simultaneously, together with requests to upload artifacts from jobs, overloaded the service responsible for managing job output. On October 28, 2024, from 13:27 to 14:13 and from 14:58 to 15:50, CircleCI customers experienced a recurrence of these effects due to a similar cause. During these incidents, customers would have seen their jobs fail to start with an infrastructure failure; jobs that were already in progress also failed with an infrastructure failure. We want to thank our customers for your patience and understanding as we worked to resolve these incidents. The original status pages for the incidents on October 22 can be found [here](https://status.circleci.com/incidents/6yjv79g764yc) and [here](https://status.circleci.com/incidents/0crxbhkflndc). The status pages for the incidents on October 28 can be found [here](https://status.circleci.com/incidents/xk37ycndxbhc) and [here](https://status.circleci.com/incidents/8ktdwlsf2lm8).

    ## What Happened:

    (All times UTC.) On October 22, 2024, at 14:45 there was a sudden increase in customer tasks completing at the same time within CircleCI. In order to record each of these task end events, including the amount of storage the task used, the system that manages task state (distributor) made calls to our internal API gateway, which subsequently queried the system responsible for storing job output (output service). At this point, output service became overwhelmed with requests; although some requests were handled successfully, the vast majority were delayed before finally receiving a `499 Client Closed Request` error response.

    ![Distributor task end calls to the internal API gateway](https://global.discourse-cdn.com/circleci/original/3X/2/b/2b68322aaf27124eb5ae63a15bc0f8f2118c3f7b.png)

    Additionally, at 14:50, output service received an influx of artifact upload requests, further straining resources in the service. An incident was officially declared at 14:57. Output service was scaled horizontally at 15:16 to handle the additional load it was receiving. Internal health checks began to recover at 15:25, and we continued to monitor output service until incoming requests returned to normal levels. The incident was resolved at 15:52, and we kept output service horizontally scaled.

    At 17:41, output service received another sharp increase in requests to upload artifacts and was unable to keep up with the additional load, causing jobs to fail again. An incident was declared at 17:57. Because output service was still horizontally scaled from the initial incident, it automatically recovered by 18:00. As a proactive measure, we further scaled output service horizontally at 18:02. We continued to monitor our systems until the incident was resolved at 18:22.

    Following incident resolution, we continued our investigation and uncovered on October 25 that our internal API gateway was configured with low values for the maximum number of connections allowed to each of the services that experienced increased load on October 22. We immediately increased these values so that the gateway could handle an increased volume of task end events moving forward.

    Despite these improvements, on October 28, 2024, at 13:27, customer jobs started to fail in the same way as they previously did on October 22. An incident was officially declared at 13:38. By 13:48, the system had automatically recovered without any intervention, and the incident was resolved at 14:13. We continued to investigate the root cause of the delays and failures, but at 14:45 customer jobs started to fail again in the same way. We declared another incident at 14:50. In order to reduce the load on output service, we removed the retry logic when requesting storage used per task from output service. This allowed tasks to complete even if storage used could not be retrieved (to the customer's benefit). Additionally, we scaled distributor horizontally at 15:19 in order to handle the increased load. At 15:21, our systems began to recover. We continued to monitor and resolved the incident at 15:51. We then returned to our investigation into the root cause of this recurring behavior and discovered an additional client in distributor that was configured with a low value for the maximum number of connections to our internal API gateway. We increased this value at 17:33.

    ## Future Prevention and Process Improvement:

    Following the remediation on October 28, we conducted an audit of **all** of the HTTP clients in the execution environment and proactively increased the limits on those that were configured similarly to the ones in the internal API gateway and distributor. Additionally, we identified a gap in observability with these HTTP clients that prevented us from identifying the root cause of these incidents sooner. We immediately added additional observability to all of the clients in order to enable better alerting if connection pools were to become exhausted again in the future. (An illustrative connection-pool configuration sketch follows this list of updates.)
  • Time: Oct. 28, 2024, 2:13 p.m.
    Status: Resolved
    Update: The incident has been resolved. Thanks for your patience.
  • Time: Oct. 28, 2024, 2:02 p.m.
    Status: Monitoring
Update: Jobs are working again. If you had any jobs showing failures, you will need to re-run them. We will continue monitoring.
  • Time: Oct. 28, 2024, 1:44 p.m.
    Status: Investigating
    Update: Some jobs are failing to start, and some jobs are having infrastructure failures. We are looking into it.
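
The postmortem above attributes both recurrences to HTTP clients whose connection pools were capped too low for bursts of task-end traffic. As a rough illustration of the kind of setting involved, the sketch below sizes a client's per-host connection pool using Go's `net/http`; the language, endpoint, and limit values are assumptions for illustration only, not details of CircleCI's actual stack.

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// newPooledClient builds an HTTP client with an explicitly sized connection
// pool. The limits below are illustrative; the postmortem does not disclose
// CircleCI's real values.
func newPooledClient(maxConnsPerHost int) *http.Client {
	transport := &http.Transport{
		// Hard cap on connections (dialing, active, and idle) to a single
		// upstream host. If this is too low for a burst of requests, callers
		// queue waiting for a free connection and may time out or be
		// cancelled, which the server can record as 499 responses.
		MaxConnsPerHost: maxConnsPerHost,

		// Keep idle connections around so bursts can be absorbed without
		// paying for new TCP/TLS handshakes each time.
		MaxIdleConns:        maxConnsPerHost,
		MaxIdleConnsPerHost: maxConnsPerHost,
		IdleConnTimeout:     90 * time.Second,
	}
	return &http.Client{Transport: transport, Timeout: 10 * time.Second}
}

func main() {
	client := newPooledClient(256)

	// Hypothetical gateway endpoint, used only to exercise the client.
	resp, err := client.Get("https://internal-gateway.example/task-end")
	if err != nil {
		fmt.Println("request failed:", err)
		return
	}
	defer resp.Body.Close()
	fmt.Println("status:", resp.Status)
}
```

In the same spirit, the observability improvement the postmortem mentions would plausibly amount to exporting a metric for how often requests wait on an exhausted pool, so the limit can be alerted on before jobs start failing.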

Updates:

  • Time: Oct. 24, 2024, 6:30 p.m.
    Status: Resolved
    Update: This incident has been resolved.
  • Time: Oct. 24, 2024, 6:19 p.m.
    Status: Monitoring
    Update: We are seeing recovery and will continue to monitor.
  • Time: Oct. 24, 2024, 6:04 p.m.
    Status: Identified
    Update: Wait times continue to decrease. We are monitoring the fix.
  • Time: Oct. 24, 2024, 5:41 p.m.
    Status: Identified
Update: macOS job starts are delayed for the M2 Pro medium resource class. We've identified the issue and are working to resolve it. We will provide more updates as information becomes available, and we appreciate your continued patience.
  • Time: Oct. 24, 2024, 5:38 p.m.
    Status: Identified
    Update: The issue has been identified and a fix is being implemented.

Updates:

  • Time: Oct. 24, 2024, 11:27 a.m.
    Status: Resolved
    Update: This incident has been resolved.
  • Time: Oct. 24, 2024, 11:06 a.m.
    Status: Monitoring
Update: The plans and usage pages are now accessible and are functioning normally.
  • Time: Oct. 24, 2024, 11 a.m.
    Status: Identified
    Update: We have identified the cause of the issue and have begun remediating it. We appreciate your patience whilst we work through the issue.
  • Time: Oct. 24, 2024, 10:49 a.m.
    Status: Investigating
    Update: We're continuing to investigate this issue. Thank you for your patience.
  • Time: Oct. 24, 2024, 10:33 a.m.
    Status: Investigating
    Update: Users are unable to view the plans or usage pages. We're investigating this issue.

Updates:

  • Time: Oct. 23, 2024, 12:30 p.m.
    Status: Resolved
    Update: During this incident, customers could not access the Runner Inventory page and experienced infrastructure failures for Runner jobs.

Check the status of similar companies and alternatives to CircleCI

Hudl

Systems Active

OutSystems

Systems Active

Postman

Systems Active

Mendix

Systems Active

DigitalOcean

Issues Detected

Bandwidth

Systems Active

DataRobot

Systems Active

Grafana Cloud

Systems Active

SmartBear Software

Systems Active

Test IO

Systems Active

Copado Solutions

Systems Active

LaunchDarkly

Systems Active

Frequently Asked Questions - CircleCI

Is there a CircleCI outage?
The current status of CircleCI is: Systems Active
Where can I find the official status page of CircleCI?
The official status page for CircleCI is here: https://status.circleci.com/
How can I get notified if CircleCI is down or experiencing an outage?
To get notified of any status changes to CircleCI, simply sign up for OutLogger's free monitoring service. OutLogger checks the official status of CircleCI every few minutes and will notify you of any changes. You can view the status of all your cloud vendors in one dashboard. Sign up here
What does CircleCI do?
CircleCI provides CI/CD for any platform, either on its hosted cloud or on your own infrastructure, with a free plan available.