
Is there a Cronofy outage?

Cronofy status: Systems Active

Last checked: 7 minutes ago

Get notified about any outages, downtime or incidents for Cronofy and 1800+ other cloud vendors. Monitor 10 companies for free.

Subscribe for updates

Cronofy outages and incidents

Outage and incident data over the last 30 days for Cronofy.

There have been 0 outages or incidents for Cronofy in the last 30 days.

Severity Breakdown:

Tired of searching for status updates?

Join OutLogger to be notified when any of your vendors or the components you use experience an outage. It's completely free and takes less than 2 minutes!

Sign Up Now

Components and Services Monitored for Cronofy

OutLogger tracks the status of these components for Cronofy:

API Active
Background Processing Active
Developer Dashboard Active
Scheduler Active
GoTo Active
Zoom Active
Apple Active
Google Active
Microsoft 365 Active
Outlook.com Active

Latest Cronofy outages and incidents.

View the latest incidents for Cronofy and check for official updates:

Updates:

  • Time: Feb. 25, 2022, 9:45 a.m.
    Status: Postmortem
    Update: On Tuesday, February 22nd 2022 our US data center experienced 95 minutes of degraded performance between 15:45 and 17:20 UTC. This was caused by the primary PostgreSQL database hitting bandwidth limits and its performance being throttled as a result. This was caused or exacerbated by PostgreSQL's internal housekeeping working on two of our largest tables at the same time. To our customers this would have surfaced as interactions with the US Cronofy platform, i.e. using the website or API, being much slower than normal. For example, the 99th percentile of API response times is usually around 0.5 seconds and during this incident peaked around 14 seconds. We have upgraded the underlying instances of this database, broadly doubling capacity and putting us far from the limit we were hitting.

    ## Timeline
    _All times UTC on Tuesday, February 22nd 2022 and approximate for clarity._
    **15:45** Our primary database in our US data center started showing signs of some performance degradation.
    **16:05** First alert received by the on-call engineer for a potential performance issue. Attempts to reduce load on the database through interventions such as temporarily disabling some of its background housekeeping processes.
    **16:45** Incident opened on our status page informing customers of degraded performance in the US data center.
    **17:00** Began provisioning more capacity for the primary database as a fallback plan if efforts continued to be unsuccessful.
    **17:10** New capacity available.
    **17:15** Failed over to fully take advantage of the new capacity by promoting the larger node to be the writer.
    **17:20** Performance had returned to normal levels in the US data center.
    **17:45** Decided we could close the incident.
    **18:00** Decided to lock in the capacity change and provisioned an additional reader node at the new size.
    **18:15** Removed the smaller nodes from the database cluster.

    ## Actions
    Whilst there was not an outage, this felt like a close call for us. This led to three key questions:
    * Why had we not foreseen this capacity issue?
    * Could the capacity issue have been prevented?
    * Why had we not resolved the issue sooner?

    ### Foreseeing the capacity issue
    We had recently performed a major version upgrade on this database, and in the following weeks monitored performance pretty closely. If there was a time we should have spotted a potential issue in the near future, this was such a time. We believe we may have focussed too heavily on CPU and memory metrics in our monitoring, and it was networking capacity that led to this degradation in performance. We will be reviewing our monitoring to set alerts that would have pointed us in the right direction sooner, and also lower priority alerts that would flag an upcoming capacity issue days or weeks in advance.

    ### Preventing the capacity issue
    As PostgreSQL's internal housekeeping processes appeared to contribute significantly to the problem, we will be revisiting the configuration of these processes and seeing if they can be altered to reduce the likelihood of such an impact in future.

    ### Resolving the issue sooner
    As this was a performance degradation rather than an outage, the scale of the problem was not clear. This led to the on-call engineer investigating the issue whilst performance degraded further without additional alerts being raised. We will be adding additional alerts relating to performance degradation in several subsystems to raise awareness of the impact of a problem to an on-call engineer (a sketch of this kind of latency alert follows this list of updates). We are also updating our guidance on incident handling for the team to encourage switching to a more visible channel for communication sooner. We are also encouraging the escalation of alerts to involve other on-call engineers in the process, particularly when the cause is not immediately clear.

    ## Further questions?
    If you have any further questions, please contact us at [[email protected]](mailto:[email protected])
  • Time: Feb. 22, 2022, 5:58 p.m.
    Status: Resolved
    Update: Around 15:45 UTC our primary database in our US data center started showing signs of some performance degradation. We first received an alert at around 16:05 UTC as this problem grew more significant. We made attempts to reduce load on the database through interventions such as temporarily disabling some of its background housekeeping processes. Often giving such breathing room will allow a database to recover by itself. Around 16:45 UTC it appeared our efforts were not bearing fruit, and as the performance of our US data center was degraded from normal levels we opened an incident to make it clear we were aware of the situation. Around 17:00 UTC we decided to provision more capacity for the cluster in case it was necessary; this took around 10 minutes to come online. Whilst that was provisioning, we reduced the capacity of background workers temporarily to see if that would clear the problem by reducing the load. This was unsuccessful, and so around 17:15 UTC we decided to fail over to the new cluster capacity; after 5 minutes this had warmed and performance had returned to normal levels. There was a brief spike in errors from the US data center as a side effect of the failover, but otherwise the service was available throughout, albeit with degraded performance. We will be conducting a postmortem of this incident and will share our findings by the end of the week.
  • Time: Feb. 22, 2022, 5:19 p.m.
    Status: Identified
    Update: Our primary database is the source of the degraded performance. We have provisioned additional capacity to the cluster and failed over to make a new, larger node the primary one. Early signs are positive and we are monitoring the service.
  • Time: Feb. 22, 2022, 4:51 p.m.
    Status: Investigating
    Update: We are investigating degraded performance in our US data center.
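
As a rough illustration of the latency alerting discussed in the postmortem above (a normal p99 API response time of around 0.5 seconds that peaked near 14 seconds during the incident), here is a minimal sketch of a p99 degradation check. The window size, threshold, and notification hook are assumptions for illustration, not Cronofy's actual monitoring setup.

```python
# Minimal sketch of a p99 latency alert over a rolling window of recent API
# response times (in seconds). The threshold and notify hook are illustrative
# assumptions, not Cronofy's real monitoring configuration.
from collections import deque


class LatencyAlert:
    def __init__(self, window_size=1000, p99_threshold_seconds=1.5):
        self.samples = deque(maxlen=window_size)  # rolling window of response times
        self.threshold = p99_threshold_seconds    # e.g. ~3x the usual ~0.5s p99

    def record(self, response_time_seconds):
        """Record one API response time and alert if the p99 looks degraded."""
        self.samples.append(response_time_seconds)
        if len(self.samples) >= 100 and self.p99() > self.threshold:
            self.notify_on_call()

    def p99(self):
        ordered = sorted(self.samples)
        index = max(int(len(ordered) * 0.99) - 1, 0)
        return ordered[index]

    def notify_on_call(self):
        # Placeholder: in practice this would page the on-call engineer.
        print(f"ALERT: p99 latency {self.p99():.2f}s exceeds {self.threshold}s")


# A burst of very slow responses, like the ~14s peaks in the incident,
# pushes the p99 over the threshold and triggers the alert.
monitor = LatencyAlert()
for seconds in [0.4] * 400 + [14.0] * 10:
    monitor.record(seconds)
```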

Updates:

  • Time: Jan. 27, 2022, 5:34 p.m.
    Status: Resolved
    Update: At approximately 17:00 UTC we observed a much higher number of errors for Google calendar API calls than we would expect (mostly 503 Service Unavailable responses) across all of our data centers. There does not appear to have been a pattern to the accounts affected by this. We decided to open an incident about this at 17:10 UTC to inform customers of potential service degradation, as it seemed like it could be a more persistent issue. Whilst we were opening this incident, errors when communicating with the Google calendar API returned to normal levels at around 17:12 UTC. Errors have remained at normal levels since that time and so we are resolving this incident.
  • Time: Jan. 27, 2022, 5:18 p.m.
    Status: Monitoring
    Update: Errors returned to usual levels at around 17:12 UTC, as the previous message was being sent. We are monitoring the situation.
  • Time: Jan. 27, 2022, 5:14 p.m.
    Status: Investigating
    Update: Since approximately 17:00 UTC we have seen a higher level of errors when communicating with Google calendars than we would normally expect across all of our data centers. We are monitoring the situation and taking any actions available to us to minimize the impact. Synchronization performance for Google calendars will be affected by this.

Updates:

  • Time: Jan. 10, 2022, 4:12 p.m.
    Status: Resolved
    Update: Our Engineering team has resolved the Scheduler issue, and users can now log in again. Please get in touch with [email protected] if you have any further questions.
  • Time: Jan. 10, 2022, 3:52 p.m.
    Status: Identified
    Update: We are aware of an issue with the Scheduler, which is stopping users from logging in. Our Engineering team are investigating and aim to have a fix in place shortly.

Check the status of similar companies and alternatives to Cronofy

NetSuite

Systems Active

ZoomInfo

Systems Active

SPS Commerce

Systems Active

Miro

Systems Active

Field Nation

Systems Active

Outreach

Systems Active

Own Company

Systems Active

Mindbody

Systems Active

TaskRabbit

Systems Active

Nextiva

Systems Active

6Sense

Systems Active

BigCommerce

Systems Active

Frequently Asked Questions - Cronofy

Is there a Cronofy outage?
The current status of Cronofy is: Systems Active
Where can I find the official status page of Cronofy?
The official status page for Cronofy is here
How can I get notified if Cronofy is down or experiencing an outage?
To get notified of any status changes to Cronofy, simply sign up to OutLogger's free monitoring service. OutLogger checks the official status of Cronofy every few minutes and will notify you of any changes (a sketch of this kind of periodic check appears after this FAQ). You can view the status of all your cloud vendors in one dashboard. Sign up here
What does Cronofy do?
Cronofy offers scheduling technology that enables users to share their availability across various applications. It also provides enterprise-level scheduling tools, UI elements, and APIs.
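
As a rough sketch of the check-and-notify loop described in the FAQ above, the snippet below polls a status endpoint every few minutes and reports changes. It assumes a Statuspage-style JSON summary endpoint; the URL, polling interval, and print-based notification are illustrative assumptions rather than OutLogger's actual implementation.

```python
# Sketch of a periodic status check, assuming a Statuspage-style JSON endpoint.
# The URL and notification mechanism are assumptions for illustration only.
import json
import time
import urllib.request

STATUS_URL = "https://status.cronofy.com/api/v2/status.json"  # assumed endpoint


def fetch_status():
    """Return the overall status description, e.g. 'All Systems Operational'."""
    with urllib.request.urlopen(STATUS_URL, timeout=10) as response:
        payload = json.load(response)
    return payload["status"]["description"]


def watch(poll_seconds=300):
    """Poll every few minutes and report only when the status changes."""
    last = None
    while True:
        current = fetch_status()
        if current != last:
            print(f"Cronofy status changed: {last!r} -> {current!r}")
            last = current
        time.sleep(poll_seconds)


if __name__ == "__main__":
    watch()
```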