Outage and incident data over the last 30 days for Fasterize.
OutLogger tracks the status of these components for Fasterize:
Component | Status |
---|---|
Acceleration | Active |
API | Active |
CDN | Active |
Collect | Active |
Dashboard | Active |
Logs delivery | Active |
Website | Active |
View the latest incidents for Fasterize and check for official updates:
Description: During the migration of the website configurations database, traffic was redirected to the customers’ origin web servers for 32 minutes. For the majority of websites, the origin served the traffic correctly. For a few websites, however, the origin could not do so because of its configuration.

## Facts and Timeline

All times are UTC+2.

* 3:30pm: Migration starts from the legacy database to the new database, after several days of testing on a fraction of the traffic in the production and staging environments.
* 4:02pm: Platform health check status gradually moves from 100% to 0%. Traffic is automatically routed to the customers’ origin.
* 4:03pm: An alert indicates that some health checks are red. Our tech team acts on the alert immediately and sets up a crisis team.
* 4:14pm: The root cause is found in the new database: the record holding the information needed for the health check response is incorrect.
* 4:35pm: Health check configurations are changed to allow traffic to return to the platform. The incident ends for clients.
* 5:45pm: Missing record fixed in the database.
* 10/04: Health checks are set back to their original settings.

## Analysis

On October 3, 2023, a new database holding website configurations was deployed. During the deployment, the platform health checks switched to an unhealthy state.

Platform health checks consist of multiple monitors that send requests to the platform at regular intervals to validate that all layers of the platform are functional. When these requests fail, traffic is automatically routed to the customers’ origin.

After the migration, the health checks received 521 errors (meaning that the relevant configuration for the requested domain was not found). The issue occurred because the deployment changed the logic used to load configurations. In the previous release, a request from the health checks was satisfied even if no configuration matched. In the current version, this is no longer possible. To fix the issue quickly, we created a configuration for the health checks.

This issue was not detected in our testing phases for the following reasons:

* No alert is configured for health checks in our staging environment.
* The health checks are not correctly covered by automated tests.

By design, redirecting browser traffic to the origin when the platform is considered down is correct. However, we are seeing more and more cases where the origin cannot accept the traffic sent by browsers, for reasons such as firewalls or incorrect certificates. We will improve our API to manage these edge cases.

## Impacts

* Number of customers impacted: all

## Countermeasures

### Short term

1. Set up alerting on our staging environment for health checks.
2. Add a test covering the health check routes.

### Medium term

* Design a way to avoid origin failover when the origin doesn’t support it.
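The regression described in the Analysis can be illustrated with a small sketch. It is not Fasterize's code: the function names, the health check host name, the way the earlier release recognised health check requests, and the 50% failover threshold are all assumptions made for illustration. Only the 521 status for a missing configuration and the routing of traffic to the origin when health checks fail come from the postmortem.

```python
# Status returned when no configuration matches the requested domain
# (the 521 value is taken from the postmortem).
CONFIG_NOT_FOUND = 521
OK = 200

# Hypothetical host name used by the platform monitors.
HEALTH_CHECK_HOST = "healthcheck.example.invalid"


def handle_request_legacy(host: str, configs: dict) -> int:
    """Assumed previous behaviour: health checks succeed without a config."""
    if host == HEALTH_CHECK_HOST:
        return OK
    return OK if host in configs else CONFIG_NOT_FOUND


def handle_request_new(host: str, configs: dict) -> int:
    """Assumed new behaviour: every request needs a matching config."""
    return OK if host in configs else CONFIG_NOT_FOUND


def healthy_fraction(statuses: list) -> float:
    """Share of monitor requests that returned OK."""
    if not statuses:
        return 0.0
    return sum(1 for status in statuses if status == OK) / len(statuses)


def route_target(statuses: list, threshold: float = 0.5) -> str:
    """Route to the platform only while enough monitors succeed.

    The threshold is an illustrative assumption, not a documented value.
    """
    return "platform" if healthy_fraction(statuses) >= threshold else "origin"


# Customer configs migrated, but no record exists for the health check host.
configs = {"www.customer.example": {}}
monitors = [handle_request_new(HEALTH_CHECK_HOST, configs) for _ in range(3)]
print(route_target(monitors))  # all monitors fail, so traffic goes to the origin

# The quick fix from the postmortem: create a configuration for health checks.
configs[HEALTH_CHECK_HOST] = {}
monitors = [handle_request_new(HEALTH_CHECK_HOST, configs) for _ in range(3)]
print(route_target(monitors))
```

A test that asserts the health check route returns a success status against the new lookup logic, as proposed in the short-term countermeasures, would have failed before the migration reached production.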
Status: Postmortem
Impact: Major | Started At: Oct. 3, 2023, 2:16 p.m.
Description: A DDoS attack made our page cache layer unavailable between 11:38 and 11:44.
Status: Resolved
Impact: Major | Started At: Aug. 16, 2023, 9:30 a.m.
Description: This incident has been resolved.
Status: Resolved
Impact: Minor | Started At: May 17, 2023, 3:43 p.m.
Description: This incident is now resolved.
Status: Resolved
Impact: Minor | Started At: May 12, 2023, 8:49 a.m.
Description: The rollback completed at 5:11 pm. We will conduct an investigation and extend our test suite to improve coverage of the impacted feature. We are sorry for the inconvenience.
Status: Resolved
Impact: Minor | Started At: May 10, 2023, 3:02 p.m.