Get notified about any outages, downtime or incidents for Fasterize and 1800+ other cloud vendors. Monitor 10 companies, for free.
Outage and incident data over the last 30 days for Fasterize.
Join OutLogger to be notified when any of your vendors or the components you use experience an outage. It's completely free and takes less than 2 minutes!
OutLogger tracks the status of these components for Fasterize:
Component | Status |
---|---|
Acceleration | Active |
API | Active |
CDN | Active |
Collect | Active |
Dashboard | Active |
Logs delivery | Active |
Website | Active |
View the latest incidents for Fasterize and check for official updates:
Description: There is a delay in indexing CDN logs due to network maintenance. The CDN log indexing platform is behind, and the logs currently available date from before Wednesday, June 5, 2024 at 1:45 PM. As a result, last night's log extractions for the June 5 logs are incomplete. The backlog will be caught up within the next few hours. We will keep you informed.
Status: Resolved
Impact: None | Started At: June 6, 2024, 8:32 a.m.
Description: **Post Mortem**: Temporary Platform Unavailability

**Event Date**: May 15, 2024
**Incident Duration**: 11:29 AM to 11:55 AM

**Incident Description**: The platform experienced an outage from 11:29 AM to 11:55 AM. Traffic was automatically routed to origins. Customers therefore lost the benefit of the solution, but the sites remained available during the incident. The addition of a large number of configurations on the platform increased memory consumption and the startup time of the front-layer services. Some services stopped and did not start correctly.

**Event Timeline**:

* 11:17 AM: Addition of new configurations.
* 11:21 AM: Detection of a memory shortage on a service, leading to the shutdown of a critical process.
* 11:34 AM: Additional services become unavailable.
* 11:38 AM: Widespread detection of the incident; automatic traffic redirection.
* 11:45 AM: Attempts to restart services, partially successful.
* 12:00 PM - 12:15 PM: Assessment and decision-making on corrective actions.
* 12:33 PM: Modification of startup configurations to better tolerate longer startup times.

**Analysis**: Two main factors led to this incident:

* Our HTTP server requires a reload to take new configuration into account. During this reload, the number of processes for this service is doubled, creating a risk of memory exhaustion.
* The start timeout for the HTTP service was left at its default value, and we did not have a monitor alerting us that the HTTP service start time was close to the limit.

**Impact**: All users of the platform were affected by this incident.

**Corrective and Preventive Measures**:

* Short term: Review of alert systems and adjustment of service startup configurations.
* Medium term: Improvement in configuration management to reduce the number of configurations and optimize service startup monitoring.
* Long term: Research into an alternative HTTP server to improve update management without impacting performance or memory consumption.

**Conclusion**: This incident highlights the importance of constant monitoring and proactive resource management to prevent outages. The measures taken should enhance the stability and reliability of the platform.
Status: Postmortem
Impact: Minor | Started At: May 15, 2024, 9:30 a.m.
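The short-term measure in the postmortem above calls for better alerting around service startup time, so that a service whose startup drifts toward its configured timeout is flagged before it fails to start. As a purely illustrative sketch (the 90-second timeout, the 80% warning threshold, and the `http-front.service` unit name are assumptions, not Fasterize's actual configuration), such a check might look like this:

```python
# Illustrative sketch only: warn when a service's measured startup time
# approaches its configured start timeout. All values and unit names are
# hypothetical and not taken from Fasterize's platform.
import subprocess
import time

START_TIMEOUT_SECONDS = 90     # assumed start timeout for the service
ALERT_THRESHOLD_RATIO = 0.8    # warn when startup uses more than 80% of it

def measure_startup(command: list[str]) -> float:
    """Run a blocking (re)start command and return how long it took."""
    begin = time.monotonic()
    # `systemctl restart` blocks until the unit has finished starting,
    # so its wall-clock duration is a rough proxy for startup time.
    subprocess.run(command, check=True, timeout=START_TIMEOUT_SECONDS)
    return time.monotonic() - begin

def check(command: list[str]) -> None:
    elapsed = measure_startup(command)
    if elapsed >= ALERT_THRESHOLD_RATIO * START_TIMEOUT_SECONDS:
        print(f"WARNING: startup took {elapsed:.1f}s, close to the "
              f"{START_TIMEOUT_SECONDS}s limit")
    else:
        print(f"OK: startup took {elapsed:.1f}s")

if __name__ == "__main__":
    # Hypothetical unit name; replace with the real front-layer service.
    check(["systemctl", "restart", "http-front.service"])
```

The 80% margin is arbitrary; the point is to raise an alert while there is still headroom, rather than discovering the problem only when the service fails to start within its timeout.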
Description: We had some issues on our European infrastructure. Now fixed. Speed-up was disabled, but traffic was OK.
Status: Resolved
Impact: Minor | Started At: April 11, 2024, noon
Description: This incident has been resolved.
Status: Resolved
Impact: None | Started At: Nov. 2, 2023, 5:57 p.m.
Description:

# Description

On Thursday, October 19th, between 4:55 PM UTC+2 and 6:25 PM UTC+2, the Fasterize European platform was unable to optimize web pages for all customers. The original version of the pages was delivered instead.

We discovered that between 4:45 PM UTC+2 and 5:50 PM UTC+2, a specific request was made that caused a failure in the Fasterize engine during optimization and left the process in a non-functional state. The number of functional processes then decreased until it fell below a critical threshold. Our engine then automatically switched to a degraded mode in which pages were no longer optimized and were served without delay.

At 5:29 PM UTC+2, the on-call team manually added capacity to the platform to return to a stable state, but this did not definitively improve the situation. Starting from 6:15 PM UTC+2, the optimization processes gradually resumed handling traffic, and the engine returned to its normal mode of operation.

To prevent any further incidents, the request has been excluded from optimizations and a fix to the optimization engine is being developed.

## Action plan

**Short term:**

* Fix the engine so that it can optimize the responsible request without crashing

**Medium term:**

* Review the health check system at the engine level to automatically restart non-functional processes
Status: Postmortem
Impact: Minor | Started At: Oct. 19, 2023, 4:04 p.m.
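The action plan's medium-term item is to review the engine-level health check system so that non-functional optimization processes are restarted automatically before the pool falls below its critical threshold. As an illustration only (the worker names, health-check URLs, and restart commands below are hypothetical and not part of Fasterize's engine), a minimal supervision loop might look like this:

```python
# Illustrative sketch only: a supervisor loop that probes each optimization
# worker's health endpoint and restarts workers that stop responding.
# Every endpoint and command here is an assumption, not Fasterize's setup.
import subprocess
import time
import urllib.error
import urllib.request

WORKERS = {
    # worker name -> (health URL, restart command)
    "optimizer-1": ("http://127.0.0.1:9001/health", ["systemctl", "restart", "optimizer@1"]),
    "optimizer-2": ("http://127.0.0.1:9002/health", ["systemctl", "restart", "optimizer@2"]),
}
CHECK_INTERVAL_SECONDS = 10

def healthy(url: str) -> bool:
    """Return True if the worker answers its health endpoint with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=2) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        return False

def supervise() -> None:
    """Periodically check every worker and restart the unresponsive ones."""
    while True:
        for name, (url, restart_cmd) in WORKERS.items():
            if not healthy(url):
                print(f"{name} is not responding, restarting it")
                subprocess.run(restart_cmd, check=False)
        time.sleep(CHECK_INTERVAL_SECONDS)

if __name__ == "__main__":
    supervise()
```

Restarting unresponsive workers promptly keeps the number of functional processes above the critical threshold that, in this incident, triggered the platform-wide fallback to degraded mode.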