Get notified about any outages, downtime or incidents for Holaspirit and 1800+ other cloud vendors. Monitor 10 companies, for free.
Outage and incident data over the last 30 days for Holaspirit.
View the latest incidents for Holaspirit and check for official updates:
Description: Hello 👋 We apologize for this major inconvenience, which has now been corrected. It seems that the link between Holaspirit and the Support chat was broken in such a way that all users were anonymized and placed in the same conversation. This had no impact whatsoever on your company data, and no data breach was detected. If you refresh the Holaspirit page, everything will go back to normal; otherwise, the platform will refresh automatically by 3 PM European time. Our team is working right now to make sure this doesn't happen again. Thank you for your patience and understanding. 🙏 The Holaspirit Team
Status: Resolved
Impact: Major | Started At: June 30, 2022, 10 p.m.
Description: Thank you for your patience. We experienced some network issues yesterday due to maintenance by our cloud provider, OVH. The service is back to normal.
Status: Resolved
Impact: Minor | Started At: June 15, 2022, 1:44 p.m.
Description: We experienced higher-than-normal load, which caused pages to be slow or unresponsive. The issue started at 9:45 and was resolved at 10:15.
Status: Resolved
Impact: Minor | Started At: May 19, 2022, 7:30 a.m.
Description: The incident has been closed by OVH. Access to Holaspirit is available. End time: 13/10/2021 07:00 UTC. Service impact: short period of unavailability with slowness. Check OVH statuses at http://travaux.ovh.net/?do=details&id=53799
Status: Resolved
Impact: Major | Started At: Oct. 13, 2021, 7:40 a.m.
Description:

### Incident summary

On September 23rd, we encountered several problems that caused a degradation of service. The Elasticsearch server stopped indexing new data. There was no data loss, but several customers could not access the application modules.

### What happened

* Date: The incident occurred on the 23rd of September 2021 on Holaspirit.
* Severity: Critical
* Impact: The Holaspirit service responded incorrectly to some requests. Parts of the software weren’t usable.
* Reference: Elasticsearch server unavailable
* Summary: Following a first incident on the MongoDB database, the Elasticsearch server became overloaded and couldn’t handle the Holaspirit software requests.
* Tags: #mongodb #migration #elasticsearch

### Corrective Actions

* Complete the Bitwarden implementation with the CSM team and ensure accesses are working
* Automate the production-to-staging database duplication process to test scripts
* Improve Elasticsearch application monitoring
* Improve conflict handling on Elasticsearch
* Implement a Rate Limit on the API with client or user deactivation
* Improve the documentation on the use of the API with constraints
* Update the incident management process

### Experience feedback

What worked?

* Internal communication allowed the issue to be quickly identified. Regular reviews were implemented to assess the situation.
* The resolution of the first incident was correctly handled with a simple solution.

What didn’t work?

* Backend brought up an error early on without understanding the issue’s severity.
* The Elasticsearch issue was difficult to resolve.
* Identifying the origin of the abnormal requests took a long time, and it wasn’t possible to stop them.
* A configuration change was made without the problem being resolved.

Where did we get lucky?

* The issue was rapidly identified by the CTO, and the actions to get the service back on track were quite quick.
* It was possible to increase resources rapidly.
* The modification of Elasticsearch’s configuration reduced the impact of the issue on the service.

### Chronology

**Thursday, September 23rd**

* 7:45 Release of the backend package on production
* 7:50 First logs raised on Grafana mentioning that an issue occurred on tensions
* 8:20 Backend Dev raises the tensions error, which he doesn’t understand, in the DevOps channel
* 8:50 CTO says the error must come from a data issue
* 9:00 CSM raises the incident as having an impact on all users
* 9:20 CSM manages to reach the CTO in his meeting to explain the severity of the error
* 9:20 CTO tells Backend Dev that the issue previously raised has an impact on clients
* 9:30 CTO identifies that the migration done earlier in the morning generated the bug
* 9:50 CTO makes the decision to retrieve the morning backup and to restore only tensions
* 10:00 CSM communicates the incident on Statuspage ([https://status.holaspirit.com/incidents/kfvq7qrzqd7p](https://status.holaspirit.com/incidents/kfvq7qrzqd7p)) after having looked for access to log onto the service
* 10:15 The problem is solved by reindexing the tensions; some customers confirm it’s fixed
* 10:20 Communication on Statuspage
* 10:30 500 errors are raised on production; the Elasticsearch server is overloaded
* 10:45 Restart of the Elasticsearch server with more memory and CPU; however, the server still overloads
* 11:00 Videoconference to review the situation
* 12:45 Modification of the Elasticsearch server’s configuration
* 14:00 The service is stable, but a pattern is identified on production: regular peaks every 20 minutes
* 15:00 Backend Dev identifies the client making abnormal requests: BO1
* 15:45 BO1 raises an issue on SCIM that was identified earlier in the day
* 16:00 Backend Dev takes charge of the relationship with BO1 on technical aspects and reports that he has had issues with the 5.9 trial version of Elasticsearch used in production
* 19:00 Overload peaks disappear on production; monitoring has been added on Elasticsearch to identify the correlations between the requests and the state of the server

**Friday, September 24th**

* 00:00 Restart adding more CPU
* 11:30 Videoconference to review the situation; Friday morning everything is stable
* 14:00 The incident is closed on Statuspage
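
One of the corrective actions listed in the postmortem is a rate limit on the API with client or user deactivation. The postmortem does not say how this was built, so the following is only a minimal sketch of one common approach, a per-client token bucket that deactivates a client after repeated violations. The class names, parameters, and thresholds (`RateLimiter`, `capacity`, `refill_per_sec`, `max_violations`) are all hypothetical and are not taken from Holaspirit's code.

```python
import time
from dataclasses import dataclass


@dataclass
class ClientBucket:
    """State kept for one API client (hypothetical structure)."""
    tokens: float
    last_refill: float
    violations: int = 0
    deactivated: bool = False


class RateLimiter:
    """Per-client token bucket; deactivates clients that keep exceeding it."""

    def __init__(self, capacity=10, refill_per_sec=1.0, max_violations=3,
                 clock=time.monotonic):
        self.capacity = capacity              # maximum burst size
        self.refill_per_sec = refill_per_sec  # sustained requests per second
        self.max_violations = max_violations  # rejections before deactivation
        self.clock = clock                    # injectable for testing
        self.buckets = {}

    def allow(self, client_id):
        """Return True if the request may proceed, False if it is rejected."""
        now = self.clock()
        bucket = self.buckets.get(client_id)
        if bucket is None:
            bucket = ClientBucket(tokens=self.capacity, last_refill=now)
            self.buckets[client_id] = bucket

        if bucket.deactivated:
            return False

        # Refill tokens in proportion to the time elapsed, capped at capacity.
        elapsed = now - bucket.last_refill
        bucket.tokens = min(self.capacity,
                            bucket.tokens + elapsed * self.refill_per_sec)
        bucket.last_refill = now

        if bucket.tokens >= 1:
            bucket.tokens -= 1
            return True

        # Out of tokens: count the violation and deactivate if it repeats.
        bucket.violations += 1
        if bucket.violations >= self.max_violations:
            bucket.deactivated = True
        return False
```

In this sketch a deactivated client stays blocked until an operator clears its entry. That is a design assumption rather than something the postmortem specifies; a production system might instead re-enable clients after a cooldown.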
Status: Postmortem
Impact: None | Started At: Sept. 23, 2021, 7:37 a.m.