Get notified about any outages, downtime or incidents for Turtl and 1800+ other cloud vendors. Monitor 10 companies, for free.
Outage and incident data over the last 30 days for Turtl.
OutLogger tracks the status of these components for Turtl:
Component | Status |
---|---|
Analytics reports | Active |
Assets generation | Active |
Editor | Active |
Fastly London (LON) | Active |
Filestack API | Active |
Personalization | Active |
Support web chat | Active |
Viewer | Active |
View the latest incidents for Turtl and check for official updates:
Description: This incident has been resolved
Status: Resolved
Impact: None | Started At: May 11, 2020, 2:21 p.m.
Description:

### Postmortem - Incident of 24th April 2020

#### Summary

An application update released at 08:47 UTC introduced a performance regression which caused very complex Turtl documents to use 100% of the available CPU for a long period of time, blocking all other ongoing requests reaching the affected server. Requests were routed to healthy servers by our load balancer. At 09:09 UTC all servers became affected and Turtl was unavailable for all uncached requests for 6 minutes (09:09 - 09:15).

We were notified of the issue by automated alarms at 09:10 and restored service at 09:15. There was no further downtime after that, although a few further requests timed out before the issue was fully resolved. We are currently gathering data on the exact number of requests that were affected.

Once service was restored we continued to monitor the situation while investigating the root cause. A rollback was released at 10:11. At 12:12 we identified the application change which introduced the regression, and we implemented a fix which was released at 14:11.

#### Details

The personalisation feature, which allows our customers to send documents tailored to each recipient, uses markers such as `%company_name%` to adapt each personalised version of a document. To make the replacements, when the document is being rendered on the server we first find all occurrences of these markers and then replace them with the correct values. This is done by running a simple regex on the document content.

On the 24th April, a change to this regex was released which introduced a severe performance penalty when it was run on very complex documents, that is, documents over 100 pages with many large inline tables. Because the performance issue only occurred in these specific circumstances, and there are only a handful of documents of this type within Turtl at present, it was not identified in our staging quality assurance tests.

#### Further steps

During our investigation of the issue, we identified that the marker extraction logic was running for every document being rendered, regardless of whether it was personalised or not. Had this not been the case, the downtime would not have happened until much later in the feature's lifetime. Although incidents like these are unpleasant, we are grateful it was caught early.

As part of the fix released at 14:11, apart from optimising the regex in question, we have also removed the marker replacement logic from the standard document rendering, moving it to only those routes that require it.

To ensure these kinds of issues are caught in our pre-release steps in future, we will be adding a list of specially crafted, potentially problematic documents, such as the ones described above, to our automated test suite, and ensuring these are loaded in all possible scenarios.

#### Incident timeline (UTC)

* 08:48 - application release
* 09:09 - uncached requests stop being delivered
* 09:10 - automated alarms trigger
* 09:15 - service restored
* 10:11 - rollback release
* 12:12 - code change introducing the issue identified
* 14:11 - permanent fix released
Status: Postmortem
Impact: Critical | Started At: April 24, 2020, 9:29 a.m.
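The marker replacement described in the postmortem, together with the fix of running it only for personalised requests, can be sketched roughly as follows. This is a minimal illustration and not Turtl's actual code: the pattern, the function names `personalise` and `render`, and the `values` mapping are all assumptions made for the example. The pattern is deliberately simple (a single character class with no nested quantifiers), so matching time grows linearly with document size.

```python
import re
from typing import Dict, Optional

# Assumed marker syntax, modelled on the postmortem's `%company_name%` example.
# One character class and no nested quantifiers, so there is no catastrophic backtracking.
MARKER = re.compile(r"%([a-z_]+)%")


def personalise(content: str, values: Dict[str, str]) -> str:
    """Replace each %marker% with its value; unknown markers are left untouched."""

    def substitute(match: "re.Match[str]") -> str:
        return values.get(match.group(1), match.group(0))

    return MARKER.sub(substitute, content)


def render(content: str, values: Optional[Dict[str, str]] = None) -> str:
    """Hypothetical render step: skip marker handling unless the request is personalised."""
    if not values:
        return content  # standard rendering never touches the regex
    return personalise(content, values)
```

For example, `render("Hello %company_name%", {"company_name": "Acme"})` returns `"Hello Acme"`, while `render("Hello %company_name%")` returns the content unchanged without running the regex at all.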
Description: Between approximately 2019-11-12 13:10 UTC and 2019-11-12 16:10 UTC the analytics were unavailable for users with author or viewer permissions due to a bad deploy. This issue has now been fully resolved and normal service has been restored.
Status: Resolved
Impact: Major | Started At: Nov. 12, 2019, 3:20 p.m.