Outage and incident data over the last 30 days for UserVoice.
OutLogger tracks the status of these components for UserVoice:
Component | Status |
---|---|
Admin Console | Active |
Email (delivery) | Active |
Email (incoming) | Active |
Helpdesk API | Active |
Search | Active |
User Analytics | Active |
UserVoice API | Active |
Web Portal (subdomain) | Active |
Widgets | Active |
View the latest incidents for UserVoice and check for official updates:
Description: All systems have recovered appropriately.
Status: Resolved
Impact: Minor | Started At: July 13, 2018, 12:50 p.m.
Description: On June 12th, between 12:30 and 2:15 PDT UserVoice experienced an infrastructure outage that caused sitewide outages and system unavailability.

**Business Impact**

* During the outage end users and admins would have been unable to load or interact with UserVoice sites or widgets.
* Email would have been delayed, but no emails were lost.

**Root Cause**

UserVoice periodically tests our ability to restore services from major data center outages. During this week’s test, an unexpected behavior in our orchestration tooling caused critical infrastructure components to be removed, including some of the tools needed to quickly restore services. The account used by our orchestration tool had more permissions than it needed to accomplish the goal. This configuration unintentionally violated our processes on granting permissions to the services we use. We misunderstood how the tool would react in this situation. The combination of the tool’s process and permissions it had led to the removal of infrastructure components.

**What we are Doing to Prevent This**

* Reduced the privileges that the orchestration program has to prevent it from inadvertently touching infrastructure assets that should not be touched.
* Reviewing service account permission levels for all infrastructure tooling.
* Perform major infrastructure testing at low activity times.
* Increased documentation and training around proper system boot ordering, to allow for a faster recovery time in the future.

We are continually working to improve the resiliency and capacity of our application, and have made very significant progress over the past several months. We aim to provide a service that is reliable, that will always be there for you, and are working hard to achieve that goal. In this case, we fell short. All of us take it seriously and offer our apologies and a commitment to improve.

If you have any follow up questions, please, don’t hesitate to reach out.

Claire Talbott Support Manager [email protected]
Status: Postmortem
Impact: Critical | Started At: June 12, 2018, 7:32 p.m.
Description: On May 22, 2018, from 10:04AM to 10:25AM EDT, UserVoice had a partial outage that resulted in slow response times or 500 errors for admins and end users.

**Business Impact**

Admins and end users could have experienced the following:

* Widget, web portal or admin console being slow to load
* 500 errors when trying to load the widget, web portal or admin console
* Email delays (no emails were lost)

**Root Cause**

* During a schema migration, a misconfiguration introduced an inconsistency into our database cluster. This resulted in our application being unable to handle all requests. A temporary fix was implemented at 10:25AM EDT, and a permanent fix was rolled out on May 23, 2018 at 2:42AM EDT.

**What We are Doing to Prevent This**

* Our engineers have implemented changes to our migration so this type of misconfiguration cannot occur again.

We apologize for the pain point this caused for you and your users using UserVoice. This is something our entire engineering team takes very seriously, and if you have any questions or concerns, please reach out and let us know.

Claire Talbott Support Manager
Status: Postmortem
Impact: Minor | Started At: May 22, 2018, 2:23 p.m.
Description: On Wednesday, May 16, 2018, from 12:01PM-12:31PM EDT, 14% of requests to UserVoice failed, resulting in 500 errors for admins and end users.

**Business Impact**

Admins and end users impacted by the outage would have experienced the following:

* API timeouts
* Widget would not have loaded
* 500 error when loading or interacting with the web portal (forums or knowledge bases)
* 500 error when loading or interacting with the admin console

Emails would have been delayed during the outage, but no emails were lost.

**Root Cause**

* Increased load on our database due to a data cleanup process led to degraded performance.

**What we are Doing to Prevent This**

* Scaling back our data cleanup process to run at off-peak hours and on smaller sets of data.
* Modifying our auto-scaling capability to react more quickly under conditions of high load.

We apologize for the pain point this caused for you and your team. Any type of downtime is something we take very seriously. If you have any questions or concerns, please do not hesitate to reach out!

Claire Talbott Support Manager [email protected]
Status: Postmortem
Impact: Minor | Started At: May 16, 2018, 4:36 p.m.