Get notified about any outages, downtime or incidents for ServiceChannel and 1800+ other cloud vendors. Monitor 10 companies, for free.
Outage and incident data over the last 30 days for ServiceChannel.
OutLogger tracks the status of these components for ServiceChannel:
Component | Status |
---|---|
WorkForce | Active |
Analytics | Active |
Analytics Dashboard | Active |
Analytics Download | Active |
Data Direct | Active |
API | Active |
API Response | Active |
Authentication | Active |
Budget Insights | Active |
SendXML | Active |
SFTP | Active |
Universal Connector | Active |
Mobile Applications | Active |
SC Mobile | Active |
SC Provider | Active |
Provider Automation | Active |
Fixxbook | Active |
Invoice Manager | Active |
IVR | Active |
Login | Active |
Proposal Manager | Active |
Work Order Manager | Active |
Service Automation | Active |
Asset Manager | Active |
Compliance Manager | Active |
Dashboard | Active |
Inventory Manager | Active |
Invoice Manager | Active |
Locations List | Active |
Login | Active |
Maps | Active |
Project Tracker | Active |
Proposal Manager | Active |
Supply Manager | Active |
Weather | Active |
Work Order Manager | Active |
Service Center | Active |
Email - servicechannel.com | Active |
Email - servicechannel.net | Active |
Phone - Inbound | Active |
Phone - Outbound | Active |
Third Party Components | Active |
Avalara Tax Calculation Service | Active |
Rackspace - Inbound Email | Active |
Twilio REST API | Active |
Zendesk | Active |
View the latest incidents for ServiceChannel and check for official updates:
Description: **Intermittent Virtual Machine Network Connectivity Issue - Incident Report**

- **Date of Incident:** 10/12/2022
- **Time/Date Incident Started:** 10/12/2022, 8:35am/pm EDT
- **Time/Date Stability Restored:** 10/12/2022, 10:39am/pm EDT
- **Time/Date Incident Resolved:** 10/12/2022, 10:45am/pm EDT
- **Users Impacted:** Many
- **Frequency:** Intermittent
- **Impact:** Major

**Incident description:** Partial loss of network connectivity to a database read replica Virtual Machine (VM).

**Root Cause Analysis:** The ServiceChannel Site Reliability Engineering (SRE) and Database Administration (DBA) teams discovered a partial loss of connectivity affecting a VM used by the main transactional database as a read replica. As a result of the connectivity issue, certain processes, including real-time database replication from the primary database replica to the impacted read replica, were delayed. This caused noticeable latency for certain kinds of data updates and, in some cases, application performance degradation. The connectivity problem occurred in the network hardware layer that is managed by our cloud infrastructure partner, affecting a single read replica in the main transactional database.

**Actions Taken:**

1. Investigated triggered alerts and degraded network functionality.
2. The SRE and DBA teams coordinated the redeployment and restart of the impacted VM. This moved the VM to a network segment that was functioning correctly.
3. Confirmed that full connectivity between the impacted VM and the primary database replica was restored and the database was operating normally again.

**Mitigation Measures:**

1. Work with our cloud infrastructure partner's support team to obtain additional details about the nature of this failure and to design a network strategy that can survive unavoidable network instability.
2. Enhance the current high availability remediation and disaster recovery to provide additional operational resiliency.
Status: Postmortem
Impact: Major | Started At: Oct. 12, 2022, 2:16 p.m.
Description: **SSO Login failures for some SAML/SSO customers - Incident Report**

- **Date of Incident:** 10/06/2022
- **Time/Date Incident Started:** 10/06/2022, 7:00 AM EDT
- **Time/Date Stability Restored:** 10/06/2022, 3:55 PM EDT
- **Time/Date Incident Resolved:** 10/06/2022, 4:05 PM EDT
- **Users Impacted:** Few users
- **Frequency:** Continuous
- **Impact:** Major

**Incident description:** Errors when attempting to authenticate using Single Sign-On for a subset of SAML SSO-enabled customers.

**Root Cause Analysis:** Upon further investigation, the team responsible for managing the SAML SSO module determined that an undetected bug was introduced in software released 10/5/2022. This bug was not caught because it appears to only affect certain SAML SSO-enabled customers. As we do not have test accounts for every SAML SSO-enabled customer, our test coverage cannot find these edge cases.

**Actions Taken:**

1. SRE team reviewed logs and determined that SSO authentication issues were confined to a very small subset of SAML SSO-enabled customers.
2. SRE team determined that the incident coincided with a login component application release the previous evening.
3. CICD team performed an emergency rollback of the login application components.

**Mitigation Measures:**

1. Add monitoring to alert on increased SSO errors for our SAML SSO-enabled customers.
2. Release a fix for the underlying bug in the login component application.
Status: Postmortem
Impact: Major | Started At: Oct. 6, 2022, 7:26 p.m.
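The first mitigation measure in the report above (alerting on increased SSO errors) could look roughly like the following. This is a minimal sketch, not ServiceChannel's actual monitoring: the class name `SsoErrorRateMonitor`, the 5-minute window, the 20% threshold, and the minimum sample size are all assumptions chosen for illustration. One monitor instance per SSO-enabled customer would let an alert fire even when only a small subset of customers is affected, as happened in this incident.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class SsoErrorRateMonitor:
    """Sliding-window failure-rate tracker for one customer (hypothetical helper)."""

    window_seconds: float = 300.0  # assumed look-back window
    threshold: float = 0.2         # assumed alert threshold (fraction of failed logins)
    min_attempts: int = 10         # assumed floor, so tiny samples do not alert
    _events: deque = field(default_factory=deque)  # (timestamp, succeeded) pairs

    def record(self, timestamp: float, succeeded: bool) -> None:
        """Store one login outcome and drop outcomes older than the window."""
        self._events.append((timestamp, succeeded))
        self._evict(timestamp)

    def _evict(self, now: float) -> None:
        while self._events and self._events[0][0] <= now - self.window_seconds:
            self._events.popleft()

    def error_rate(self, now: float) -> float:
        """Fraction of failed logins inside the window ending at `now`."""
        self._evict(now)
        if not self._events:
            return 0.0
        failures = sum(1 for _, ok in self._events if not ok)
        return failures / len(self._events)

    def should_alert(self, now: float) -> bool:
        """True when there are enough attempts and the failure rate meets the threshold."""
        rate = self.error_rate(now)
        return len(self._events) >= self.min_attempts and rate >= self.threshold
```

In use, each SSO login attempt would call `record(...)` with its timestamp and outcome, and a periodic job would call `should_alert(...)` for every customer's monitor.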
Description:

- **Date of Incident:** 09/16/2022
- **Time/Date Incident Started:** 09/16/2022, 6:08pm EDT
- **Time/Date Stability Restored:** 09/16/2022, 6:27pm EDT
- **Time/Date Incident Resolved:** 09/16/2022, 7:00pm EDT
- **Users Impacted:** All
- **Frequency:** Continuous
- **Impact:** Major

**Incident description:** Unexpected failure of a primary database server.

**Root Cause Analysis:** The Site Reliability Engineering (SRE) team identified that the production primary database server for the US datacenter was in an unresponsive state and determined that Azure had triggered an automated recovery/redeploy process due to a detected hypervisor hardware failure. The affected hypervisor was responsible for running our primary database virtual machine. The SRE and Database teams monitored the failover to new hardware and verified that the redeployed virtual machine was operating properly.

**Actions Taken:**

1. SRE team investigated triggered alerts and identified a failed virtual machine for the US production master database server.
2. The Database team confirmed that the redeployed hardware was operating as expected.

**Mitigation Measures:**

1. SRE team opened an Azure support case to get additional details pertaining to the nature of this failure.
2. DBA team expanded and improved the database clustering setup to eliminate single points of failure in the database infrastructure.
Status: Postmortem
Impact: Major | Started At: Sept. 16, 2022, 10:21 p.m.
Description: **DNS Errors for Supply Manager Incident and Postmortem Report**

- **Date of Incident:** 08/30/2022
- **Time/Date Incident Started:** 08/30/2022, 3:01 am EDT
- **Time/Date Stability Restored:** 08/30/2022, 9:23 am EDT
- **Time/Date Incident Resolved:** 08/30/2022, 10:06 am EDT
- **Users Impacted:** Few
- **Frequency:** Intermittent
- **Impact:** Minor

**Incident description:** Customers that have Supply Manager enabled encountered an undefined error during login.

**Root Cause Analysis:** The SRE team responded to internal alerts triggered against the Supply Manager component and confirmed this problem was impacting specific virtual machines running the Ubuntu operating system. The Azure status page acknowledged an issue for Ubuntu 18.04, where the latest operating system updates resulted in DNS errors when accessing URL resources. The SRE team forwarded these details to the dedicated support for the managed service. This support team was then able to successfully restore service to the Supply Manager component for the ServiceChannel Platform.

Reference to the Azure incident: [https://app.azure.com/h/2TWN-VT0/05a585](https://app.azure.com/h/2TWN-VT0/05a585) (Virtual Machines - DNS errors when accessing resources)

**Actions Taken:**

1. The SRE team verified the issue was isolated to the managed service responsible for the Supply Manager application.
2. The SRE team identified a temporary workaround for the problem and forwarded the details to the vendor.

**Mitigation Measures:**

1. Recommendations to the vendor to add synthetic checks that could aid in earlier detection of these types of issues.
Status: Postmortem
Impact: Critical | Started At: Aug. 30, 2022, 8:42 a.m.
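A synthetic check of the kind recommended in the report above might, at its simplest, try to resolve the hostnames a service depends on and flag any that fail. The sketch below is only an illustration under stated assumptions: the function name `check_dns`, the port, and the example hostname are hypothetical, and the report does not describe how the vendor's checks are actually built. It uses the standard-library `socket.getaddrinfo` call, with the resolver passed in as a parameter so the check can be exercised without network access.

```python
import socket
from typing import Callable, Dict, Iterable


def check_dns(
    hostnames: Iterable[str],
    resolve: Callable[..., object] = socket.getaddrinfo,
    port: int = 443,  # assumed port; only used to form the lookup request
) -> Dict[str, bool]:
    """Return {hostname: True if it resolved, False if DNS resolution failed}."""
    results: Dict[str, bool] = {}
    for host in hostnames:
        try:
            resolve(host, port)
            results[host] = True
        except socket.gaierror:
            # getaddrinfo raises gaierror for address-resolution failures
            results[host] = False
    return results


def failed_hosts(results: Dict[str, bool]) -> list:
    """List the hostnames whose lookups failed, in sorted order."""
    return sorted(host for host, ok in results.items() if not ok)
```

Run on a schedule from the affected virtual machines, for example `failed_hosts(check_dns(["supply.example.com"]))` with a placeholder hostname, a non-empty result would be the signal to alert on.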
Description: This incident has been resolved.
Status: Resolved
Impact: None | Started At: June 30, 2022, 8:03 p.m.