Last checked: 3 minutes ago
Get notified about any outages, downtime or incidents for Signiant and 1800+ other cloud vendors. Monitor 10 companies, for free.
Outage and incident data over the last 30 days for Signiant.
OutLogger tracks the status of these components for Signiant:
Component | Status |
---|---|
Flight | Active |
Flight CLI Transfers | Active |
Flight Gateway Transfers | Active |
Management Interface | Active |
Flight Deck | Active |
APIs | Active |
Console | Active |
Transfers | Active |
Jet | Active |
APIs | Active |
Console | Active |
Transfers | Active |
Media Shuttle | Active |
CloudSpex | Active |
Cloud Transfers | Active |
Console | Active |
Management API | Active |
Portal Interface | Active |
Transfer API | Active |
v1 | Active |
v2 | Active |
View the latest incidents for Signiant and check for official updates:
Description: Between 13:25 EDT and 13:45 EDT intermittent connection errors to Media Shuttle transfer servers in Azure were observed. Infrastructure was immediately scaled to address the observed connection issues and error rates returned to normal. No other cloud object storage transfers were affected.
Status: Resolved
Impact: None | Started At: June 3, 2024, 5:30 p.m.
Description: Between 2:58 PM EDT and 5:30 PM EDT on Tuesday, June 13th, service interruptions within AWS (us-east-1 region) caused service degradation for multiple Signiant services. (AWS has not yet posted a post-event summary for this incident, but it should be available at some point in the future: [https://aws.amazon.com/premiumsupport/technology/pes/](https://aws.amazon.com/premiumsupport/technology/pes/))
At 2:58 PM EDT our monitoring alerted us to a problem with Signiant service logins. The Signiant Status page was updated to this effect. At this point we began failing over impacted services to our backup region. Additional alerts notified us that the problem was more widespread than just console logins. The Signiant Status page was updated to reflect that additional services were affected. At 3:08 PM EDT AWS posted a message on their status page indicating an issue with Lambda in the us-east-1 region.
Once failover of impacted services was complete, Signiant Console logins recovered, but we continued to see increased error rates when attempting to browse Media Shuttle share portals. Investigation continued on that front and uncovered a configuration error in a microservice involved in portal browsing that impacted operation in the failover region. Once this configuration error was resolved, share portal browsing recovered.
It should be noted that ongoing transfers were not affected throughout this incident, but given the impact on login and share portal browsing, it was not possible to start new transfers under some circumstances. Based on information obtained during the investigation, customers may have also experienced intermittent issues with Jet Hot Folder jobs between 3:00 PM EDT and 4:40 PM EDT. The impact of this was possible delays in processing the hot folder events, but due to built-in retries, all events were eventually processed and transfers completed.
Signiant SaaS services are designed to withstand major outages in cloud provider infrastructure without customer impact. Our services run in multiple regions and multiple availability zones within each region. Some services are active in multiple regions at the same time (e.g. transfer services) and others fail over between regions when there is an underlying cloud provider issue. With this specific incident we experienced issues with regional failover for some of our services, and although our services recovered more rapidly than the underlying AWS services, we have identified several opportunities for improvement. In particular, we use internal tooling to automate failover between regions, and during this incident the effectiveness of our tooling was impacted by a regional dependency. Accordingly, we had to revert to manual failover of our services, which increased the time required for failover. While we regularly test service failover, the impact on time to failover for this specific failure scenario was not covered in our testing. Going forward, we are making changes to ensure that all tooling that impacts time to failover is appropriately redundant and that we incorporate scenarios that might impact the effectiveness of our tooling in our failover testing.
Status: Postmortem
Impact: Minor | Started At: June 13, 2023, 7:02 p.m.
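The built-in retries credited above with eventually processing every Jet Hot Folder event can be illustrated with a minimal sketch. This is not Signiant's implementation, which the postmortem does not describe; the function name `process_with_retries`, its parameters, and the exponential backoff schedule are all assumptions chosen for illustration.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def process_with_retries(
    handler: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call handler until it succeeds or max_attempts is exhausted.

    Hypothetical helper: the name, defaults, and backoff policy are
    illustrative assumptions, not taken from any Signiant product.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return handler()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            # Exponential backoff: base_delay, 2x, 4x, ...
            sleep(base_delay * 2 ** (attempt - 1))
    raise ValueError("max_attempts must be at least 1")
```

With a policy like this, a dependency outage shorter than the total retry window shows up as delayed processing rather than lost events, which matches the behavior the postmortem reports.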
Description: This incident has been resolved. Transfers were unaffected.
Status: Resolved
Impact: Minor | Started At: May 30, 2023, 3:29 p.m.
Description: Between 1:39 PM EST and 2:28 PM EST on Thursday, February 23rd, service degradation of AWS IoT (in the us-east-1 region) caused intermittent connectivity issues for Signiant SDCX servers. Signiant uses AWS IoT to exchange messages with SDCX servers without the need for inbound HTTP access to the SDCX Server. Between 1:44 PM EST and 1:53 PM EST our monitoring alerted us to a problem with browsing share portals backed by SDCX server-based storage. During the course of the investigation into the above failures, our monitoring also alerted us to a potential issue with Jet transfers. The AWS status page did not indicate any problems with AWS services at this time. In an attempt to mitigate the issues highlighted by our monitoring, we opted to fail over services that appeared to be showing increased error rates to another region while we continued to investigate. In addition, we made the decision to re-route IoT traffic to another region as well. These changes were made between 2:17 PM EST and 2:21 PM EST. We immediately began to see improvement after the change to the IoT endpoint, and subsequent investigation of the AWS status page showed an incident posted there for IoT in the us-east-1 region. Details as follows:
```
[10:57 AM PST] We are investigating increased API error rates in the US-EAST-1 Region.
[11:22 AM PST] We have identified the root cause for the elevated API rates and latency in the US-EAST-1 Region and are working towards recovery.
[11:34 AM PST] Between 10:39 AM and 11:28 AM PST, we experienced elevated API errors and latency for Publish operation in the US-EAST-1 Region. The issue has been resolved and the service is operating normally.
```
With all of our monitoring systems recovering after the above change to the IoT endpoint, we continued to closely monitor services and eventually closed out this incident at 3:11 PM EST.
Signiant SaaS services are designed to withstand major outages in cloud provider infrastructure; however, we do rely on AWS services for connectivity with SDCX servers, and in this case, those services were impacted. Going forward, we are investigating enhancements that allow us to more quickly react to issues with the AWS IoT service.
Status: Postmortem
Impact: Minor | Started At: Feb. 23, 2023, 7:21 p.m.
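The timeline above mixes two time zones: Signiant's narrative uses EST, while the quoted AWS status messages use PST. A quick check, sketched here with fixed UTC offsets (an assumption that holds in February, when neither zone observes daylight saving time), shows that the AWS window of 10:39 AM to 11:28 AM PST lines up with the 1:39 PM to 2:28 PM EST window in the description. The helper name `pst_to_est` is hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets are an assumption valid for Feb. 23, 2023 (standard time).
PST = timezone(timedelta(hours=-8), "PST")
EST = timezone(timedelta(hours=-5), "EST")


def pst_to_est(hour: int, minute: int) -> datetime:
    """Convert a PST wall-clock time on Feb. 23, 2023 to EST."""
    return datetime(2023, 2, 23, hour, minute, tzinfo=PST).astimezone(EST)


start = pst_to_est(10, 39)  # start of the AWS-reported window
end = pst_to_est(11, 28)    # end of the AWS-reported window
print(start.strftime("%H:%M %Z"), "-", end.strftime("%H:%M %Z"))
# 13:39 EST - 14:28 EST
```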