Get notified about any outages, downtime or incidents for Files.com and 1800+ other cloud vendors. Monitor 10 companies, for free.
Outage and incident data over the last 30 days for Files.com.
OutLogger tracks the status of these components for Files.com:
Component | Status |
---|---|
Australia Region | Active |
Background Jobs, including Sync and Webhooks | Active |
Canada Region | Active |
Core Services / API | Active |
EU (Germany) Region | Active |
Files Tools | Active |
FTP/FTPS | Active |
Japan Region | Active |
Remote Server Integrations (Sync and Mount) | Active |
SFTP | Active |
Singapore Region | Active |
UK Region | Active |
USA Region | Active |
WebDAV | Active |
Web Interface | Active |
View the latest incidents for Files.com and check for official updates:
Description: On May 8th, 2023, at 1:39 PM PST, [Files.com](http://Files.com) received automated alerts that SFTP was entirely down in the US East region, and an incident was declared. The Incident Management Team (IMT) convened and immediately began investigating.
[Files.com](http://Files.com) released an initial Status Page posting on May 8th, 2023, at 1:47 PM PST stating: _**“SFTP Entirely Down – US East Region (Primary):** SFTP only: We are investigating a major outage of the SFTP service on [Files.com](http://Files.com) in our primary USA region. This incident does not impact other network services such as API, FTP, WebDAV, AS2, and others. If you have an urgent need to access [Files.com](http://Files.com), we recommend using FTP in lieu of SFTP. If you must connect via SFTP, you should be able to immediately connect (and access your existing files and account) using the hostname of our Canada region, which is [app-ca-central-1.files.com](http://app-ca-central-1.files.com). We will provide additional details as they become available. Customers with urgent questions are encouraged to contact our Customer Support team by email. Thank you for your patience.”_
The SFTP outage in the US East region was resolved on May 8th, 2023, at 1:47 PM PST, returning the platform to full functionality. [Files.com](http://Files.com) released a resolution Status Page posting on May 8th, 2023, at 1:51 PM PST stating: _“All services have been restored and are operating normally. We have resolved a major outage of the SFTP service on [Files.com](http://Files.com) in our primary USA region. This incident did not impact other network services such as API, FTP, WebDAV, AS2, and others. The SFTP service was down from 1:34 p.m. to 1:47 p.m., with a total downtime of 13 minutes, but only in the primary USA region. If you previously moved any workloads to another region in response to this incident, you are cleared to move those regional workloads back to the USA region. We will follow up with an Incident Report within ten (10) business days including the root cause and steps taken to address the root cause. If you need additional support, please do not hesitate to contact our Customer Support team by email or phone. Thanks for your support while we resolved this issue.”_
This incident occurred during a time period that also contained multiple other incidents, some of which overlapped. This report focuses specifically on the symptoms described here, but many customers who experienced this incident also experienced one of the other incidents.
This incident had two distinct parts and root causes. First, [Files.com](http://Files.com) deployed a change to its SFTP server as part of our overall project to dramatically improve the logging and handling of errors on SFTP. The deployment of that change crashed our SFTP servers in several of our smaller regions due to an “out of memory” condition. Our SFTP server is developed in Java, and anyone familiar with Java can tell you how sensitive Java can be to memory configuration settings. We immediately identified the issue with the Java memory settings and pushed a change to Chef, our infrastructure configuration management system, to tweak the SFTP memory settings and resolve the initial crash. The root cause of this first part was [Files.com](http://Files.com)’s failure to monitor Java runtime parameters such as memory usage to defend against an out of memory condition. We have added additional monitoring around Java memory usage and are optimistic that this situation will be avoided in the future.
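As a rough illustration of the kind of memory-usage check described above, the following sketch flags servers whose heap usage crosses a threshold. It is not Files.com's monitoring: the `HeapSample` shape, the host names, the byte counts, and the 0.85 threshold are all assumptions made for the example, and a real setup would collect the readings from the JVM itself.

```python
from dataclasses import dataclass


@dataclass
class HeapSample:
    """One reading of a JVM's heap, in bytes (hypothetical shape)."""
    host: str
    used: int
    max_heap: int


def heap_alerts(samples, warn_ratio=0.85):
    """Return (host, ratio) pairs whose heap usage is at or above warn_ratio."""
    alerts = []
    for s in samples:
        if s.max_heap <= 0:
            # A non-positive maximum makes the ratio meaningless; skip it.
            continue
        ratio = s.used / s.max_heap
        if ratio >= warn_ratio:
            alerts.append((s.host, round(ratio, 2)))
    return alerts


# Hypothetical readings: one small-region server close to its heap limit.
samples = [
    HeapSample("sftp-small-region-1", used=950, max_heap=1000),
    HeapSample("sftp-primary-1", used=400, max_heap=1000),
]
print(heap_alerts(samples))
```

Alerting on the ratio rather than on absolute bytes lets the same rule cover regions whose servers are sized differently.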
One benefit of the [Files.com](http://Files.com) architecture as compared with many of our peers is that on [Files.com](http://Files.com), SFTP is a completely isolated subsystem, so this incident did not impact other network services such as FTP, AS2, WebDAV, or API.
Unfortunately, when we deployed the configuration change via Chef, we inadvertently deployed an unrelated configuration change at the same time that had been previously merged but not deployed to the SFTP servers. This happened because we use one unified Chef repository for server configuration, where certain recipes can be shared by different server types. That configuration change introduced an error into the upstream communication with our API, resulting in an inability to connect via SFTP for certain customers. After investigating the issue, we were able to identify the bad configuration change and revert it. The root cause of the second part is [Files.com](http://Files.com)’s failure to operate adequate change management procedures to prevent an unintended change from being deployed.
Our incident management team was quite disappointed to learn about the chain of events that led to this incident. We have already improved our internal synthetic monitoring systems with the ability to detect the situation that occurred during this incident and alert on it immediately. Additionally, as a result of this incident, we are implementing major changes to our change management procedures designed to prevent this sort of configuration management error from happening again. Those changes are fairly complicated and will require a great deal of internal development. As such, they will likely not be deployed until the middle of Q3. It is our goal to have them implemented before our next SOC 2 Type II observation period (which runs from Q2-Q3 2023) and documented in our next SOC 2 Type II report.
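The second failure mode, a merged-but-undeployed change riding along with an intended one, can be illustrated with a small pre-deploy check. This is only a sketch under stated assumptions: the commit records, their `applies_to` field, and the IDs are hypothetical, the code is plain Python rather than any Chef API, and the report does not describe the actual change management procedures being built.

```python
def pending_changes(commits, deployed_index, server_type):
    """List commits merged after the last deploy that affect server_type.

    commits is ordered oldest to newest; deployed_index is the position of
    the last commit already deployed to that server type.
    """
    return [
        c for c in commits[deployed_index + 1:]
        if server_type in c["applies_to"]
    ]


def unexpected_changes(commits, deployed_index, server_type, intended_ids):
    """Return IDs of pending commits that the operator did not intend to ship."""
    return [
        c["id"]
        for c in pending_changes(commits, deployed_index, server_type)
        if c["id"] not in intended_ids
    ]


# Hypothetical history of a shared configuration repository.
commits = [
    {"id": "a1", "summary": "baseline", "applies_to": {"sftp", "web"}},
    {"id": "b2", "summary": "shared recipe: upstream API settings",
     "applies_to": {"sftp", "web"}},
    {"id": "c3", "summary": "web-only tweak", "applies_to": {"web"}},
    {"id": "d4", "summary": "sftp: adjust memory settings",
     "applies_to": {"sftp"}},
]

# The operator intends to ship only d4 to the SFTP servers.
print(unexpected_changes(commits, 0, "sftp", intended_ids={"d4"}))
```

A deploy tool could refuse to proceed, or ask for explicit confirmation, whenever this list is non-empty.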
On a more general note, we have added a considerable amount of sophistication to our monitoring and routing systems as a result of the several incidents that occurred in May, and we are adding more. These improvements amount to over 5,000 lines of code and we are optimistic that they will reduce the frequency and impact of incidents in the future. We hope to share more about the improvements in our next SOC 2 Type II report. We greatly appreciate your patience and understanding as we resolved this issue. If you need additional assistance or continue to experience issues, please contact our Customer Support team.
Status: Postmortem
Impact: Critical | Started At: May 8, 2023, 8:47 p.m.
Description: On May 5th, 2023 at 11:39 AM, [Files.com](http://Files.com) received internal monitoring alerts of issues related to DNS, which resulted in an incident being declared. The Incident Management Team (IMT) convened and immediately began investigating. The DNS resolution issue was resolved on May 5th, 2023, at 3:20 PM, returning the platform to full functionality.
[Files.com](http://Files.com) posted an initial investigation notice to its Status Page on May 5th, 2023, at 12:00 PM PST stating: _“All regions: We are investigating a major network outage on [Files.com](http://files.com/) affecting [Files.com](http://files.com/) services in all regions. This outage is affecting our gateway networking in regions other than USA, and our Core Services are running correctly. At this time, we are also investigating elevated error rates in our primary USA region. We will provide additional details as they become available. Customers with urgent questions are encouraged to contact our Customer Support team by email. Thank you for your patience.”_
[Files.com](http://Files.com) posted an updated investigation notice to its Status Page on May 5th, 2023, at 12:58 PM PST stating: _“We are continuing to investigate this issue. We will post an update as soon as the issue has been identified and a fix is being implemented. If you need additional assistance, please do not hesitate to contact our Customer Support team by email. Thank you for your continued patience.”_
[Files.com](http://Files.com) posted a second updated investigation notice to its Status Page on May 5th, 2023, at 2:03 PM PST stating: _“We are continuing to investigate the DNS issue causing [Files.com](http://files.com/) sites to be inaccessible for some users. We will post an update as soon as the issue has been identified and a fix is being implemented. If you need additional assistance, please do not hesitate to contact our Customer Support team by email. Thank you for your continued patience.”_
[Files.com](http://Files.com) released a resolution Status Page posting on May 5th, 2023, at 3:30 PM PST stating: _“All services have been restored and are operating normally. We have identified and resolved the root cause underlying these DNS issues. We will follow up with an Incident Report within one business day including the root cause and steps taken to address the root cause. If you need additional support, please do not hesitate to contact our Customer Support team by email or phone.”_
This incident started when a prominent FinTech company sent a fraudulent and erroneous report accusing [Files.com](http://Files.com) of cybercrime to GoDaddy, the .COM domain name registrar where the [Files.com](http://Files.com) domain name is registered. This FinTech company is a [Files.com](http://Files.com) customer, and the report was sent in error. Upon receipt of the accusation, GoDaddy chose to immediately suspend the [Files.com](http://Files.com) domain without providing [Files.com](http://Files.com) any advance notice or warning.
Once the domain suspension was identified, we worked with GoDaddy and the source of the erroneous cybercrime report to escalate and respond to the report, which ultimately resulted in GoDaddy removing the suspension from the [Files.com](http://Files.com) domain and restoring service. Unfortunately, escalating this situation within GoDaddy took a disappointingly long time and we ended up with 3+ hours of downtime. We are thoroughly disappointed with the way GoDaddy handled the situation.
You may be questioning why we use GoDaddy as our domain registrar, and we want to speak to that briefly. GoDaddy is a customer of ours and a neighbor of ours in Scottsdale, AZ, and we have a generally good relationship with the company.
With that said, we were already in the process of moving all of our domain registrations to CSC Domains, an enterprise-focused and security-focused domain registrar whose business is structured to prevent exactly these sorts of mishaps. Although we have our GoDaddy account secured as strongly as possible, including with two-factor authentication, CSC Domains offers a much stronger level of enterprise security and protections, and domain registrar risks are something we identified in a previous meeting of our risk committee. Unfortunately, we missed the mark on timing, because we ended up having an incident with GoDaddy prior to completing that migration.
As of Monday, May 8th, 2023, we are actively migrating the [Files.com](http://Files.com) domain (and all other domains that we own) to CSC. The transfer process, which is controlled by GoDaddy, can take up to a week to complete. Ultimately, the root cause of this incident was [Files.com](http://Files.com)’s use of GoDaddy as its domain registrar and our failure to complete the project to switch to CSC Domains in a more timely manner.
We recognize the impact that this incident has had on our customers. We greatly appreciate your patience and understanding as we resolved this issue. If you need additional assistance or continue to experience issues, please contact our Customer Support team.
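For readers who want to watch for this class of failure on their own domains, here is a minimal sketch that scans registration (WHOIS) text for EPP hold statuses. It assumes a suspension is expressed as `clientHold` or `serverHold`, the standard EPP codes that take a domain out of DNS; the report does not say which mechanism GoDaddy used. The sample text and the `example.com` domain are hypothetical, and fetching the WHOIS record is left out.

```python
HOLD_STATUSES = {"clienthold", "serverhold"}


def hold_statuses(whois_text):
    """Return any EPP hold statuses found on 'Domain Status' lines."""
    found = []
    for line in whois_text.splitlines():
        # Split on the first colon only, so URLs in the value stay intact.
        key, _, value = line.partition(":")
        if key.strip().lower() != "domain status" or not value.strip():
            continue
        status = value.split()[0]
        if status.lower() in HOLD_STATUSES:
            found.append(status)
    return found


# Hypothetical WHOIS excerpt for illustration only.
sample = """\
Domain Name: EXAMPLE.COM
Domain Status: clientTransferProhibited https://icann.org/epp#clientTransferProhibited
Domain Status: clientHold https://icann.org/epp#clientHold
"""
print(hold_statuses(sample))
```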
Status: Postmortem
Impact: Major | Started At: May 5, 2023, 7 p.m.
Description: On May 2nd, 2023, at 12:40 PM PST, [Files.com](http://Files.com) received automated alerts of elevated error rates on web services, which resulted in an incident being declared. The Incident Management Team (IMT) convened and immediately began investigating.
[Files.com](http://Files.com) released an initial Status Page posting on May 2nd, 2023, at 1:11 PM PST stating: “**US Region Only: Web Service Elevated Error Rates:** _US Web services only: We are investigating elevated error rates on the web service on [Files.com](http://Files.com) in the US region. This is causing preview delays in the web interface. This incident does not impact other network services such as API, FTP, WebDAV, AS2, and others, nor does it impact regions other than US. At this time, we believe that all network services are currently up in our other regional locations.”_
The issue was resolved on May 2nd, 2023 at 1:04 PM PST, returning the platform to full functionality. [Files.com](http://Files.com) released a resolution Status Page posting on May 2nd, 2023, at 1:18 PM PST stating: _“All services have been restored and are operating normally. All web services should be operating as normal. The issue with preview processing began at 12:35 PDT and was resolved completely by 1:04 PDT.”_
This incident started when a deadlock occurred in one of [Files.com](http://Files.com)’s backend job processing systems, specifically the system that generates image and PDF previews of large images and documents for web viewing. A recent code change resulted in the system getting into a state where it locked up and did not process preview generation on 1 out of 6 backend servers. As a result of “backflow” caused by very high error rates, other jobs such as syncs were delayed by 5 minutes on two separate occasions.
The root cause of this incident was a failure of [Files.com](http://Files.com)’s internal job scheduling system to properly route around the failed preview worker and prevent its failure from causing broader impact. Ultimately this was caused by a design failure in the internal job scheduling system, which we have now redesigned to avoid this type of issue. (See next paragraph.) A contributing cause was the failure of the preview worker itself, which was caused by [Files.com](http://Files.com)’s failure to properly test the recent code change in a high load situation.
As a result of this incident and several other recent incidents, [Files.com](http://Files.com) worked on dramatic improvements to its internal job scheduling code during the last week of April and first week of May, and those improvements have been tested in staging and are now in production. These improvements provide multiple new protection mechanisms to prevent issues with specific customers, job types, or regions from “backflowing” and impacting other customers, job types, or regions. Extensive review and testing was conducted by [Files.com](http://Files.com) staff to ensure this resolution, and we have already taken steps internally to prevent this issue from recurring in the future.
We greatly appreciate your patience and understanding as we resolved this issue. If you need additional assistance or continue to experience issues, please contact our Customer Support team.
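The idea of routing around a failed worker can be sketched as a dispatcher that stops sending jobs to any worker with too many consecutive failures. This is an illustrative assumption, not the redesigned Files.com scheduler: the `Worker` class, the failure limit of 3, and the worker names are invented for the example, and a deadlocked worker would in practice have to be detected through timeouts that are then recorded as failures.

```python
class Worker:
    """Tracks consecutive failures for one backend worker (hypothetical)."""

    def __init__(self, name, failure_limit=3):
        self.name = name
        self.failure_limit = failure_limit
        self.consecutive_failures = 0

    @property
    def healthy(self):
        return self.consecutive_failures < self.failure_limit

    def record(self, ok):
        # Any success resets the streak; a failure or timeout extends it.
        self.consecutive_failures = 0 if ok else self.consecutive_failures + 1


def pick_worker(workers, job_number):
    """Round-robin over healthy workers only; None if none are healthy."""
    healthy = [w for w in workers if w.healthy]
    if not healthy:
        return None
    return healthy[job_number % len(healthy)]


# Six preview workers, one of which fails three times in a row.
workers = [Worker(f"preview-{i}") for i in range(1, 7)]
for _ in range(3):
    workers[2].record(ok=False)

assigned = [pick_worker(workers, n).name for n in range(10)]
print(assigned)
```

This covers only the routing half of the problem; the per-customer, per-job-type, and per-region protections mentioned in the report would need separate queues or limits on top of it.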
Status: Postmortem
Impact: Major | Started At: May 2, 2023, 8:11 p.m.