Outage and incident data over the last 30 days for SYNAQ.
OutLogger tracks the status of these components for SYNAQ:
Component | Status |
---|---|
SYNAQ Archive | Active |
SYNAQ Branding | Active |
SYNAQ Cloud Mail | Active |
SYNAQ Continuity | Active |
SYNAQ Q Portal | Active |
SYNAQ Securemail | Active |
View the latest incidents for SYNAQ and check for official updates:
Description: **Summary and Impact to Customers**

On Tuesday 22nd June 2021 from 14:48 until 14:55 on Monday 28th June 2021, SYNAQ Cloud Mail experienced an intermittent, degraded mail authentication incident. As a result, certain users received authentication pop-up messages when trying to log in via HTTPS, POP3/S, IMAP/S and SMTP/S, and experienced slow access to webmail.

**Root Cause and Solution**

On the 22nd of June 2021 at 14:48, SYNAQ Cloud Mail began to experience incoming mail delays. The delay occurred at the Zimbra MTA (Mail Transfer Agent) layer. Once identified, we attempted a series of fixes to resolve the incident. At 15:30, a change was made to increase the processing threads on the MTA servers from 100 to 150, so that each MTA could handle more mail at any one time and work through the mail building up in the queue. This did not have the desired effect. At 16:30, a new Exim server was added to the MTA cluster (a project that had been scheduled for the coming weeks, as detailed in the root cause section below), and this server was able to process mail with no delays. We ceased new mail delivery to the existing MTAs, one at a time, until they had cleared their queues. Normal mail flow was restored by 16:51.

On the 23rd of June 2021 at 09:09, mail delays recurred, coupled with a select group of users receiving authentication failure pop-up messages when trying to log in to their mailboxes. At 09:52, debugging was performed on the anti-virus functions on the MTA servers, as this appeared to be where the delay was occurring, and configuration changes were made to timeout settings and processing times. Unfortunately, this did not have the desired effect. At 10:15, replacement of the remaining Cloud Mail MTAs with new Exim servers commenced, as the one Exim server already in production was processing mail without delay; this was fully completed by 20:30. At 17:14, mail delays and authentication had recovered.

On the 24th of June 2021 at 09:41, a select group of users were receiving authentication pop-up messages when trying to log in and experienced slow access to mail. At 10:17, we moved our focus from the MTAs to the LDAP servers, as mail flow was no longer affected after the change to the new Exim servers. A data dump of the master LDAP database was performed and reloaded on all the replicas to rule out memory page fragmentation (a performance-inhibiting side effect) across these LDAP servers. At 13:00, file system errors were discovered on the master LDAP server. All Cloud Mail servers were adjusted to point to the secondary master, and a file system check and repair was run on the primary LDAP master. While the server was down, its memory and CPU resources were increased. At 13:10, mail authentication and slow access had recovered.

On the 25th of June 2021 at 09:22, a select group of users were receiving authentication pop-up messages when trying to log in and experiencing slow access to mail. At 10:35, TCP connection and timeout settings were adjusted on all the LDAP servers. At 12:35, connection tracking was disabled on the load balancer, to ensure that if a particular LDAP replica developed a problem, connections would move seamlessly to another replica. Mail authentication and slow access then recovered.
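As a rough illustration of the replica-level checking this LDAP debugging implies (the hostnames are hypothetical placeholders and this is a sketch, not SYNAQ's actual tooling), a short probe using the Python ldap3 library can time a bind and a root-DSE lookup against each replica to spot an overloaded or unresponsive node; the incident timeline continues below.

```python
# Minimal sketch: time a bind and a trivial lookup against each LDAP replica
# to spot an overloaded or unresponsive node. Hostnames are hypothetical.
import time
from ldap3 import Server, Connection, ALL, BASE

REPLICAS = ["ldap-replica1.example.net", "ldap-replica2.example.net"]

def probe(host: str, timeout: float = 5.0) -> None:
    started = time.monotonic()
    try:
        server = Server(host, port=389, get_info=ALL, connect_timeout=timeout)
        # Anonymous bind; a production probe would bind with a monitoring DN.
        conn = Connection(server, auto_bind=True, receive_timeout=timeout)
        conn.search("", "(objectClass=*)", search_scope=BASE)  # root DSE lookup
        conn.unbind()
        print(f"{host}: OK in {time.monotonic() - started:.3f}s")
    except Exception as exc:  # slow or failed replicas show up here
        print(f"{host}: FAILED after {time.monotonic() - started:.3f}s ({exc})")

for replica in REPLICAS:
    probe(replica)
```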
On the 26th and 27th of June, we experienced no further recurrences of the issues. On the 28th of June at 09:30, a select group of users were receiving authentication pop-up messages when trying to log in and experienced slow access to mail. At 10:00, two new LDAP replicas were built to be added to the cluster. At 11:03, the global address list feature was turned off (for classes of service with large domains that did not need this feature) to reduce traffic to the LDAP servers. At 13:05, we deleted 30 data sources (external account configurations) that were stored in LDAP but were showing errors during LDAP replication. At 14:30, with the two new LDAP servers in place, each unique component of Cloud Mail (stores, MTAs and proxies) was pointed to its own unique set of LDAP replicas. At 14:55, mail authentication and slow access had recovered.

The root cause of this event was a project we initiated last year to replace standard Zimbra MTAs with custom-built Exim MTAs, the purpose of which was to vastly increase the security and delivery of clients' mail. The initial project phase (last year) replaced the outbound servers, with the inbound servers scheduled for July. A test inbound server was added, and this triggered the issues described above; replacing all of the remaining MTAs with the new inbound servers in an attempt to resolve the issue only exacerbated the problem. The problem introduced was that servers native to Zimbra establish persistent connections to the LDAP servers, whereas the new MTAs, introduced to reduce load and traffic to the LDAP servers, establish short-term connections. The load balancer handled both connection patterns in the same way, which overloaded a single LDAP server and then affected the rest in a cascading manner as the load was redistributed. To resolve this, two different load balancer IP addresses were configured, each with its own separate LDAP servers behind it: one to manage persistent connections and the other to manage short-term connections. The relevant servers were then pointed to the load balancer IP that suits how they communicate and connect to LDAP.

**Remediation Actions**

• Two additional LDAP replicas have been built and added to the LDAP cluster.
• Two different load balancer IP addresses have been configured, each with its own separate LDAP servers behind it: one to manage persistent connections and one to manage short-term connections. The relevant servers have been pointed to the load balancer IP that suits how they communicate with LDAP (a minimal sketch of this split follows this incident entry).
• A third load balancer IP will be added to improve LDAP redundancy. This will allow store servers to attempt a new connection rather than remaining connected to an LDAP server that is no longer responding.
Status: Postmortem
Impact: Minor | Started At: June 22, 2021, 12:48 p.m.
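As a rough illustration of the connection-pattern split described in the root cause above (the VIP hostnames, bind DN and lookup are hypothetical, not SYNAQ's actual configuration): Zimbra-native components bind once and reuse the connection, while the Exim-style MTAs open a fresh connection per lookup, so each style is pointed at its own load-balancer address.

```python
# Sketch of the two LDAP access patterns behind separate load-balancer VIPs.
# Hostnames, bind DN, password and filter are hypothetical placeholders.
from ldap3 import Server, Connection, SUBTREE

PERSISTENT_VIP = "ldap-persistent.example.net"  # VIP for long-lived (Zimbra-native) clients
SHORT_TERM_VIP = "ldap-shortterm.example.net"   # VIP for per-lookup (Exim MTA) clients

BIND_DN = "uid=lookup,ou=system,dc=example,dc=net"
BIND_PW = "secret"
BASE_DN = "dc=example,dc=net"


def persistent_client() -> Connection:
    """Bind once and reuse the connection for many lookups (store/proxy style)."""
    conn = Connection(Server(PERSISTENT_VIP, port=389), user=BIND_DN,
                      password=BIND_PW, auto_bind=True)
    return conn  # caller keeps this open and reuses it


def short_term_lookup(mail: str) -> list:
    """Open, query, close: one short-lived connection per lookup (Exim MTA style)."""
    conn = Connection(Server(SHORT_TERM_VIP, port=389), user=BIND_DN,
                      password=BIND_PW, auto_bind=True)
    try:
        # Real code should escape user-supplied filter input.
        conn.search(BASE_DN, f"(mail={mail})", search_scope=SUBTREE,
                    attributes=["mail"])
        return list(conn.entries)
    finally:
        conn.unbind()


if __name__ == "__main__":
    long_lived = persistent_client()
    long_lived.search(BASE_DN, "(mail=user@example.net)", attributes=["mail"])
    print(long_lived.entries)
    print(short_term_lookup("user@example.net"))
    long_lived.unbind()
```

The point of routing each pattern to its own VIP, as the remediation describes, is that the balancer can treat persistent and short-term connections differently instead of letting one pattern overload a single replica behind a shared address.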
Description: Dear Clients, The SYNAQ Securemail issue has been resolved and the service has returned to optimal functionality.
Status: Resolved
Impact: Minor | Started At: May 14, 2021, 10:42 a.m.
Description: **Summary and Impact to Customers**

On Saturday 8th May 2021 from 06:00 to 12:28, SYNAQ Cloud Mail experienced a mail availability incident. As a result, users were unable to authenticate and could not access the platform.

**Root Cause and Solution**

The root cause of this event was a scheduled change on the morning of the 8th of May from 00:00 to 06:00, intended to improve the overall switch redundancy in the storage network. The change was unsuccessful and had to be rolled back. Once the change was rolled back, not all of the hosts could see all of the relevant storage paths. This prevented many of the VMs from starting up correctly, so the environment could not come up completely. To resolve this issue, the port channels between the two core switches had to be brought back up. Once this was done, communication was restored from all servers to all of the relevant storage, all VMs were successfully restored, and users were able to authenticate and access their mail once again. Upon further investigation, it was determined that the change to stack the core switches to increase switch redundancy took down the existing port channels between the core switches. Once these were brought back up, the switch configuration was returned to its original state and all Cloud Mail functionality was restored.

**Remediation Actions**

• A full audit of the change has been conducted and the reasons for its failure have been identified.
• Changes have been made to improve our rollback plan process to ensure crucial steps are not missed going forward (a minimal sketch of a pre/post-change path check follows this incident entry).
Status: Postmortem
Impact: Critical | Started At: May 8, 2021, 4:20 a.m.
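As a minimal sketch of the kind of rollback validation the remediation points to, and only under the assumption of IP-reachable storage portals (the addresses, the port and the iSCSI-style layout are illustrative, not SYNAQ's environment), a host-side script can confirm that every expected storage path is reachable before a change window is signed off.

```python
# Sketch of a pre/post-change check that a host can still reach every expected
# storage path. Assumes IP-reachable storage portals (e.g. iSCSI on TCP 3260);
# the addresses below are hypothetical placeholders.
import socket

STORAGE_PORTALS = [
    ("10.0.10.11", 3260),
    ("10.0.10.12", 3260),
    ("10.0.20.11", 3260),
    ("10.0.20.12", 3260),
]


def reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to the portal succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_paths() -> bool:
    """Check every portal; report and fail if any path is missing."""
    ok = True
    for host, port in STORAGE_PORTALS:
        if reachable(host, port):
            print(f"path to {host}:{port} OK")
        else:
            print(f"path to {host}:{port} MISSING")
            ok = False
    return ok


if __name__ == "__main__":
    # Run before the change, after the change, and again after any rollback;
    # do not close the window until all paths are reported OK.
    raise SystemExit(0 if check_paths() else 1)
```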