Outage and incident data over the last 30 days for Zonos.
OutLogger tracks the status of these components for Zonos:
Component | Status |
---|---|
Classify | Active |
Dashboard | Active |
International Checkout | Active |
Quoter | Active |
Landed Cost | Active |
Landed Cost API | Active |
Landed Cost API (GraphQL) | Active |
Landed Cost API (Legacy) | Active |
Plugins | Active |
BigCommerce Duty Tax | Active |
Magento Duty Tax | Active |
Salesforce Duty Tax | Active |
Shopify Checkout | Active |
Shopify Duty Tax | Active |
View the latest incidents for Zonos and check for official updates:
Description: **What products were affected and what was the impact?**

All Zonos GraphQL services. Impact: CRITICAL

**What timeframe did this issue occur?**

| **Date** | **Time** |
| --- | --- |
| Mar 31, 2023 | Starting at 18:00 MDT |
| Apr 1, 2023 | Ending at 12:45 MDT |

**How was the issue detected?**

On the morning of April 1, Shopify GraphQL customers began noticing issues with landed cost quotes and notified CS, who then escalated the issue to the Engineering team.

**What functionality was affected?**

All GraphQL services in the Zonos Cloud were impacted.

**What problems did this cause?**

Merchants on GraphQL were unable to receive shipment ratings and landed cost quotes.

**What was the resolution of the problem and steps that are being taken for continued follow-up?**

After being notified of the issue, we worked quickly to switch GraphQL merchants over to our REST endpoints, which were not experiencing any issues. We then identified the root cause of the issue with GraphQL: a code deployment that broke event serialization and caused synchronous events to fail. A weakness in synchronous event handling then caused the event failure to cascade to the cluster level. We immediately released a fix to prevent future occurrences.

**What mitigation solutions will we put in place to prevent this issue from occurring in the future?**

Our monitoring and notification channels for production server clusters were focused on unhealthy target groups and container failures. Due to the nature of the failure, we didn't receive notifications for either. This is a clear gap in monitoring coverage at a cluster-wide level.

To make sure this never happens again, we are configuring task-based monitoring outside of the clusters where we will:

* query each service in the cluster directly for the minimum number of tasks that should be running and the actual number of tasks that are running,
* make mock requests to each service to make sure they are returning correct responses, and
* direct these notifications to our alerting platform with "on-call" rotations to make sure there are no lapses in coverage.

We have also improved the resiliency of our event system, such that even if there were a future issue with event serialization, it would have no effect upon our public GraphQL services.
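The task-count check described above, comparing the minimum number of tasks a service should run against the number actually running, can be sketched roughly as follows. This is a minimal illustration only, not Zonos's implementation: the `ServiceStatus` type, the `find_unhealthy` function, and the service names are hypothetical, and in practice the counts would come from whatever API the cluster orchestrator exposes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceStatus:
    """Snapshot of one service as reported by the cluster (hypothetical shape)."""
    name: str
    minimum_tasks: int   # minimum number of tasks that should be running
    running_tasks: int   # number of tasks actually running


def find_unhealthy(statuses: list[ServiceStatus]) -> list[str]:
    """Return one alert message per service running fewer tasks than its minimum."""
    alerts = []
    for status in statuses:
        if status.running_tasks < status.minimum_tasks:
            alerts.append(
                f"{status.name}: {status.running_tasks}/{status.minimum_tasks} tasks running"
            )
    return alerts


if __name__ == "__main__":
    # Made-up snapshot: one service has no running tasks, the other is healthy.
    snapshot = [
        ServiceStatus("graphql-gateway", minimum_tasks=3, running_tasks=0),
        ServiceStatus("rest-api", minimum_tasks=2, running_tasks=2),
    ]
    for message in find_unhealthy(snapshot):
        print(message)
```

Running a check like this from outside the cluster matters because a cluster-wide failure can also take down any monitoring that lives inside it. The resulting messages would then be routed to the on-call alerting platform.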
Status: Postmortem
Impact: Critical | Started At: April 1, 2023, 6:27 p.m.
Description: There was a problem with the DNS on a few of the Landed Cost API servers causing a partial outage. The problem has been identified and resolved.
Status: Resolved
Impact: Minor | Started At: Feb. 20, 2023, 9:36 p.m.
Description: There was a problem with the DNS on a few of the Landed Cost API servers causing a partial outage. The problem has been identified and resolved.
Status: Resolved
Impact: Minor | Started At: Feb. 20, 2023, 9:16 p.m.