Last checked: 4 minutes ago
Get notified about any outages, downtime or incidents for ShipHawk and 1800+ other cloud vendors. Monitor 10 companies, for free.
Outage and incident data over the last 30 days for ShipHawk.
Join OutLogger to be notified when any of your vendors or the components you use experience an outage. It's completely free and takes less than 2 minutes!
OutLogger tracks the status of these components for ShipHawk:
Component | Status |
---|---|
ShipHawk Application | Active |
ShipHawk Website | Active |
Carrier/3PL Connectors | Active |
DHL eCommerce | Active |
FedEx Web Services | Active |
LTL / Other Carrier Web Services | Active |
UPS Web Services | Active |
USPS via Endicia | Active |
USPS via Pitney Bowes | Active |
ShipHawk APIs | Active |
Shipping APIs | Active |
WMS APIs | Active |
ShipHawk Application | Active |
WMS | Active |
ShipHawk Instances | Active |
sh-default | Active |
sh-p-2 | Active |
System Connectors | Active |
Acumatica App | Active |
Amazon Web Services | Active |
Magento | Active |
Oracle NetSuite SuiteApp | Active |
Shopify App | Active |
View the latest incidents for ShipHawk and check for official updates:
Description: This incident has been resolved. A postmortem will be made available on this status page on Friday, 24 June 2022.
Customer impact: Some customers were unable to log in to ShipHawk during this time.
Start time: 8:10 AM Pacific Time
End time: 9:15 AM Pacific Time
Status: Resolved
Impact: Minor | Started At: June 21, 2022, 3:45 p.m.
Description:

## **Incident summary**

We determined the actual start of the incident to be 6:24 PM Pacific Time. The issue was reported by an affected customer at 8:02 PM Pacific Time and was resolved at 9:29 PM Pacific Time. During this incident, some customers were unable to ship.

## **Leadup**

As part of a routine database maintenance process, we planned a standard procedure for reclaiming unused disk space. The process started as planned but took longer than the estimate from our run in the test environment. This eventually caused issues with the document generation processes, which in turn affected the ability to book new shipments, since booking relies heavily on new document generation.

## **Fault**

The process of reclaiming unused disk space for document generation took longer than expected, which eventually caused the table to be locked. Attempts to save new documents to the database failed as a result. Because document generation is part of the shipment booking process, attempts to book new shipments failed as well.

## **Impact**

Some ShipHawk users were not able to book new shipments from 6:24 PM to 9:29 PM Pacific Time. Some API requests related to document generation timed out.

## **Detection**

The incident was first detected when a customer reported it at 8:02 PM Pacific Time.

## **Response & Recovery**

We responded to the incident with all possible urgency and made the changes needed to unlock the tables and recover the service. The DevOps team analyzed the issue and, after considering multiple options, decided to terminate the database optimization process and manually release the table lock.

## **Timeline**

All times are in Pacific Time.

**Thursday, 10 June 2022**

* 5:30 PM - the standard database maintenance process started
* 6:24 PM - the tool designed for reclaiming unused disk space acquired a lock on the table
* 8:02 PM - a customer reported issues with BOL generation and shipment booking
* 8:06 PM - the support team began investigating the reported issue
* 8:15 PM - the ticket was passed to the engineering team, and the DevOps engineering team started investigating
* 8:30 PM - the root cause was identified
* 9:10 PM - the DevOps team identified a way to recover the service without data loss
* 9:29 PM - the service was restored

## **Root cause identification: The Five Whys**

1. Document generation and shipment booking timed out.
2. Because the system was not able to save newly generated documents into the database.
3. Because the documents table was locked.
4. Because the process of reclaiming unused disk space took longer than expected.
5. Because one of the database tables was too big.

## **Root cause**

The existing procedure for reclaiming unused disk space does not work adequately for large database tables (>2 TB).

## **Lessons learned**

* The procedure for reclaiming unused disk space should be optimized for large tables.
* We need to improve monitoring for anomalies in shipping API usage, especially during routine database maintenance.

## **Corrective actions**

1. Optimize the procedure for reclaiming unused disk space for large database tables.
2. Begin monitoring anomalies in shipping API usage.
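The second corrective action, monitoring for anomalies in shipping API usage, could take many forms. The postmortem does not describe ShipHawk's actual monitoring, so the following is only a minimal sketch of one common approach: tracking the failure rate over a sliding window of recent requests and alerting when it crosses a threshold. The class name `FailureRateMonitor`, the window size, and the threshold are all hypothetical values chosen for illustration.

```python
from collections import deque


class FailureRateMonitor:
    """Flags a high failure rate over the most recent requests (illustrative only)."""

    def __init__(self, window_size: int = 50, max_failure_rate: float = 0.2) -> None:
        # Hypothetical defaults: look at the last 50 requests, alert above 20% failures.
        self.window_size = window_size
        self.max_failure_rate = max_failure_rate
        # A deque with maxlen discards the oldest outcome automatically.
        self._outcomes = deque(maxlen=window_size)

    def record(self, succeeded: bool) -> bool:
        """Record one request outcome; return True if an alert should fire."""
        self._outcomes.append(succeeded)
        if len(self._outcomes) < self.window_size:
            return False  # not enough data to judge yet
        failures = sum(1 for ok in self._outcomes if not ok)
        return failures / len(self._outcomes) > self.max_failure_rate


# Simulated traffic: ten successful bookings, then a run of timeouts.
monitor = FailureRateMonitor(window_size=10, max_failure_rate=0.3)
healthy = [monitor.record(True) for _ in range(10)]
degraded = [monitor.record(False) for _ in range(4)]
print(healthy)
print(degraded)
```

In this simulation the alert fires only on the fourth consecutive failure, once more than 30% of the last ten requests have failed. A detector along these lines, under these assumptions, is aimed at narrowing the gap seen in the timeline between the table lock at 6:24 PM and the first customer report at 8:02 PM.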
Status: Postmortem
Impact: Critical | Started At: June 10, 2022, 4:27 a.m.
Description: This incident has been resolved.
Status: Resolved
Impact: Minor | Started At: April 27, 2022, 7:38 p.m.