Get notified about any outages, downtime, or incidents for Spreedly and 1800+ other cloud vendors. Monitor 10 companies for free.
Outage and incident data over the last 30 days for Spreedly.
Join OutLogger to be notified when any of your vendors or the components you use experience an outage. It's completely free and takes less than 2 minutes!
OutLogger tracks the status of these components for Spreedly:
Component | Status |
---|---|
3DS Gateway Specific | Active |
3DS Global | Active |
Core Secondary API | Active |
Core Transactional API | Active |
Express | Active |
iFrame | Active |
Payment Method Distribution (PMD) | Active |
Advanced Vault | Active |
Account Updater | Active |
Account Updater Callbacks | Active |
BIN MetaData | Active |
Fingerprinting | Active |
Network Tokenization | Active |
Alternative Payment Methods (APMs) | Active |
ACH | Active |
Apple Pay | Active |
Click to Pay (C2P) | Active |
Ebanx Local Payment Methods | Active |
Google Pay | Active |
PayPal & Offsite Payments | Active |
PPRO Local Payment Methods | Active |
Stripe Alternative Payment Methods | Active |
Gateway Integrations | Performance Issues |
Adyen | Active |
Authorize.net | Active |
Braintree | Active |
Cybersource | Performance Issues |
Elavon | Active |
Gateways | Active |
Mercado Pago | Active |
NMI | Active |
Orbital | Active |
Payflow Pro | Active |
PayPal | Active |
Stripe API | Active |
Worldpay | Active |
Supporting Services | Active |
app.spreedly.com | Active |
docs.spreedly.com | Active |
Spreedly.com | Active |
Spreedly Dashboard | Active |
View the latest incidents for Spreedly and check for official updates:
Description:

## Summary
A planned operating system upgrade on July 12th resulted in slightly higher ongoing CPU usage than the prior OS. This compounded until, by July 19th, we began to exhaust CPU credits, which reduced our ability to successfully process a number of transactions.

## What Happened
On Wednesday, July 12th, we performed a planned operating system upgrade on the servers running the cryptography service. This followed our usual, practiced procedure of gracefully moving traffic from one cluster of servers to the other (i.e., "blue" to "green"), patching the offline cluster, and then repeating the process in reverse. Testing indicated the upgrade was a success, and the system continued processing 100M+ internal service API requests daily without issue or incident.

On July 18th at 18:26 UTC, internal monitoring alerted on an increased number of failures from our cryptography service. Teams were immediately assembled to work the issue. Logs indicated resource contention (`limiting connections by zone "default"`) as individual service nodes exceeded the configured limit of simultaneous connections. We began altering the configuration of these limits: quieting one cluster, applying the change, bringing that cluster back online, and then repeating the steps for the other cluster. Unfortunately, once reconfigured, a different resource contention appeared, this time at the operating system level. We altered that configuration and rolled the change through both the "blue" and "green" clusters, but the errors merely moved to another type of resource constraint. We reverted potentially relevant changes from earlier in the day, without positive effect on the system. Finally, we reverted the operating system upgrade from the July 12th activity, which ultimately resolved the issue. By July 19th at 00:29 UTC, the error rate on all cryptography service nodes had returned to zero, and we left the incident in a monitoring state for approximately 12 hours before declaring it resolved.

A subsequent root cause analysis determined that we had overrun the "CPU Credits" allocated to the AWS EC2 instance type on which the cryptography service was running. We believe the OS upgrade incrementally consumed enough CPU capacity to eventually exhaust the CPU credits, since credits begin to be spent at relatively modest CPU usage, in excess of roughly 20%.

## Next Steps
We are re-provisioning this service onto an AWS EC2 instance type that has no CPU credit limits, which should prevent the problem from happening again.

## Conclusion
We want to apologize to all customers who were impacted by failed transactions due to this incident. We understand how much you rely on our systems being fully operational in order to successfully operate your business. We also appreciate your patience while this issue was being investigated and resolved; thank you again for the trust you place in us every day.
Status: Postmortem
Impact: Major | Started At: July 18, 2023, 7:38 p.m.
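For context on the root cause described in the postmortem above: on AWS burstable (T-family) instances, one CPU credit equals one vCPU running at 100% for one minute, credits accrue at a fixed baseline rate, and the balance drains whenever utilization stays above that baseline. Below is a back-of-the-envelope sketch of how a small post-upgrade increase in CPU usage can exhaust the balance over several days. The 20% baseline is the figure quoted in the postmortem; the vCPU count, starting balance, and utilization values are illustrative assumptions, not measured values.

```python
# Back-of-the-envelope CPU-credit exhaustion model for a burstable EC2 instance.
# Assumptions (not from the postmortem): 2 vCPUs, 576 starting credits, 25% average
# utilization after the OS upgrade. The 20% baseline is the postmortem's figure.

VCPUS = 2
BASELINE = 0.20                          # utilization sustainable without net credit spend
EARN_PER_HOUR = BASELINE * 60 * VCPUS    # credits earned per hour (1 credit = 1 vCPU-minute at 100%)
START_BALANCE = 576                      # assumed starting credit balance

def hours_until_exhausted(utilization: float) -> float:
    """Hours until the credit balance hits zero at a steady utilization level."""
    spend = utilization * 60 * VCPUS     # credits consumed per hour
    net_drain = spend - EARN_PER_HOUR    # net drain above the baseline earn rate
    if net_drain <= 0:
        return float("inf")              # at or below baseline: balance never drains
    return START_BALANCE / net_drain

# A 5-point excess over baseline on 2 vCPUs drains 6 credits/hour,
# so 576 credits last 96 hours (~4 days) before throttling begins.
print(hours_until_exhausted(0.25))       # -> 96.0
```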
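The re-provisioning step removes the credit ceiling entirely; catching this class of problem earlier comes down to watching the CPUCreditBalance metric that AWS publishes to CloudWatch for burstable instances. Here is a minimal monitoring sketch using boto3, assuming AWS credentials are already configured; the instance ID and the 50-credit warning threshold are hypothetical placeholders.

```python
# Sketch: fetch the last 6 hours of CPUCreditBalance for one EC2 instance
# and flag low datapoints. Instance ID and threshold are placeholders.
from datetime import datetime, timedelta, timezone

import boto3

cloudwatch = boto3.client("cloudwatch")
now = datetime.now(timezone.utc)

resp = cloudwatch.get_metric_statistics(
    Namespace="AWS/EC2",
    MetricName="CPUCreditBalance",
    Dimensions=[{"Name": "InstanceId", "Value": "i-0123456789abcdef0"}],  # hypothetical
    StartTime=now - timedelta(hours=6),
    EndTime=now,
    Period=300,                 # 5-minute datapoints
    Statistics=["Average"],
)

for point in sorted(resp["Datapoints"], key=lambda p: p["Timestamp"]):
    balance = point["Average"]
    flag = "  <-- low, risk of CPU throttling" if balance < 50 else ""
    print(f'{point["Timestamp"]:%Y-%m-%d %H:%M} balance={balance:.0f}{flag}')
```

In practice the same check would live in a CloudWatch alarm rather than a polling script, but the metric and dimensions are the same.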
Description: After a period of monitoring, we have confirmed that the issue has been resolved.
Status: Resolved
Impact: Minor | Started At: June 29, 2023, 5:22 p.m.
Description: We have received confirmation from Mastercard that the service disruption identified on their Smart Interface, which caused customers to experience intermittent transaction failures during the impact period, has now been resolved. We appreciate your patience while we tracked this issue with our partners.
Status: Resolved
Impact: Minor | Started At: June 12, 2023, 3:04 p.m.
Description: After deploying a fix, our system is now stable and functioning normally. This incident is considered resolved. We apologize for the inconvenience and disruption of service this caused to our customers transacting via Stripe.
Status: Resolved
Impact: Minor | Started At: June 7, 2023, 5:32 p.m.
Join OutLogger to be notified when any of your vendors or the components you use experience an outage or downtime. Join for free - no credit card required.