Last checked: 7 minutes ago
Get notified about any outages, downtime or incidents for Spreedly and 1800+ other cloud vendors. Monitor 10 companies for free.
Outage and incident data over the last 30 days for Spreedly.
Join OutLogger to be notified when any of your vendors or the components you use experience an outage. It's completely free and takes less than 2 minutes!
OutLogger tracks the status of these components for Spreedly:
Component | Status |
---|---|
3DS Gateway Specific | Active |
3DS Global | Active |
Core Secondary API | Active |
Core Transactional API | Active |
Express | Active |
iFrame | Active |
Payment Method Distribution (PMD) | Active |
Advanced Vault | Active |
Account Updater | Active |
Account Updater Callbacks | Active |
BIN MetaData | Active |
Fingerprinting | Active |
Network Tokenization | Active |
Alternative Payment Methods (APMs) | Active |
ACH | Active |
Apple Pay | Active |
Click to Pay (C2P) | Active |
Ebanx Local Payment Methods | Active |
Google Pay | Active |
Paypal & Offsite Payments | Active |
PPRO Local Payment Methods | Active |
Stripe Alternative Payment Methods | Active |
Gateway Integrations | Performance Issues |
Adyen | Active |
Authorize.net | Active |
Braintree | Active |
Cybersource | Performance Issues |
Elavon | Active |
Gateways | Active |
Mercado Pago | Active |
NMI | Active |
Orbital | Active |
Payflow Pro | Active |
PayPal | Active |
Stripe API | Active |
Worldpay | Active |
Supporting Services | Active |
app.spreedly.com | Active |
Docs.spreedly.com | Active |
Spreedly.com | Active |
Spreedly Dashboard | Active |
View the latest incidents for Spreedly and check for official updates:
Description: # January 5, 2023 — Intermittent Core 500 Errors Primarily Affecting Offsite Transactions

Spreedly’s core API server receives and responds to external API requests made by our clients.

## What Happened

While upgrading Spreedly’s database capabilities, code was deployed that generated a large queue of backlogged work. While working through the backlog of enqueued work, Core served some customers intermittent 500 errors on January 5, 2023, between 5:30pm UTC and 8:00pm UTC.

## Next Steps

Spreedly is reviewing its database upgrade processes and queueing rules, and is adding additional alerting.
Status: Postmortem
Impact: Minor | Started At: Jan. 5, 2023, 6:39 p.m.
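The postmortem above attributes the intermittent 500s to a backlog of enqueued work that built up during a database upgrade, and the stated next steps include additional alerting. Purely as a hedged illustration of that kind of alerting (the queue client, thresholds, and alert hook below are hypothetical, not Spreedly's actual tooling), a minimal backlog watcher might look like this:

```python
import random
import time
from typing import Callable

# Hypothetical thresholds -- not Spreedly's actual values.
BACKLOG_THRESHOLD = 10_000   # enqueued jobs before we treat it as a backlog
SUSTAINED_CHECKS = 3         # consecutive breaches before paging
POLL_INTERVAL_SECONDS = 1    # short interval so the demo finishes quickly


def send_alert(message: str) -> None:
    # Stand-in for a real paging/alerting integration.
    print(f"ALERT: {message}")


def watch_backlog(queue_depth: Callable[[], int], max_polls: int = 10) -> None:
    """Page when the queue depth stays above the threshold for several polls."""
    breaches = 0
    for _ in range(max_polls):
        depth = queue_depth()
        if depth > BACKLOG_THRESHOLD:
            breaches += 1
            if breaches >= SUSTAINED_CHECKS:
                send_alert(f"queue backlog sustained at {depth} jobs")
                breaches = 0
        else:
            breaches = 0
        time.sleep(POLL_INTERVAL_SECONDS)


if __name__ == "__main__":
    # Simulated queue depth; a real deployment would query the job queue itself.
    watch_backlog(lambda: random.randint(5_000, 20_000))
```

Requiring several consecutive breaches avoids paging on a momentary spike while still catching a backlog that persists, which is the scenario described in the incident.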
Description: ## December 30th, 2022 - Spreedly API Errors

On December 30th, 2022, at approximately 18:00 UTC, our secondary index database was heavily throttled by our Database Service Provider. This resulted in intermittent 500 errors across multiple endpoints and limited access to secondary services such as Dashboard and ID.

## What Happened

Spreedly presently maintains a “secondary index” database, apart from the main transaction processing database, that facilitates reporting (dashboard), data analytics, and some transaction flows that make use of `list` or `show` endpoints. This is done to ensure that the primary money-moving transactions (i.e., Vaulting, Payment Gateway, and Receiver Transactions) are always the foremost concern. Spreedly engages a Database Service Provider for this secondary index database.

After being assured by our DB Service Provider in February 2022 that we would not run afoul of any DB sizing limitations or constraints (“plan size is not enforced for our largest plans”, one of which Spreedly subscribes to), we nonetheless had our access disabled (`ALTER ROLE "[REDACTED]" CONNECTION LIMIT 0`) at 18:52 UTC on December 30, 2022. As we understand now, this was an automated control implemented by our DB Service Provider; as our database naturally grew and shrank during the course of its operations, we would cross above and below this threshold (~12.7TB), and access would be restricted and then restored, as it was again at 18:57 UTC.

During the intervals when our connection limit disabled access, customers were unable to use `list` or `show` commands, and dashboard access may have been impaired. Vaulting, Payment Gateway, and Receiver Transactions (that did not rely on these endpoints as part of their transaction flow) were otherwise not impacted.

We ceased further writes to the database while we worked with the DB Service Provider to understand the issue and reclaim (over the course of the incident) 800GB of database storage, reducing our DB size to ~11.9TB. This resizing was performed by optimizing our DB usage (primarily indexes) without the loss or deletion of any data. The resize process itself ebbed and flowed the overall DB size above the automated cut-off limit, and it took some time working with our DB Service Provider before they could hard-code a connection limit that allowed both normal application function and our maintenance to occur. This meant that we experienced DB connectivity issues (and their corresponding `list` and `show` customer impacts) during the following additional times:

| **START (times UTC)** | **STOP (times UTC)** | **DURATION** |
| --- | --- | --- |
| 2022-12-30 19:17 | 2022-12-30 19:19 | 2 minutes |
| 2022-12-30 19:23 | 2022-12-30 19:25 | 2 minutes |
| 2022-12-30 19:28 | 2022-12-31 01:12 | 5 hours, 43 minutes |
| 2022-12-31 04:40 | 2022-12-31 05:35 | 55 minutes |
| 2022-12-31 12:19 | 2022-12-31 13:41 | 1 hour, 22 minutes |

Note: The table above was edited to correct a factual error on January 26th, 2023.

After our DB resize efforts had completed, we worked to bring the backlog of data into the database so that dashboard and data analytics once again represented current transactional data.

## Next Steps

We are taking the following actions to ensure that this does not occur again:

* Further optimizing DB storage and usage to reclaim additional space below any automated cutoff.
* Working with our DB Service Provider to remove any future limit enforcement.
* Working on an emergency plan to change DB Service Providers, if needed.
* Migrating our DB (already underway) to a new, self-managed DB platform. This effort will complete this calendar quarter, 2023.
Status: Postmortem
Impact: Minor | Started At: Dec. 30, 2022, 7:51 p.m.
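The root cause above was an automated size cutoff (~12.7TB) on a provider-managed database, worked around by reclaiming roughly 800GB of index space. As a hedged sketch only (the connection string and cutoff constant are hypothetical, and this is not Spreedly's tooling), the snippet below shows one way to track headroom against such a cutoff and list the largest indexes on a PostgreSQL database, since large or unused indexes are the usual candidates for reclaiming space:

```python
import psycopg2  # assumes a PostgreSQL database, as the ALTER ROLE syntax above suggests

# Hypothetical values -- not Spreedly's actual configuration.
DSN = "postgresql://readonly_user@db.example.internal/secondary_index"
SIZE_CUTOFF_BYTES = int(12.7e12)  # ~12.7TB cutoff cited in the postmortem

LARGEST_INDEXES_SQL = """
    SELECT schemaname, relname, indexrelname,
           pg_relation_size(indexrelid) AS index_bytes,
           idx_scan
    FROM pg_stat_user_indexes
    ORDER BY index_bytes DESC
    LIMIT 10;
"""


def report_db_headroom() -> None:
    with psycopg2.connect(DSN) as conn, conn.cursor() as cur:
        # Total on-disk size of the current database.
        cur.execute("SELECT pg_database_size(current_database());")
        (db_bytes,) = cur.fetchone()
        headroom = SIZE_CUTOFF_BYTES - db_bytes
        print(f"database size: {db_bytes / 1e12:.2f} TB, "
              f"headroom to cutoff: {headroom / 1e9:.0f} GB")

        # Largest indexes, with scan counts to flag ones that are rarely used.
        cur.execute(LARGEST_INDEXES_SQL)
        for schema, table, index, size, scans in cur.fetchall():
            print(f"{schema}.{table}.{index}: {size / 1e9:.1f} GB, {scans} scans")


if __name__ == "__main__":
    report_db_headroom()
```

Running a report like this on a schedule, and alerting when headroom shrinks, is one way to stay ahead of a provider-enforced limit instead of discovering it when connections are cut off.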
Description: ## Summary

Routine maintenance activities triggered a previously unidentified bug that resulted in a loss of API availability.

## What Happened

Spreedly leverages a “Leader/Follower” architectural pattern in its containerization models, where a self-elected leader orchestrates work for other follower services. During routine systems maintenance work, a previously unidentified bug resulted in state communication loss across an orchestration cluster. This meant that over time the “leader/follower” relationships atrophied and could not be reestablished. Redeploying the cluster resulted in workers that were unable to pick up jobs from their leader, including the jobs that ran a critical component of our API infrastructure.

Our systems detected the issue automatically and alerted our systems engineers, who quickly restored the appropriate state/relationships for all jobs. As jobs became active, the critical componentry restarted and full API functionality resumed. No data loss resulted from this activity.

## Conclusion

At Spreedly we understand the critical role we play in our customers’ online experience, and we deeply regret the interruption this incident caused. We have instituted immediate changes, as well as begun work on short- and long-term efforts, to ensure this type of issue does not recur.
Status: Postmortem
Impact: Major | Started At: Dec. 13, 2022, 8:27 p.m.
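The incident above describes followers whose view of the elected leader went stale, leaving them unable to pick up jobs. Purely as a hedged illustration of that failure mode (the lease length, class names, and alerting are hypothetical, not Spreedly's orchestration code), the sketch below has each follower refuse work and raise an alert when its leader lease has expired, so a broken leader/follower relationship is surfaced instead of workers silently idling:

```python
import time
from dataclasses import dataclass
from typing import Optional

# Hypothetical lease length -- not a Spreedly setting.
LEADER_LEASE_SECONDS = 30


@dataclass
class LeaderLease:
    """A follower's view of the current leader, as last heartbeated."""
    leader_id: str
    renewed_at: float

    def is_valid(self, now: float) -> bool:
        return (now - self.renewed_at) < LEADER_LEASE_SECONDS


class Follower:
    def __init__(self, name: str) -> None:
        self.name = name
        self.lease: Optional[LeaderLease] = None

    def observe_heartbeat(self, leader_id: str) -> None:
        # Called whenever a leader heartbeat reaches this follower.
        self.lease = LeaderLease(leader_id, time.monotonic())

    def try_pick_up_job(self, job: str) -> bool:
        """Only accept work while the leader relationship is known to be healthy."""
        now = time.monotonic()
        if self.lease is None or not self.lease.is_valid(now):
            # A stale or missing lease means the leader/follower relationship
            # has atrophied; surfacing it is the point of the check.
            print(f"ALERT: {self.name} has no valid leader lease; refusing '{job}'")
            return False
        print(f"{self.name} accepted '{job}' from leader {self.lease.leader_id}")
        return True


if __name__ == "__main__":
    follower = Follower("worker-1")
    follower.try_pick_up_job("critical-api-job")   # no lease yet -> alert
    follower.observe_heartbeat("leader-a")
    follower.try_pick_up_job("critical-api-job")   # healthy lease -> accepted
```

The design choice here is to fail loudly: a follower that cannot confirm its leader raises an alert rather than waiting indefinitely, which matches the automatic detection described in the postmortem.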
Description: The issue of delayed data reporting in the Spreedly Dashboard has been resolved. We have rerun the affected processes, restoring data to the most up-to-date point in time. We apologize for any inconvenience this delay caused.
Status: Resolved
Impact: Minor | Started At: Dec. 1, 2022, 1:03 p.m.
Join OutLogger to be notified when any of your vendors or the components you use experience an outage or downtime. Join for free - no credit card required.