Get notified about any outages, downtime or incidents for imgix and 1800+ other cloud vendors. Monitor 10 companies, for free.
Outage and incident data over the last 30 days for imgix.
Join OutLogger to be notified when any of your vendors or the components you use experience an outage. It's completely free and takes less than 2 minutes!
OutLogger tracks the status of these components for imgix:
Component | Status
--- | ---
API Service | Active |
Docs | Active |
Purging | Active |
Rendering Infrastructure | Active |
Sandbox | Active |
Stripe API | Active |
Web Administration Tools | Active |
Content Delivery Network | Active |
Amsterdam (AMS) | Active |
Ashburn (BWI) | Active |
Ashburn (DCA) | Active |
Ashburn (IAD) | Active |
Atlanta (ATL) | Active |
Atlanta (FTY) | Active |
Atlanta (PDK) | Active |
Auckland (AKL) | Active |
Boston (BOS) | Active |
Brisbane (BNE) | Active |
Buenos Aires (EZE) | Active |
Cape Town (CPT) | Active |
Chennai (MAA) | Active |
Chicago (CHI) | Active |
Chicago (MDW) | Active |
Chicago (ORD) | Active |
Columbus (CMH) | Active |
Copenhagen (CPH) | Active |
Curitiba (CWB) | Active |
Dallas (DAL) | Active |
Dallas (DFW) | Active |
Denver (DEN) | Active |
Dubai (FJR) | Active |
Frankfurt (FRA) | Active |
Frankfurt (HHN) | Active |
Helsinki (HEL) | Active |
Hong Kong (HKG) | Active |
Houston (IAH) | Active |
Johannesburg (JNB) | Active |
London (LCY) | Active |
London (LHR) | Active |
Los Angeles (BUR) | Active |
Los Angeles (LAX) | Active |
Madrid (MAD) | Active |
Melbourne (MEL) | Active |
Miami (MIA) | Active |
Milan (MXP) | Active |
Minneapolis (MSP) | Active |
Montreal (YUL) | Active |
Mumbai (BOM) | Active |
Newark (EWR) | Active |
New York (JFK) | Active |
New York (LGA) | Active |
Osaka (ITM) | Active |
Palo Alto (PAO) | Active |
Paris (CDG) | Active |
Perth (PER) | Active |
Rio de Janeiro (GIG) | Active |
San Jose (SJC) | Active |
Santiago (SCL) | Active |
São Paulo (GRU) | Active
Seattle (SEA) | Active |
Singapore (SIN) | Active |
Stockholm (BMA) | Active |
Sydney (SYD) | Active |
Tokyo (HND) | Active |
Tokyo (NRT) | Active |
Tokyo (TYO) | Active |
Toronto (YYZ) | Active |
Vancouver (YVR) | Active |
Wellington (WLG) | Active |
DNS | Active |
imgix DNS Network | Active |
NS1 Global DNS Network | Active |
Docs | Active |
Netlify Content Distribution Network | Active |
Netlify Origin Servers | Active |
Storage Backends | Active |
Google Cloud Storage | Active |
s3-ap-northeast-1 | Active |
s3-ap-northeast-2 | Active |
s3-ap-southeast-1 | Active |
s3-ap-southeast-2 | Active |
s3-ca-central-1 | Active |
s3-eu-central-1 | Active |
s3-eu-west-1 | Active |
s3-eu-west-2 | Active |
s3-eu-west-3 | Active |
s3-sa-east-1 | Active |
s3-us-east-2 | Active |
s3-us-standard | Active |
s3-us-west-1 | Active |
s3-us-west-2 | Active |
View the latest incidents for imgix and check for official updates:
Description:
# What happened?
On November 22, 2021, at 17:20 UTC, the imgix service experienced a disruption affecting non-cached image derivatives. A fix was pushed at 18:40 UTC, fully restoring the service by 18:50 UTC.
# How were customers impacted?
Between 17:00 UTC and 18:40 UTC, 6% of all requests to the imgix service returned a `503` error. During this time, errors were returned only for new derivative images which had not yet been cached by imgix. At 18:40 UTC, a fix was pushed out, which began restoration of the service. By 18:50 UTC, the service was marked as fully restored.
# What went wrong during the incident?
At the start of the outage, our team identified the network behaviors that caused the initial incident and pushed configuration changes to begin restoring the service. Despite these mitigations, recovery stalled. We continued to investigate; however, our logs did not reveal any additional information about the root cause of the issue, which prevented us from pushing out further mitigations. We eventually traced the issue to incorrect traffic configurations for a major region. Once the issue was verified, a mitigation was implemented, restoring the service.
# What will imgix do to prevent this in the future?
This incident exposed gaps in our logging; closing them would have allowed us to initiate a swifter recovery of the rendering service. As a result, we will be analyzing and closing gaps in our logging to prevent similar roadblocks in the future. We will also be tweaking and adding configurations to identify and automate the handling of major traffic patterns that would otherwise affect our rendering service.
Status: Postmortem
Impact: Critical | Started At: Nov. 22, 2021, 5:24 p.m.
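The `503` errors in this incident were limited to non-cached derivatives, so affected requests generally succeeded once retried after the fix. The sketch below shows one minimal client-side way to retry such transient 5xx responses with exponential backoff; it is not part of the official postmortem, and the imgix source hostname and query parameters are hypothetical placeholders.

```python
import time
import urllib.request
import urllib.error

def fetch_with_retry(url, max_attempts=4, base_delay=1.0):
    """Fetch an image URL, retrying on transient 5xx responses such as the
    503s returned for non-cached derivatives during the incident above."""
    for attempt in range(1, max_attempts + 1):
        try:
            with urllib.request.urlopen(url, timeout=10) as resp:
                return resp.read()
        except urllib.error.HTTPError as err:
            # Only retry server-side errors; 4xx responses will not improve.
            if err.code < 500 or attempt == max_attempts:
                raise
        except urllib.error.URLError:
            if attempt == max_attempts:
                raise
        # Exponential backoff between attempts: 1s, 2s, 4s, ...
        time.sleep(base_delay * 2 ** (attempt - 1))

# Hypothetical imgix derivative URL; the source and parameters are placeholders.
image_bytes = fetch_with_retry("https://example.imgix.net/photo.jpg?w=400&auto=format")
```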
Description: This incident has been resolved.
Status: Resolved
Impact: Critical | Started At: Nov. 16, 2021, 5:49 p.m.
Description:
# What happened?
On September 09, 2021, between 22:08 UTC and 22:22 UTC, imgix experienced a major rendering outage affecting non-cached derivative images.
# How were customers impacted?
Starting at 22:08 UTC, our service began to experience an increase in rendering error rates, with requests to our rendering service receiving `502` error responses for some non-cached assets. At the brief peak of the incident, 9% of requests returned an error, though this lasted only a minute before error rates dropped sharply back to normal at 22:22 UTC.
# What went wrong during the incident?
At 22:03 UTC, alerts indicated a connectivity issue with a service provider. Error rates were still normal, but our backup servers were showing rapidly increasing network load. At the same time, external issues with our database services tooling forced our team to use other methods of investigating the sudden server downtime. While investigations were underway, our backup infrastructure began showing signs of stress under the increasing load, which manifested as rising error rates from our service starting at 22:08 UTC. Our engineers discovered that a datacenter technician had inadvertently powered off equipment during preliminary work for a capacity expansion. While a failover device existed, traffic exceeded the available capacity of that device. Once the change was discovered, it was quickly reversed, allowing the service to recover immediately.
# What will imgix do to prevent this in the future?
We will be working with our service provider to eliminate scenarios where unexpected modifications can be made to our hardware configurations, and to put additional safeguards in place so we can speed up remediation of issues related to offsite infrastructure hardware. We will also be expanding our backup capacity in the near future.
Status: Postmortem
Impact: Critical | Started At: Sept. 30, 2021, 10:19 p.m.
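This postmortem hinges on error rates: normal at 22:03 UTC, climbing from 22:08 UTC, and peaking near 9% before recovering at 22:22 UTC. A sliding-window error-rate check like the sketch below is one generic way to surface that kind of spike; the window size and threshold are illustrative assumptions, not imgix's actual monitoring configuration.

```python
from collections import deque

class ErrorRateMonitor:
    """Track the share of 5xx responses over a sliding window of recent
    requests and flag when it crosses an alert threshold."""

    def __init__(self, window_size=1000, threshold=0.05):
        self.window = deque(maxlen=window_size)  # most recent status codes
        self.threshold = threshold

    def record(self, status_code):
        """Record one response; return True if the error rate is above threshold."""
        self.window.append(status_code)
        return self.error_rate() > self.threshold

    def error_rate(self):
        if not self.window:
            return 0.0
        errors = sum(1 for code in self.window if code >= 500)
        return errors / len(self.window)

# Example: a burst of 502s, roughly the 9% peak described above, trips the alert.
monitor = ErrorRateMonitor(window_size=100, threshold=0.05)
statuses = [200] * 95 + [502] * 9
alerts = [monitor.record(code) for code in statuses]
print("alert raised:", any(alerts))
```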
Description:
# What happened?
On September 09, 2021, at 14:02 UTC, an improper configuration prevented imgix servers from connecting to some Web Folder and Web Proxy origins, causing non-cached derivative image requests for affected Web Folder / Web Proxy customer origins to return a `503` error.
# How were customers impacted?
The impact of this incident was isolated to some Web Folder and Web Proxy customers sharing a common configuration setting. Between 14:02 UTC and 18:56 UTC, affected Web Folder and Web Proxy customers experienced a variable increase in errors for non-cached derivative images. At the height of the incident, a small percentage of Web Folder and Web Proxy requests returned a `503` error, amounting to 0.16% of all imgix requests. At 18:56 UTC, a fix was applied and the service was completely restored.
# What went wrong during the incident?
At 14:20 UTC, our team was alerted to a small increase in fetch errors from some Web Folder and Web Proxy origins. Because our monitoring service reported only a small number of errors, it was unclear whether this was the result of some customer origins misbehaving or an issue with our service's ability to fetch images. Eventually, our engineering team tracked the change down to a specific service provider, which we correlated with the increase in errors for some Web Folder / Web Proxy customers. As our team looked into solutions, several external factors severely slowed remediation efforts:
* Our internal communication platform was experiencing connectivity issues
* Some critical database services were unavailable during the incident
* Service error messaging was ambiguous as to the cause of the issue
* We experienced discrepancies between applied system changes and running processes
Eventually, the imgix team deployed a fix that enabled our servers to successfully talk to all Web Folder and Web Proxy origins.
# What will imgix do to prevent this in the future?
We will be updating our configurations for fetching assets from customer origins to prevent similar issues, and updating our service runbooks to include rolling restarts for some types of configuration updates. We will also be migrating some of our database tooling to mitigate connectivity limitations, and updating our internal processes to address cases where communication outages occur.
Status: Postmortem
Impact: Minor | Started At: Sept. 30, 2021, 3:16 p.m.
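A recurring difficulty in this incident was telling whether errors came from customer origins or from imgix's ability to fetch from them. One simple way to triage that from the customer side is to compare a direct request to the origin with the same asset requested through imgix, as in the sketch below; the hostnames and paths are hypothetical, and this is not an official imgix diagnostic tool.

```python
import urllib.request
import urllib.error

def status_of(url):
    """Return the HTTP status for a HEAD request, or None on a network failure."""
    req = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(req, timeout=10) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code
    except urllib.error.URLError:
        return None  # network-level failure

# Hypothetical URLs: the same asset served from the customer's Web Folder
# origin and through imgix. A healthy origin plus a 503 from imgix points at
# the fetch layer; an error from the origin points at the origin itself.
origin_status = status_of("https://origin.example.com/images/photo.jpg")
imgix_status = status_of("https://example.imgix.net/images/photo.jpg?w=400")
print(f"origin: {origin_status}, imgix: {imgix_status}")
```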