Last checked: 8 minutes ago
Get notified about any outages, downtime or incidents for imgix and 1800+ other cloud vendors. Monitor 10 companies, for free.
Outage and incident data over the last 30 days for imgix.
Join OutLogger to be notified when any of your vendors or the components you use experience an outage. It's completely free and takes less than 2 minutes!
OutLogger tracks the status of these components for imgix:
Component | Status |
---|---|
API Service | Active |
Docs | Active |
Purging | Active |
Rendering Infrastructure | Active |
Sandbox | Active |
Stripe API | Active |
Web Administration Tools | Active |
Content Delivery Network | Active |
Amsterdam (AMS) | Active |
Ashburn (BWI) | Active |
Ashburn (DCA) | Active |
Ashburn (IAD) | Active |
Atlanta (ATL) | Active |
Atlanta (FTY) | Active |
Atlanta (PDK) | Active |
Auckland (AKL) | Active |
Boston (BOS) | Active |
Brisbane (BNE) | Active |
Buenos Aires (EZE) | Active |
Cape Town (CPT) | Active |
Chennai (MAA) | Active |
Chicago (CHI) | Active |
Chicago (MDW) | Active |
Chicago (ORD) | Active |
Columbus (CMH) | Active |
Copenhagen (CPH) | Active |
Curitiba (CWB) | Active |
Dallas (DAL) | Active |
Dallas (DFW) | Active |
Denver (DEN) | Active |
Dubai (FJR) | Active |
Frankfurt (FRA) | Active |
Frankfurt (HHN) | Active |
Helsinki (HEL) | Active |
Hong Kong (HKG) | Active |
Houston (IAH) | Active |
Johannesburg (JNB) | Active |
London (LCY) | Active |
London (LHR) | Active |
Los Angeles (BUR) | Active |
Los Angeles (LAX) | Active |
Madrid (MAD) | Active |
Melbourne (MEL) | Active |
Miami (MIA) | Active |
Milan (MXP) | Active |
Minneapolis (MSP) | Active |
Montreal (YUL) | Active |
Mumbai (BOM) | Active |
Newark (EWR) | Active |
New York (JFK) | Active |
New York (LGA) | Active |
Osaka (ITM) | Active |
Palo Alto (PAO) | Active |
Paris (CDG) | Active |
Perth (PER) | Active |
Rio de Janeiro (GIG) | Active |
San Jose (SJC) | Active |
Santiago (SCL) | Active |
São Paulo (GRU) | Active |
Seattle (SEA) | Active |
Singapore (SIN) | Active |
Stockholm (BMA) | Active |
Sydney (SYD) | Active |
Tokyo (HND) | Active |
Tokyo (NRT) | Active |
Tokyo (TYO) | Active |
Toronto (YYZ) | Active |
Vancouver (YVR) | Active |
Wellington (WLG) | Active |
DNS | Active |
imgix DNS Network | Active |
NS1 Global DNS Network | Active |
Docs | Active |
Netlify Content Distribution Network | Active |
Netlify Origin Servers | Active |
Storage Backends | Active |
Google Cloud Storage | Active |
s3-ap-northeast-1 | Active |
s3-ap-northeast-2 | Active |
s3-ap-southeast-1 | Active |
s3-ap-southeast-2 | Active |
s3-ca-central-1 | Active |
s3-eu-central-1 | Active |
s3-eu-west-1 | Active |
s3-eu-west-2 | Active |
s3-eu-west-3 | Active |
s3-sa-east-1 | Active |
s3-us-east-2 | Active |
s3-us-standard | Active |
s3-us-west-1 | Active |
s3-us-west-2 | Active |
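If you want to check these components programmatically rather than through this page, one common approach is to poll the vendor's public status page. The sketch below assumes imgix's status page follows the Atlassian Statuspage API convention (a `/api/v2/components.json` endpoint); the base URL and the response shape are assumptions for illustration, not something confirmed on this page.

```python
# Minimal sketch: poll a Statuspage-style components endpoint and report any
# component that is not fully operational. The base URL and response shape
# are assumptions; adjust them to the vendor's actual status page.
import requests

STATUS_URL = "https://status.imgix.com/api/v2/components.json"  # assumed URL

def non_operational_components(url: str = STATUS_URL) -> list[dict]:
    """Return components whose status is anything other than 'operational'."""
    resp = requests.get(url, timeout=10)
    resp.raise_for_status()
    components = resp.json().get("components", [])
    return [c for c in components if c.get("status") != "operational"]

if __name__ == "__main__":
    degraded = non_operational_components()
    if not degraded:
        print("All monitored components report operational.")
    for component in degraded:
        print(f"{component['name']}: {component['status']}")
```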
View the latest incidents for imgix and check for official updates:
Description:

# What happened?

On August 26, 2021, at 15:00 UTC, the imgix service experienced a disruption caused by long-running processes within our origin cache. Once our engineers identified the issue, remediation changes were applied at 15:09 UTC, and the service recovered sharply at 15:20 UTC.

# How were customers impacted?

Starting at 15:00 UTC, requests for non-cached derivative images returned a `503` response. These errors accounted for about 5% of all requests to the rendering service and persisted until 15:20 UTC, when the service recovered.

# What went wrong during the incident?

While investigating the cause of the incident, our engineers identified a scenario in which origin connections were misbehaving due to customer configuration settings. While this is not normally a problem by itself, some origin activity had severely degraded the performance of the origin cache, eventually affecting rendering.

# What will imgix do to prevent this in the future?

We will modify our infrastructure's configuration to eliminate scenarios in which customer configurations can cause origin connection issues in our infrastructure. We will also work with existing customers to optimize their configurations so that they are not affected by the new changes in our infrastructure.
Status: Postmortem
Impact: Major | Started At: Aug. 26, 2021, 3:09 p.m.
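Transient 503s on non-cached derivative images, as described in the incident above, are generally safe to retry once the service recovers. Below is a minimal client-side sketch of a retry with exponential backoff; the image URL is a placeholder, and the retry counts and delays are illustrative assumptions rather than imgix recommendations.

```python
# Minimal sketch: fetch a derivative image and retry on 503 with exponential
# backoff. The URL is a placeholder; retry counts and delays are illustrative.
import time
import requests

def fetch_with_retry(url: str, max_attempts: int = 4, base_delay: float = 0.5) -> bytes:
    """Fetch an image, retrying on 503 responses with exponential backoff."""
    for attempt in range(max_attempts):
        resp = requests.get(url, timeout=10)
        if resp.status_code != 503:
            resp.raise_for_status()
            return resp.content
        # 503 from the rendering service: wait, then try again.
        time.sleep(base_delay * (2 ** attempt))
    raise RuntimeError(f"Still receiving 503 after {max_attempts} attempts: {url}")

# Example usage (placeholder source and path):
# image_bytes = fetch_with_retry("https://example.imgix.net/photo.jpg?w=400")
```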
Description:

# What happened?

On August 19, 2021, at 15:45 UTC, images disappeared from customer instances of the Image Manager. The incident was marked fully resolved about an hour later, at 16:33 UTC. After the initial incident had been resolved and images had been restored, the issue resurfaced four days later, on August 23, 2021, at 16:24 UTC. During the second incident, the first recovery attempt duplicated images in the Image Manager. This second incident was brief and was resolved 8 minutes later, at 16:32 UTC.

# How were customers impacted?

In the first incident, images that had previously appeared in the Image Manager disappeared. Users who opened the Image Manager user interface or called the Image Manager's list assets endpoint would not find any images. In the second incident, images that had previously appeared in the Image Manager were duplicated, leaving two copies of every image. Since imgix does not host images, this issue only affected interactions with the user interface and the Image Manager API. No data or images were lost during this incident; it was a display-only issue, and origin images continued to be stored at customer origins.

# What went wrong during the incident?

On August 19, at 15:45 UTC, imgix identified that at least some customers were seeing their Image Manager without any images. The issue was escalated, and our team began investigating the cause. The issue was eventually traced to a bad image index that had cleared out the Image Manager state for customers. After identifying the problem, our engineers recovered a previous image index, restoring images to customer Image Manager instances. On August 23, at 16:24 UTC, the incident resurfaced. The same fix was applied using the same tooling, but due to improper configuration it duplicated every image in the Image Manager. After reconfiguring the tooling, it was executed again, restoring the Image Manager for customers.

# What will imgix do to prevent this in the future?

We will develop tooling to monitor the health of the Image Manager and improve internal documentation regarding Image Manager remediation. We will also tune image indexing to eliminate and handle the conditions that caused image indexes to function incorrectly.
Status: Postmortem
Impact: Major | Started At: Aug. 19, 2021, 4:05 p.m.
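The postmortem above mentions tooling to monitor Image Manager health. A simple client-side sanity check along the same lines is to compare the asset count returned by a list-assets call against a known baseline and flag a sudden drop or doubling. The endpoint path, authentication header, and response fields below are assumptions for illustration only, not the documented imgix API.

```python
# Minimal sketch: flag a suspicious change in Image Manager asset counts
# (a sudden drop or doubling, as seen in the incident above). The endpoint
# path, auth header, and response fields are assumptions for illustration.
import requests

API_URL = "https://api.imgix.com/v1/assets"   # hypothetical endpoint
API_KEY = "YOUR_API_KEY"                      # placeholder credential

def asset_count() -> int:
    resp = requests.get(API_URL, headers={"Authorization": f"Bearer {API_KEY}"}, timeout=10)
    resp.raise_for_status()
    return len(resp.json().get("data", []))

def check_against_baseline(baseline: int, tolerance: float = 0.5) -> None:
    """Warn if the current count deviates from the baseline by more than `tolerance`."""
    current = asset_count()
    if baseline and abs(current - baseline) / baseline > tolerance:
        print(f"WARNING: asset count changed from {baseline} to {current}")
    else:
        print(f"Asset count looks normal: {current}")
```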
Description:

# What happened?

On August 12, 2021, between 14:10 UTC and 14:37 UTC, our rendering API experienced significant rendering errors for non-cached derivative images. The issue was identified and a fix was implemented by 14:37 UTC. Behavior that did not affect users continued to be investigated until 15:58 UTC, when the incident was marked as fully resolved.

# How were customers impacted?

On August 12, between 14:10 UTC and 14:37 UTC, a significant number of requests for non-cached derivative images returned 503 errors. At the peak of the incident (14:21 UTC), 11.59% of all requests returned an error. A fix began rolling out at 14:37 UTC, with error rates at 6.11%, and was fully rolled out by 15:58 UTC. Error rates returned to normal after the fix. Internal investigation of background processes, which did not affect users, continued until 17:44 UTC. The incident was fully resolved at 17:55 UTC.

# What went wrong during the incident?

Our engineers were alerted to an elevated rate of error responses from an internal service in our infrastructure. Investigating the issue, our engineers identified a spike in traffic from one internal service to another, which dramatically increased memory and thread usage. This eventually affected the rendering service by preventing uncached image renders from being served. After we identified the internal service affecting the rendering stack, it was temporarily paused to reduce load. This reduced the number of requests and helped the service recover faster.

# What will imgix do to prevent this in the future?

We will evaluate our current workflow for general mitigation and preventative measures. Adjustments will be made to our infrastructure to reject new connections and to increase our internal service capacities. Tooling limits will be evaluated to determine what fallback measures can be taken when some internal services reach maximum capacity.
Status: Postmortem
Impact: Minor | Started At: Aug. 12, 2021, 2:19 p.m.
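The error-rate figures quoted above (for example, 11.59% at peak) are simply the share of requests that returned an error in a given window. A small sketch of that calculation, plus a threshold check of the kind an on-call alert might use, is below; the 5% threshold is an illustrative assumption, not an imgix value.

```python
# Minimal sketch: compute an error rate from request counts and compare it
# against an alert threshold. The 5% threshold is an illustrative assumption.
def error_rate(error_count: int, total_count: int) -> float:
    """Percentage of requests that returned an error."""
    if total_count == 0:
        return 0.0
    return 100.0 * error_count / total_count

def should_alert(error_count: int, total_count: int, threshold_pct: float = 5.0) -> bool:
    return error_rate(error_count, total_count) > threshold_pct

# Example matching the peak figure above: 1,159 errors out of 10,000 requests.
assert round(error_rate(1159, 10_000), 2) == 11.59
print(should_alert(1159, 10_000))  # True
```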
Join OutLogger to be notified when any of your vendors or the components you use experience an outage or down time. Join for free - no credit card required.