Get notified about any outages, downtime or incidents for DigitalOcean and 1800+ other cloud vendors. Monitor 10 companies, for free.
Outage and incident data over the last 30 days for DigitalOcean.
OutLogger tracks the status of these components for DigitalOcean:
Component | Status |
---|---|
API | Active |
Billing | Active |
Cloud Control Panel | Active |
Cloud Firewall | Active |
Community | Active |
DNS | Active |
Reserved IP | Active |
Support Center | Performance Issues |
WWW | Active |
App Platform | Active |
Amsterdam | Active |
Bangalore | Active |
Frankfurt | Active |
Global | Active |
London | Active |
New York | Active |
San Francisco | Active |
Singapore | Active |
Sydney | Active |
Toronto | Active |
Container Registry | Active |
AMS3 | Active |
BLR1 | Active |
FRA1 | Active |
Global | Active |
NYC3 | Active |
SFO2 | Active |
SFO3 | Active |
SGP1 | Active |
SYD1 | Active |
Droplets | Active |
AMS2 | Active |
AMS3 | Active |
BLR1 | Active |
FRA1 | Active |
Global | Active |
LON1 | Active |
NYC1 | Active |
NYC2 | Active |
NYC3 | Active |
SFO1 | Active |
SFO2 | Active |
SFO3 | Active |
SGP1 | Active |
SYD1 | Active |
TOR1 | Active |
Event Processing | Active |
AMS2 | Active |
AMS3 | Active |
BLR1 | Active |
FRA1 | Active |
Global | Active |
LON1 | Active |
NYC1 | Active |
NYC2 | Active |
NYC3 | Active |
SFO1 | Active |
SFO2 | Active |
SFO3 | Active |
SGP1 | Active |
SYD1 | Active |
TOR1 | Active |
Functions | Active |
AMS3 | Active |
BLR1 | Active |
FRA1 | Active |
Global | Active |
LON1 | Active |
NYC1 | Active |
SFO3 | Active |
SGP1 | Active |
SYD1 | Active |
TOR1 | Active |
GPU Droplets | Active |
Global | Active |
NYC2 | Active |
TOR1 | Active |
Kubernetes | Active |
AMS3 | Active |
BLR1 | Active |
FRA1 | Active |
Global | Active |
LON1 | Active |
NYC1 | Active |
NYC3 | Active |
SFO2 | Active |
SFO3 | Active |
SGP1 | Active |
SYD1 | Active |
TOR1 | Active |
Load Balancers | Active |
AMS2 | Active |
AMS3 | Active |
BLR1 | Active |
FRA1 | Active |
Global | Active |
LON1 | Active |
NYC1 | Active |
NYC2 | Active |
NYC3 | Active |
SFO1 | Active |
SFO2 | Active |
SFO3 | Active |
SGP1 | Active |
SYD1 | Active |
TOR1 | Active |
Managed Databases | Active |
AMS3 | Active |
BLR1 | Active |
FRA1 | Active |
Global | Active |
LON1 | Active |
NYC1 | Active |
NYC2 | Active |
NYC3 | Active |
SFO2 | Active |
SFO3 | Active |
SGP1 | Active |
SYD1 | Active |
TOR1 | Active |
Monitoring | Active |
AMS2 | Active |
AMS3 | Active |
BLR1 | Active |
FRA1 | Active |
Global | Active |
LON1 | Active |
NYC1 | Active |
NYC2 | Active |
NYC3 | Active |
SFO1 | Active |
SFO2 | Active |
SFO3 | Active |
SGP1 | Active |
SYD1 | Active |
TOR1 | Active |
Networking | Active |
AMS2 | Active |
AMS3 | Active |
BLR1 | Active |
FRA1 | Active |
Global | Active |
LON1 | Active |
NYC1 | Active |
NYC2 | Active |
NYC3 | Active |
SFO1 | Active |
SFO2 | Active |
SFO3 | Active |
SGP1 | Active |
SYD1 | Active |
TOR1 | Active |
Spaces | Active |
AMS3 | Active |
BLR1 | Active |
FRA1 | Active |
Global | Active |
NYC3 | Active |
SFO2 | Active |
SFO3 | Active |
SGP1 | Active |
SYD1 | Active |
Spaces CDN | Active |
AMS3 | Active |
FRA1 | Active |
Global | Active |
NYC3 | Active |
SFO3 | Active |
SGP1 | Active |
SYD1 | Active |
Volumes | Active |
AMS2 | Active |
AMS3 | Active |
BLR1 | Active |
FRA1 | Active |
Global | Active |
LON1 | Active |
NYC1 | Active |
NYC2 | Active |
NYC3 | Active |
SFO1 | Active |
SFO2 | Active |
SFO3 | Active |
SGP1 | Active |
SYD1 | Active |
TOR1 | Active |
VPC | Active |
AMS2 | Active |
AMS3 | Active |
BLR1 | Active |
FRA1 | Active |
Global | Active |
LON1 | Active |
NYC1 | Active |
NYC2 | Active |
NYC3 | Active |
SFO1 | Active |
SFO2 | Active |
SFO3 | Active |
SGP1 | Active |
SYD1 | Active |
TOR1 | Active |
View the latest incidents for DigitalOcean and check for official updates:
Description: As of 10:45 UTC, our engineering team has resolved the issue with networking in SFO2 and SFO3 regions, and networking in the regions should now be operating normally. If you continue to experience problems, please open a ticket with our support team. We apologize for any inconvenience.
Status: Resolved
Impact: Minor | Started At: Aug. 15, 2024, 10:02 a.m.
Description: Our Engineering team has resolved the issue with Droplet creates and Snapshots. As of 05:30 UTC, users should be able to create Droplets, Snapshots and process events. Droplet backed services should also be operating normally. If you continue to experience problems, please open a ticket with our support team. We apologize for any inconvenience.
Status: Resolved
Impact: Minor | Started At: Aug. 15, 2024, 4:29 a.m.
Description: From 23:08 August 12 to 01:26 August 13 UTC, customers may have experienced failures with Droplet creation, power on events, and restore events in the NYC3 region. Our Engineering team has confirmed resolution of this issue. Thank you for your patience. If you continue to experience any problems, please open a support ticket from within your account.
Status: Resolved
Impact: Minor | Started At: Aug. 13, 2024, 12:27 a.m.
Description: From 09:52 UTC to 19:32 UTC, customers may have experienced failures with Droplet rebuild and restore events in all regions. Our Engineering team has confirmed full resolution of this issue. Thank you for your patience. If you continue to experience any problems, please open a support ticket from within your account.
Status: Resolved
Impact: Minor | Started At: Aug. 6, 2024, 7:19 p.m.
Description:

### **Incident Summary**

On August 05, 2024 at 16:30 UTC, DigitalOcean experienced a disruption to internal service discovery. Customers experienced full disruption of creates, event processing, and management of other DigitalOcean products globally. Due to an error in a replication configuration that propagated globally, internal services were unable to correctly discover other services they depended on. This did not affect the availability of existing customer resources.

### **Incident Details**

**Root Cause**: An incorrect replication configuration was deployed against the datastore which powers the internal service discovery service at DigitalOcean. The incorrect configuration specified a new datacenter with zero keys as 100% ownership of all keys in the datastore. This had an immediate global impact against the data storage layer and disrupted the quorum of datastore nodes across all regions. Clients of the service were unable to read/write to the datastore during this time, which had a cascading effect.

**Impact**: The first observable impact was a complete disruption to the I/O layer of the backing datastore.

![](https://lh7-rt.googleusercontent.com/docsz/AD_4nXdU0ECDjKRvdaKFCHqMyxYzFgFPwuYW_vbTLkcqmh4-YHquKP77AjozHffhv0xPnUNrYXtRV0rPwUYKEtb4n0Dozo_bvHFHbIY986AIirTvtuyDRh_7Z0TYjnUEjd9p1NWB83mr9SDE2Lhz6Fh6eRB_O-LY?key=ADgogSsNR7W3UpxAWFv39Q)

![](https://lh7-rt.googleusercontent.com/docsz/AD_4nXeXhrWQUmlgO4kI7yfvDmzeVQNT0oCdDLBi53_P2TYzm8Zk0zZ0btTlKOz4zCPUvMH13byX4GTGUODvnCICI_UQGGNK5uN_YRSHE9RZ0YbQVCV4MZSKBRA1C-heSuzPN800iamOSr8XG0EYCnfK-6Y79mKE?key=ADgogSsNR7W3UpxAWFv39Q)

These events are consumed by a wide variety of backing services that compose the DigitalOcean Cloud platform.

This incident impacted:

* Droplet Creates
* Droplet Updates
* Network Creates
* Login / Authentication services
* Block Storage Volumes Snapshot creation
* Spaces/CDN Creates
* Spaces Updates
* Managed Kubernetes cluster creates
* Managed Databases creates

Other services across DigitalOcean, outside of the eventing flow, also rely on service discovery to talk to each other, so customers may have seen additional impact when attempting to manage assorted services through the Cloud Control Panel or via the API.

**Response**: After gathering diagnostic information and determining the root cause, an updated / correct replication configuration was deployed. Some regions ingested the new replication configuration and started to recover. Teams identified additional regions that took longer to ingest the updated configuration and manually invoked the change directly on the nodes, and then ran local repairs on the data to ensure alignment before moving to the next region.

Engineering teams cleaned up any remaining failed events and processed pending events that had not yet timed out. At the conclusion of that cleanup effort, the incident was declared resolved, and the cloud platform stabilized.

### **Timeline of Events (UTC)**

* Aug 05 16:30 - Rollout of the new datastore cluster begins.
* Aug 05 16:35 - First report of service discovery unavailability is raised internally.
* Aug 05 16:42 - Lack of quorum and datastore ownership is identified as the blocking issue.
* Aug 05 17:00 - The replication configuration change, adding the new datacenter, is identified as the root cause behind the ownership change.
* Aug 05 17:16 - The replication configuration change is reverted, and run against the region that had become the datastore owner. Some events start to fail faster at this point, changing the error from a distinct timeout to a failure to find endpoints.
* Aug 05 18:25 - Regions that have not detected or applied the reverted configuration are identified, and engineers start manually applying the configuration and running repairs on the datastore for those regions.
* Aug 05 19:10 - Remaining failure events resolve, and the platform stabilizes.

### **Remediation Actions**

The replication configuration deployment happened outside of a normal maintenance window. Moving forward, these types of extension maintenances will be performed inside a declared maintenance window, with any potential for customer impact communicated via a maintenance notice posted on the status page.

The process documentation for this type of deployment will be updated to reflect the current requirements and clearly outline the steps and expectations for each stage of a new deployment. Additionally, the manual processes that occurred will be automated to help reduce the potential for human error.

Multiple teams are also evaluating if our current topology of the internal datastore is appropriate, and if there are any regionalizations or multi-layered approaches DigitalOcean can take to help ensure our internal service discovery remains as available as possible.
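
The root cause in this postmortem (a new datacenter holding zero keys being assigned 100% ownership of all keys) is the kind of change an automated pre-deployment check could flag. The sketch below is illustrative only: the postmortem does not name the datastore or its configuration format, so the `Datacenter` record, its fields, and `validate_replication_config` are hypothetical names, and a real check would depend on the datastore's actual replication model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Datacenter:
    # Hypothetical record; the real configuration format is not described in the postmortem.
    name: str
    key_count: int     # keys currently stored in this datacenter
    ownership: float   # fraction of the keyspace the proposed config assigns (0.0 to 1.0)


def validate_replication_config(datacenters: list[Datacenter]) -> list[str]:
    """Return human-readable problems; an empty list means no problem was found."""
    problems = []
    total = sum(dc.ownership for dc in datacenters)
    if abs(total - 1.0) > 1e-9:
        problems.append(f"ownership fractions sum to {total:.2f}, expected 1.00")
    for dc in datacenters:
        # The failure mode from the postmortem: an empty datacenter given ownership of keys.
        if dc.key_count == 0 and dc.ownership > 0:
            problems.append(
                f"{dc.name} holds no keys but is assigned {dc.ownership:.0%} ownership"
            )
    return problems


if __name__ == "__main__":
    proposed = [
        Datacenter("existing-dc", key_count=1_000_000, ownership=0.0),
        Datacenter("new-dc", key_count=0, ownership=1.0),
    ]
    for problem in validate_replication_config(proposed):
        print("REJECT:", problem)
```

A check like this would only catch the specific misconfiguration described here. It does not replace the maintenance-window and automation changes listed under Remediation Actions.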
Status: Postmortem
Impact: Minor | Started At: Aug. 5, 2024, 5:05 p.m.