Outage and incident data over the last 30 days for InfluxDB.
OutLogger tracks the status of these components for InfluxDB:
Component | Status |
---|---|
**AWS: Sydney (Discontinued)** | Active |
↳ API Queries | Active |
↳ API Writes | Active |
↳ Compute | Active |
↳ Other | Active |
↳ Persistent Storage | Active |
↳ Tasks | Active |
↳ Web UI | Active |
**Cloud Dedicated** | Active |
↳ API Reads | Active |
↳ API Writes | Active |
↳ Management API | Active |
**Cloud Serverless: AWS, EU-Central** | Active |
↳ API Queries | Active |
↳ API Writes | Active |
↳ Compute | Active |
↳ Other | Active |
↳ Persistent Storage | Active |
↳ Tasks | Active |
↳ Web UI | Active |
**Cloud Serverless: AWS, US-East-1** | Active |
↳ API Queries | Active |
↳ API Writes | Active |
↳ Compute | Active |
↳ Other | Active |
↳ Persistent Storage | Active |
↳ Tasks | Active |
↳ Web UI | Active |
**Cloud Serverless: AWS, US-West-2-1** | Active |
↳ API Queries | Active |
↳ API Writes | Active |
↳ Compute | Active |
↳ Other | Active |
↳ Persistent Storage | Active |
↳ Tasks | Active |
↳ Web UI | Active |
**Cloud Serverless: AWS, US-West-2-2** | Active |
↳ API Queries | Active |
↳ API Writes | Active |
↳ Compute | Active |
↳ Other | Active |
↳ Persistent Storage | Active |
↳ Tasks | Active |
↳ Web UI | Active |
**Cloud Serverless: Azure, East US** | Active |
↳ API Queries | Active |
↳ API Writes | Active |
↳ Compute | Active |
↳ Other | Active |
↳ Persistent Storage | Active |
↳ Tasks | Active |
↳ Web UI | Active |
**Cloud Serverless: Azure, W. Europe** | Active |
↳ API Queries | Active |
↳ API Writes | Active |
↳ Compute | Active |
↳ Other | Active |
↳ Persistent Storage | Active |
↳ Tasks | Active |
↳ Web UI | Active |
**Cloud Serverless: GCP** | Active |
↳ API Queries | Active |
↳ API Writes | Active |
↳ Compute | Active |
↳ Other | Active |
↳ Persistent Storage | Active |
↳ Tasks | Active |
↳ Web UI | Active |
**Google Cloud: Belgium (Discontinued)** | Active |
↳ API Queries | Active |
↳ API Writes | Active |
↳ Compute | Active |
↳ Other | Active |
↳ Persistent Storage | Active |
↳ Tasks | Active |
↳ Web UI | Active |
**Other Services** | Active |
↳ Auth0 User Authentication | Active |
↳ Marketplace integrations | Active |
↳ Web UI Authentication (Auth0) | Active |
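If you prefer to poll these component statuses yourself rather than rely on an aggregator, InfluxData's public status page appears to be hosted on Atlassian Statuspage, which conventionally exposes a JSON summary endpoint. The URL below follows that convention and is an assumption to verify against status.influxdata.com; the sketch simply lists any component that is not reporting as operational.

```python
import json
import urllib.request

# Assumed Statuspage-style summary endpoint for InfluxData's status page;
# verify the exact URL against status.influxdata.com before depending on it.
STATUS_URL = "https://status.influxdata.com/api/v2/summary.json"

with urllib.request.urlopen(STATUS_URL, timeout=10) as resp:
    summary = json.load(resp)

# Print every component that is not fully operational.
for component in summary.get("components", []):
    if component.get("status") != "operational":
        print(f'{component["name"]}: {component["status"]}')
```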
View the latest incidents for InfluxDB and check for official updates:
Description: This incident has been resolved.
Status: Resolved
Impact: None | Started At: Jan. 31, 2024, 9:17 a.m.
Description: This incident has been resolved.
Status: Resolved
Impact: Minor | Started At: Jan. 24, 2024, 4:29 p.m.
Description:
# RCA
## Query Degradation in [eu-central](https://influxdata.slack.com/archives/C06D17X5ZRT/p1710849571229689)-1 on Jan 9, 2024
### Background
Data stored in InfluxDB Cloud is distributed across 64 partitions. Distribution is performed using a persistent hash of the series key to ensure even write and query load distribution. When users write data into InfluxDB Cloud, their writes first enter a durable queue. Storage pods consume ingested data from the queue, allowing writes to be accepted even during storage issues. Time To Become Readable (TTBR) measures the time between a write being accepted and its data becoming available for queries.
### Summary
On January 9, 2024, a single partition experienced significant increases in TTBR, causing delays in data availability for queries. CPU usage on the pods responsible for this partition rose to high levels. An investigation revealed a noisy neighbor issue caused by a small organization with infinite retention running resource-intensive queries.
### Internal Visibility of Issue
Identifying the affected queries took longer than usual because:
- Queries timed out in the query tier but continued to run on storage, creating a disconnect in observed logs.
- The organization's small size kept it from appearing prominently in metrics.
- Failing queries represented a tiny proportion of the organization's usage, so shifts in query success ratios were minimal.
- Metrics and logs relied on completed gRPC calls, which were not completing for the problematic queries.
### Cause
A noisy neighbor issue was identified: the resource usage of a single user impacted other users. Resources were consumed by a relatively small organization attempting to run an expensive function against all data in a dense series. Queries that timed out continued to consume resources, eventually leaving insufficient CPU for reliable queue consumption and pushing TTBR up.
### Mitigation
Additional compute resources were deployed to limit the impact on customers and allow a smooth recovery without customer-visible impact.
### Prevention
Planned or ongoing changes include:
- Improvements to profiling for reporting usage per organization.
- Enhancements in visualization to facilitate easier identification of noisy neighbors.
Status: Postmortem
Impact: None | Started At: Jan. 9, 2024, 6:25 p.m.
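The RCA's background section says series data is spread across 64 partitions by a persistent hash of the series key. The exact hash InfluxDB Cloud uses is not stated, so the sketch below is only illustrative (SHA-256 over a canonicalized series key is an assumption); the property that matters for the noisy-neighbor story is that a given series always lands on the same partition, so one hot series keeps hitting the same storage pods.

```python
import hashlib

NUM_PARTITIONS = 64  # per the RCA: data is distributed across 64 partitions

def series_key(measurement: str, tags: dict) -> str:
    """Canonical series key: measurement plus sorted tag key/value pairs."""
    tag_part = ",".join(f"{k}={v}" for k, v in sorted(tags.items()))
    return f"{measurement},{tag_part}" if tag_part else measurement

def partition_for(measurement: str, tags: dict) -> int:
    """Map a series key to one of 64 partitions with a stable hash.

    SHA-256 is a stand-in for whatever persistent hash InfluxDB Cloud
    actually uses; the point is that the same series key always maps
    to the same partition.
    """
    digest = hashlib.sha256(series_key(measurement, tags).encode()).digest()
    return int.from_bytes(digest[:8], "big") % NUM_PARTITIONS

# The same series always hashes to the same partition.
print(partition_for("cpu", {"host": "server-01", "region": "eu-central-1"}))
```

TTBR, as defined in the RCA, can also be approximated from the client side: write a point with a unique tag, then poll a query until that point becomes readable. The sketch below uses the official influxdb-client Python package; the URL, token, org, and bucket are placeholders, and this is a rough external probe, not how InfluxData measures TTBR internally.

```python
import time
import uuid

from influxdb_client import InfluxDBClient, Point
from influxdb_client.client.write_api import SYNCHRONOUS

# Placeholder connection details: substitute your own region URL and credentials.
URL = "https://eu-central-1-1.aws.cloud2.influxdata.com"
TOKEN = "my-token"
ORG = "my-org"
BUCKET = "ttbr-probe"

probe_id = str(uuid.uuid4())

with InfluxDBClient(url=URL, token=TOKEN, org=ORG) as client:
    write_api = client.write_api(write_options=SYNCHRONOUS)
    query_api = client.query_api()

    # The write counts as "accepted" once the synchronous write call returns.
    accepted_at = time.monotonic()
    write_api.write(
        bucket=BUCKET,
        record=Point("ttbr_probe").tag("probe", probe_id).field("value", 1),
    )

    # Poll until the freshly written point shows up in a query.
    flux = (
        f'from(bucket: "{BUCKET}") |> range(start: -5m) '
        f'|> filter(fn: (r) => r._measurement == "ttbr_probe" and r.probe == "{probe_id}")'
    )
    while not query_api.query(flux):
        time.sleep(0.5)

    print(f"approximate TTBR: {time.monotonic() - accepted_at:.1f}s")
```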
Description: This incident has been resolved.
Status: Resolved
Impact: Minor | Started At: Dec. 7, 2023, 1:12 a.m.