Is there an InfluxDB outage?

InfluxDB status: Systems Active

Last checked: 2 minutes ago

InfluxDB outages and incidents

Outage and incident data over the last 30 days for InfluxDB.

There has been 1 outage or incident for InfluxDB in the last 30 days.

Components and Services Monitored for InfluxDB

OutLogger tracks the status of these components for InfluxDB:

API Queries Active
API Writes Active
Compute Active
Other Active
Persistent Storage Active
Tasks Active
Web UI Active
API Reads Active
API Writes Active
Management API Active
API Queries Active
API Writes Active
Compute Active
Other Active
Persistent Storage Active
Tasks Active
Web UI Active
API Queries Active
API Writes Active
Compute Active
Other Active
Persistent Storage Active
Tasks Active
Web UI Active
API Queries Active
API Writes Active
Compute Active
Other Active
Persistent Storage Active
Tasks Active
Web UI Active
API Queries Active
API Writes Active
Compute Active
Other Active
Persistent Storage Active
Tasks Active
Web UI Active
API Queries Active
API Writes Active
Compute Active
Other Active
Persistent Storage Active
Tasks Active
Web UI Active
API Queries Active
API Writes Active
Compute Active
Other Active
Persistent Storage Active
Tasks Active
Web UI Active
API Queries Active
API Writes Active
Compute Active
Other Active
Persistent Storage Active
Tasks Active
Web UI Active
API Queries Active
API Writes Active
Compute Active
Other Active
Persistent Storage Active
Tasks Active
Web UI Active
Auth0 User Authentication Active
Marketplace integrations Active
Web UI Authentication (Auth0) Active

Latest InfluxDB outages and incidents.

View the latest incidents for InfluxDB and check for official updates:

Updates:

  • Time: March 1, 2023, 1:34 a.m.
    Status: Postmortem
    Update:

    # Incident RCA

    Write and read outage in AWS: Frankfurt, EU-Central-1, AWS: Oregon, US-West-2-1 and AWS: Virginia, US-East-1

    # Summary

    On Feb 24, 2023 at 19:30 UTC, we deployed a software change to multiple production clusters, which caused a significant percentage of writes and queries to fail in our larger clusters. The duration of the outage was different for each cluster, as was the level of disruption (percentage of writes and queries that failed during the incident). The table below summarizes the time ranges during which the service was impacted in each cluster (all in UTC time).

    | Cluster | Write failure start | Write failure end | Query failure start | Query failure end |
    | --- | --- | --- | --- | --- |
    | prod01-us-west-2 | 19:38 | 22:17 | 19:36 | 22:20 |
    | prod01-eu-central-1 | 19:36 | 23:49 | 19:34 | 23:38 |
    | prod101-us-east-1 | 19:34 | 22:44 | 19:34 | 00:44 |

    # Cause of the Incident

    Our software is deployed via a CD pipeline to three staging clusters (one per cloud provider) where a suite of automated tests is run. If those tests pass, it is deployed into an internal cluster where another round of testing occurs, and finally it is deployed to all of our production clusters in parallel. This is our standard software deployment methodology for our cloud service.

    On February 24, 2023, an engineer made a change to a health check to ensure that our query and write pods can reach the vault within the cluster (where credentials are managed). In the past, it was possible for a query or write pod to get stuck if it lost access to the vault. To address that problem, a health check was added so that if a pod could not reach the vault, the pod would stop/restart automatically. This health check was tested in all three staging clusters, and worked fine. The change was promoted to our internal cluster, which also worked fine. The change was then promoted to our production clusters.

    In the larger clusters, when the pods were restarted (with the new health check in place), too many pods made health-check calls to the vault in quick succession. These calls overwhelmed the vault, and it was unable to service all the requests. As the health check failed, the pods attempted to recover by restarting, which put an even heavier workload on the vault, from which it was unable to recover.

    # Recovery

    As soon as we detected the problem and identified the offending software change, we rolled back to an earlier version of our production software and redeployed that in all the production clusters. In our smaller clusters, this happened quickly, without any significant customer impact. In our three largest clusters (the clusters listed above), as the vault was deadlocked, we were unable to deploy the new software without manually restarting the vault instances, and then gradually restarting the services that depend on the vault. This is what caused it to take longer to recover in these clusters.

    # Future mitigations

    1. We are re-implementing the offending health check so that we can detect a stuck pod without putting such a burden on the vault.
    2. As the vault is a critical element of our service, we are adding an extra peer review step to all software changes that interact with the vault.
    3. We are enhancing the vault configuration to have the vault more gracefully degrade when overloaded.
    4. We are enhancing our runbooks so that we can more quickly intervene with manual steps if the regular deployment/rollback process fails, to reduce our overall time-to-recover when a cluster fails to recover normally.
  • Time: Feb. 25, 2023, 1:55 a.m.
    Status: Resolved
    Update: The issue has been fully resolved in all regions. We will continue to monitor.
  • Time: Feb. 25, 2023, 1:40 a.m.
    Status: Monitoring
    Update: The issue has been fully resolved in all regions. We will continue to monitor.
  • Time: Feb. 25, 2023, 1:21 a.m.
    Status: Monitoring
    Update: Write and read are back in AWS: Virginia, US-East-1 and we are continuing to monitor for any further issues.
  • Time: Feb. 25, 2023, 1:19 a.m.
    Status: Monitoring
    Update: We are continuing to monitor for any further issues.
  • Time: Feb. 25, 2023, 1:04 a.m.
    Status: Monitoring
    Update: Write and read are down in AWS: Virginia, US-East-1.
  • Time: Feb. 25, 2023, 1:02 a.m.
    Status: Monitoring
    Update: We are continuing to monitor for any further issues.
  • Time: Feb. 25, 2023, 12:32 a.m.
    Status: Monitoring
    Update: Write and read are working in all regions now.
  • Time: Feb. 24, 2023, 10:50 p.m.
    Status: Investigating
    Update: Write and read outage: AWS: Oregon, US-WEST-2-1 and AWS: Virginia, US-East-1 are recovering, and we are still working on AWS: Frankfurt, EU-Central-1
  • Time: Feb. 24, 2023, 10:44 p.m.
    Status: Investigating
    Update: We are continuing to investigate this issue.
  • Time: Feb. 24, 2023, 8 p.m.
    Status: Investigating
    Update: We are currently investigating this issue.
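
The failure windows in the postmortem table above can be turned into per-cluster outage durations. The following is a minimal sketch, not part of the original report: the function names `window_minutes` and `durations` are hypothetical, and it assumes that an end time earlier than its start time (the 00:44 query recovery in prod101-us-east-1) falls on the following day.

```python
from datetime import datetime, timedelta

# Failure windows copied from the postmortem table (all times UTC).
WINDOWS = {
    "prod01-us-west-2": {"write": ("19:38", "22:17"), "query": ("19:36", "22:20")},
    "prod01-eu-central-1": {"write": ("19:36", "23:49"), "query": ("19:34", "23:38")},
    "prod101-us-east-1": {"write": ("19:34", "22:44"), "query": ("19:34", "00:44")},
}


def window_minutes(start: str, end: str) -> int:
    """Length of a failure window in minutes.

    Assumption: an end time earlier than the start time means the window
    crossed midnight and ended on the next day.
    """
    s = datetime.strptime(start, "%H:%M")
    e = datetime.strptime(end, "%H:%M")
    if e < s:
        e += timedelta(days=1)
    return int((e - s).total_seconds() // 60)


def durations(windows: dict) -> dict:
    """Map each cluster to its write and query outage length in minutes."""
    return {
        cluster: {kind: window_minutes(*span) for kind, span in kinds.items()}
        for cluster, kinds in windows.items()
    }


if __name__ == "__main__":
    for cluster, kinds in durations(WINDOWS).items():
        for kind, minutes in kinds.items():
            print(f"{cluster} {kind}: {minutes // 60}h {minutes % 60:02d}m")
```

Under that midnight assumption, the longest window is the query outage in prod101-us-east-1 at 5h 10m, and the shortest is the write outage in prod01-us-west-2 at 2h 39m.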

Updates:

  • Time: Feb. 22, 2023, 10:38 p.m.
    Status: Postmortem
    Update:

    # Incident RCA

    Queries timing out

    # Summary

    A bulk export job was running to export a customer’s data. As the bulk export was taking a long time, we allocated more resources to the export job. The export job uses ephemeral disks, and runs on the same nodes as other services, such as the query nodes. As the job ran on many nodes, it consumed the ephemeral disks on the shared nodes, which impacted the other services, causing queries to fail.

    When we were alerted to the query failures, we stopped the bulk export job and the cluster recovered. We will be reworking the bulk export to run on its own dedicated PVC, so that it cannot impact the other services in the cluster. We are sorry for the service disruption that this caused.
  • Time: Feb. 22, 2023, 9:47 p.m.
    Status: Resolved
    Update: This incident has been resolved. Error rates are back to normal across the cluster.
  • Time: Feb. 22, 2023, 7:45 p.m.
    Status: Monitoring
    Update: A fix has been implemented and we are monitoring the results.
  • Time: Feb. 22, 2023, 7:23 p.m.
    Status: Investigating
    Update: We are currently investigating this issue.

Updates:

  • Time: Feb. 7, 2023, 11:48 p.m.
    Status: Resolved
    Update: This incident has been resolved.
  • Time: Feb. 7, 2023, 11:45 p.m.
    Status: Investigating
    Update: The incident has been resolved.
  • Time: Feb. 7, 2023, 10:41 p.m.
    Status: Investigating
    Update: We are continuing to investigate this issue.
  • Time: Feb. 7, 2023, 10:39 p.m.
    Status: Investigating
    Update: Degraded query performance and tasks running late in AWS us-west-2

Check the status of similar companies and alternatives to InfluxDB

Smartsheet: Systems Active
ESS (Public): Systems Active
Cloudera: Systems Active
New Relic: Systems Active
Boomi: Systems Active
AppsFlyer: Systems Active
Imperva: Systems Active
Bazaarvoice: Issues Detected
Optimizely: Systems Active
Electric: Systems Active
ABBYY: Systems Active

Frequently Asked Questions - InfluxDB

Is there an InfluxDB outage?
The current status of InfluxDB is: Systems Active
Where can I find the official status page of InfluxDB?
The official status page for InfluxDB is here
How can I get notified if InfluxDB is down or experiencing an outage?
To get notified of any status changes to InfluxDB, simply sign up for OutLogger's free monitoring service. OutLogger checks the official status of InfluxDB every few minutes and will notify you of any changes. You can view the status of all your cloud vendors in one dashboard. Sign up here
What does InfluxDB do?
Efficiently store and access time series data in a specialized database designed for speed, available in cloud, on-premises, or edge environments.