Is there a Mezmo outage?

Mezmo status: Systems Active

Last checked: 7 minutes ago

Get notified about any outages, downtime or incidents for Mezmo and 1800+ other cloud vendors. Monitor 10 companies, for free.

Subscribe for updates

Mezmo outages and incidents

Outage and incident data over the last 30 days for Mezmo.

There has been 1 outage or incident for Mezmo in the last 30 days.

Tired of searching for status updates?

Join OutLogger to be notified when any of your vendors or the components you use experience an outage. It's completely free and takes less than 2 minutes!

Sign Up Now

Components and Services Monitored for Mezmo

OutLogger tracks the status of these components for Mezmo:

Alerting Active
Archiving Active
Livetail Active
Log Ingestion (Agent/REST API/Code Libraries) Active
Log Ingestion (Heroku) Active
Log Ingestion (Syslog) Active
Search Active
Web App Active
Destinations Active
Ingestion / Sources Active
Processors Active
Web App Active

Latest Mezmo outages and incidents.

View the latest incidents for Mezmo and check for official updates:

Updates:

  • Time: Oct. 26, 2024, 8:23 p.m.
    Status: Resolved
    Update: The Pipeline UI is now fully functional.
  • Time: Oct. 26, 2024, 6:34 p.m.
    Status: Monitoring
    Update: The Pipeline WebUI is available and loading pages normally. We're working to resolve the root cause permanently and continuing to monitor.
  • Time: Oct. 26, 2024, 5:24 a.m.
    Status: Monitoring
    Update: The Pipeline WebUI is available, but at times pages are slow to load and metrics may be unavailable. We are taking remedial action and continuing to monitor.
  • Time: Oct. 26, 2024, 3:35 a.m.
    Status: Monitoring
    Update: We are continuing to monitor for any further issues.
  • Time: Oct. 26, 2024, 3:35 a.m.
    Status: Monitoring
    Update: The Pipeline WebUI is working now. We are monitoring.
  • Time: Oct. 26, 2024, 2:12 a.m.
    Status: Investigating
    Update: The Pipeline WebUI is still unavailable. Ingress and egress are unaffected. Our engineers are investigating.
  • Time: Oct. 26, 2024, 1 a.m.
    Status: Investigating
    Update: Our Pipeline WebUI is not loading pages. We are investigating.

Updates:

  • Time: Dec. 4, 2023, 1:46 p.m.
    Status: Postmortem
    Update:
      **Dates:** Start Time: Monday, December 4, 2023, at 10:29 UTC; End Time: Monday, December 4, 2023, at 12:01 UTC; Duration: 92 minutes
      **What happened:** Web UI users were logged out frequently – usually within 1-2 minutes of logging in. Users could successfully login again without any issues, but the session would expire shortly afterwards.
      **Why it happened:** It was identified that both Web UI pods and the Redis database pods, which are responsible for storing user sessions, experienced a critical memory shortage, leading to uncontrolled data purging. When this same issue happened in July 2023, our engineering team deployed a fix that enhanced how Redis stores the user session keys. This fix successfully prevented any recurrence of the problem until today. The team is still determining what made it exceed the memory limit this time.
      **How we fixed it:** Initially, the Web UI pods were restarted, but that did not resolve the problem permanently. The engineering team then restarted the Redis database pods and the session stopped expiring.
      **What we are doing to prevent it from happening again:** The team will revise the previous fix, including implementing a mechanism for the pod to automatically restart upon reaching its limit and setting up alerts to notify an engineer when it's approaching that threshold.
  • Time: Dec. 4, 2023, 1:19 p.m.
    Status: Resolved
    Update: The issue has been resolved, and no further issues have been observed with user sessions.
  • Time: Dec. 4, 2023, 12:13 p.m.
    Status: Monitoring
    Update: We have implemented a fix for the user session timeouts on the Web UI, but will continue to monitor the situation closely.
  • Time: Dec. 4, 2023, 12:06 p.m.
    Status: Investigating
    Update: The Web UI is currently encountering user session timeouts, prompting customers to log in every 1-2 minutes. Our team is actively investigating the root cause of this issue, while the remaining aspects of the service remain fully functional.
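
The prevention plan in the December 4 postmortem above (restart a pod when it reaches its memory limit, alert an engineer as it approaches that limit) can be illustrated with a small decision function. This is only a sketch under assumptions: the function name `memory_action`, the 80% alert ratio, and the returned labels are hypothetical and are not taken from Mezmo's actual tooling.

```python
def memory_action(used_bytes: int, limit_bytes: int, alert_ratio: float = 0.8) -> str:
    """Decide what to do for a pod given its memory usage and its limit.

    Hypothetical policy (an assumption, not Mezmo's real configuration):
      - "restart" once usage reaches the limit
      - "alert"   once usage reaches alert_ratio of the limit
      - "ok"      otherwise
    """
    if limit_bytes <= 0:
        raise ValueError("limit_bytes must be positive")
    ratio = used_bytes / limit_bytes
    if ratio >= 1.0:
        return "restart"
    if ratio >= alert_ratio:
        return "alert"
    return "ok"
```

In practice a container orchestrator or monitoring system would usually enforce this kind of policy; the function only shows the two thresholds the postmortem describes.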

Updates:

  • Time: Sept. 6, 2023, 12:38 a.m.
    Status: Postmortem
    Update:
      **Dates:** Start Time: 8:32 pm UTC, Tuesday August 29th, 2023; End Time: 10:04 pm UTC, Tuesday August 29th, 2023; Duration: 92 minutes
      **What happened:** Our Kong Gateway service stopped functioning and all connection requests to our ingestion service and web service failed. The Web UI did not load and log lines could not be sent by either our agent or API. Log lines sent using syslog were unaffected. Kong was unavailable for two periods of time: one lasting 27 minutes (8:32 pm UTC to 8:59 pm UTC) and another lasting 9 minutes (9:43 pm UTC to 9:52 pm UTC). Once Kong became available, the Web UI was immediately accessible again. Agents resent locally cached log lines (as did any APIs implemented with retry strategies). Our service then processed the backlog of log lines, passing them to downstream services such as alerting, live tail, archiving, and indexing (which makes lines visible in the Web UI for searching, graphing, and timelines). The extra processing was completed ~20 minutes after Kong returned to normal usage the first time, and ~10 minutes after the second time.
      **Why it happened:** The pods running our Kong Gateway were overwhelmed with connection requests. CPU increased to a point that health checks started to fail and the pods were shut down. We’ve determined through research and experimentation that the cause was a sudden, brief increase in the volume of traffic directed to our service. Our service is designed to handle increases in traffic, but these were approximately 100 times above normal usage. The source(s) of the traffic are unknown. The increase came in two spikes, which correspond to the two periods when Kong became unavailable.
      **How we fixed it:** We manually scaled up the number of pods devoted to running our Kong Gateway. During the first spike of traffic, we doubled the number of pods; during the second, we quadrupled the number. This certainly helped speed up the processing of the backlog of log lines sent by agents once Kong was again available. It’s unclear whether the higher number of pods would have been able to process the spikes of traffic as they were happening.
      **What we are doing to prevent it from happening again:** We are running our Kong service with more pods so there are more resources to handle any similar spikes in traffic. We will add auto-scaling to the Kong service so more pods are made available automatically as needed. We’ll also add metrics to identify the origin of any similar spikes in traffic.
  • Time: Aug. 30, 2023, 12:17 a.m.
    Status: Resolved
    Update: This incident has been resolved.
  • Time: Aug. 29, 2023, 9:15 p.m.
    Status: Investigating
    Update: The webUI is loading consistently now, but we are still investigating.
  • Time: Aug. 29, 2023, 9:01 p.m.
    Status: Investigating
    Update: Our WebUI is not loading pages consistently. We are investigating. [Reference #3204]
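
The August 29 postmortem above notes that agents, and any APIs implemented with retry strategies, resent log lines once the gateway was available again. A minimal sketch of such a client-side retry strategy follows, using exponential backoff. Everything named here is an assumption for illustration: `send_with_retry`, its parameters, and the `send` callback are hypothetical and do not represent the Mezmo agent or any Mezmo API.

```python
import time
from typing import Callable, List


def send_with_retry(
    send: Callable[[List[str]], bool],
    lines: List[str],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Send a batch of log lines, retrying with a doubling delay on failure.

    `send` is a caller-supplied (hypothetical) function that returns True on
    success, and returns False or raises ConnectionError on failure.
    Returns True if any attempt succeeds, False if all attempts fail.
    """
    for attempt in range(max_attempts):
        try:
            if send(lines):
                return True
        except ConnectionError:
            pass  # treat a connection failure like an unsuccessful send
        if attempt < max_attempts - 1:
            # wait base_delay, then 2x, 4x, ... before the next attempt
            sleep(base_delay * (2 ** attempt))
    return False
```

Keeping the unsent batch in memory (or on disk) until `send_with_retry` returns True is what lets a client recover lines after a short gateway outage; a caller that gets False would need to keep the batch cached and try again later.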

Updates:

  • Time: June 28, 2023, 6:30 p.m.
    Status: Postmortem
    Update:
      **Dates:** Start Time: Monday, June 19, 2023, at 10:31 UTC; End Time: Monday, June 19, 2023, at 12:35 UTC; Duration: 124 minutes
      **What happened:** Users were being logged out of our WebUI frequently – within 1-2 minutes of logging in. Users could successfully login again, but the new session would also expire quickly.
      **Why it happened:** The cache of logged in users held in our Redis database was being cleared every 1-2 minutes. This caused all user sessions to expire and new logins to be required. We have yet to ascertain why the cache was being periodically cleared at frequent intervals.
      **How we fixed it:** We restarted the pods running the Redis database and the cache behavior returned to normal.
      **What we are doing to prevent it from happening again:** We will investigate further to learn why the Redis cache was being frequently cleared.
  • Time: June 19, 2023, 1:58 p.m.
    Status: Resolved
    Update: This incident has been resolved.
  • Time: June 19, 2023, 11:27 a.m.
    Status: Monitoring
    Update: The fix was implemented and we are now monitoring the user login sessions.
  • Time: June 19, 2023, 11:13 a.m.
    Status: Identified
    Update: The issue has been identified, and a fix is being implemented.
  • Time: June 19, 2023, 11:09 a.m.
    Status: Investigating
    Update: User sessions to our Web UI are timing out and customers using the UI have to log in every 1-2 minutes. We are investigating why this is happening, but the rest of the service is fully functional. No other components are affected.

Updates:

  • Time: May 8, 2023, 7:09 p.m.
    Status: Postmortem
    Update:
      **Dates:** Start Time: Monday, May 1, 2023, at 19:55 UTC; End Time: Monday, May 1, 2023, at 20:11 UTC; Duration: 16 minutes
      **What happened:** The WebUI was unresponsive, returning an error of “failure to get a peer from the ring-balancer.”
      **Why it happened:** All Mezmo services run within a service mesh. The portion of the mesh dedicated to the pods running our Mongo database began receiving many connection requests, more than its allocated CPU and memory could handle at once. This portion of the mesh (which itself runs on pods) quickly ran out of memory. This made the Mongo database unavailable to other services. The WebUI relies entirely on Mongo for account information and therefore became unresponsive, returning an error of “failure to get a peer from the ring-balancer.” While the immediate reason for the incident is clear, the root cause is still unknown. We suspect there was a change in user usage patterns (e.g. increased traffic, login attempts, etc) which triggered the incident.
      **How we fixed it:** We removed the WebUI from the service mesh. The Mongo service has more CPU and memory resources allocated to it and was able to accept the high level of connection requests successfully. WebUI usage immediately returned to normal.
      **What we are doing to prevent it from happening again:** We will change the default settings for the service mesh to allocate more CPU and memory resources, permanently. Afterwards, we will add the Mongo service back to the service mesh.
  • Time: May 1, 2023, 8:27 p.m.
    Status: Resolved
    Update: This incident has been resolved.
  • Time: May 1, 2023, 8:18 p.m.
    Status: Identified
    Update: The Web UI is not accessible.

Check the status of similar companies and alternatives to Mezmo

Hudl: Systems Active
OutSystems: Systems Active
Postman: Systems Active
Mendix: Systems Active
DigitalOcean: Issues Detected
Bandwidth: Issues Detected
DataRobot: Systems Active
Grafana Cloud: Systems Active
SmartBear Software: Systems Active
Test IO: Systems Active
Copado Solutions: Systems Active
CircleCI: Systems Active

Frequently Asked Questions - Mezmo

Is there a Mezmo outage?
The current status of Mezmo is: Systems Active
Where can I find the official status page of Mezmo?
The official status page for Mezmo is here
How can I get notified if Mezmo is down or experiencing an outage?
To get notified of any status changes to Mezmo, simply sign up for OutLogger's free monitoring service. OutLogger checks the official status of Mezmo every few minutes and will notify you of any changes. You can view the status of all your cloud vendors in one dashboard. Sign up here
What does Mezmo do?
Mezmo is a cloud-based tool that helps application owners manage and analyze important business data across different areas.