
Is there a Mezmo outage?

Mezmo status: Systems Active

Last checked: 9 minutes ago

Get notified about any outages, downtime, or incidents for Mezmo and 1800+ other cloud vendors. Monitor 10 companies for free.

Subscribe for updates

Mezmo outages and incidents

Outage and incident data over the last 30 days for Mezmo.

There has been 1 outage or incident for Mezmo in the last 30 days.


Tired of searching for status updates?

Join OutLogger to be notified when any of your vendors or the components you use experience an outage. It's completely free and takes less than 2 minutes!

Sign Up Now

Components and Services Monitored for Mezmo

OutLogger tracks the status of these components for Mezmo:

Alerting: Active
Archiving: Active
Livetail: Active
Log Ingestion (Agent/REST API/Code Libraries): Active
Log Ingestion (Heroku): Active
Log Ingestion (Syslog): Active
Search: Active
Web App: Active
Destinations: Active
Ingestion / Sources: Active
Processors: Active
Web App: Active

Latest Mezmo outages and incidents

View the latest incidents for Mezmo and check for official updates:

Updates:

  • Time: Jan. 19, 2022, 7:55 p.m.
    Status: Postmortem
    Update:
    **Dates:** Start Time: Tuesday, January 18, 2022, at 21:00:00 UTC. End Time: Wednesday, January 19, 2022, at 05:30:00 UTC. Duration: 8:30:00.
    **What happened:** Our Web UI returned an error when customers tried to log in or load pages. The errors persisted for short intervals – about 1-2 minutes each – then usage returned to normal. There were about 20 such intervals over the course of 4+ hours. The ingestion of logs was also halted during these 1-2 minute intervals. All LogDNA agents running on customer environments quickly resent the logs. Alerting was halted for the duration of the incident and new sessions of Live Tail could not be started.
    **Why it happened:** We updated our parser service, which required scaling down all pods and restarting them. A new feature of the parser is to flush memory to our Redis database upon restart. The new flushing worked as intended, but also overwhelmed the database and made it unavailable to other services. This caused the pods running our Web UI and ingestion service to go into a “Not Ready” state; our API gateway then stopped sending traffic to these pods. When customers tried to load pages in the Web UI, the API gateway returned an error. When the Redis database became unresponsive, our alerting service stopped working and new sessions of Live Tail could not be started. Our monitoring of these services was inadequate and we were not alerted.
    **How we fixed it:** Restarting the parser service was unavoidable, so we split the restart process into small segments to keep the intervals of unavailability as short as possible. In practice, there were 20 small restarts over 4+ hours, each causing 1-2 minutes of unavailability. The Web UI and the ingestion service were fully operational by January 19, 01:21 UTC. On January 19, 05:30 UTC we manually restarted the Alerting and Live Tail services, which then returned to normal usage.
    **What we are doing to prevent it from happening again:** We’ve added code to slow down the shutdown process for the parser service to stagger the impact on our Redis database over time. Restarting the parser is uncommon; we intend to run load tests of restarts before any future updates of the parser in production, to confirm Redis is no longer affected by the new flushing behavior. We will also improve our monitoring to alert us when services like Live Tail and Alerting are not functioning.
  • Time: Jan. 19, 2022, 1:21 a.m.
    Status: Resolved
    Update: This incident has been resolved. All services are fully operational.
  • Time: Jan. 18, 2022, 11:45 p.m.
    Status: Identified
    Update: We've identified the source of the failure and are taking action to correct it. The WebUI continues to be unavailable at times for intervals of 1-2 minutes each.
  • Time: Jan. 18, 2022, 10:59 p.m.
    Status: Investigating
    Update: Our WebUI is under maintenance and may not load pages consistently. We are working to recover as soon as possible.

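The postmortem above describes staggering the parser's shutdown so that its memory flush does not hit Redis all at once. The snippet below is a rough, hypothetical sketch of that general idea, not Mezmo's actual code: buffered entries are written to Redis in small batches with a pause between batches. The batch size, pause, key layout, and use of the redis-py client are all assumptions for illustration.

```python
import time
import redis  # redis-py client

# Hypothetical illustration of a staggered flush on shutdown; not Mezmo's code.
# BATCH_SIZE and PAUSE_SECONDS are tuning knobs chosen purely for illustration.
BATCH_SIZE = 500
PAUSE_SECONDS = 0.25

def staggered_flush(entries, client: redis.Redis) -> None:
    """Write buffered (key, value) entries to Redis in small batches,
    pausing between batches so a single shutdown does not saturate Redis."""
    for start in range(0, len(entries), BATCH_SIZE):
        batch = entries[start:start + BATCH_SIZE]
        pipe = client.pipeline(transaction=False)
        for key, value in batch:
            pipe.set(key, value)
        pipe.execute()             # one round trip per batch instead of per entry
        time.sleep(PAUSE_SECONDS)  # spread the load over time, as the postmortem describes

if __name__ == "__main__":
    buffered = [(f"parser:line:{i}", f"payload-{i}") for i in range(2_000)]
    staggered_flush(buffered, redis.Redis(host="localhost", port=6379))
```
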
Updates:

  • Time: Jan. 3, 2022, 8:21 p.m.
    Status: Resolved
    Update: This incident has been resolved.
  • Time: Jan. 3, 2022, 7:52 p.m.
    Status: Monitoring
    Update: A fix has been implemented and we are monitoring searching and live tail performance.
  • Time: Jan. 3, 2022, 7:12 p.m.
    Status: Investigating
    Update: We are currently experiencing delays for live tail and searching of newly ingested log data.

Updates:

  • Time: Nov. 30, 2021, 8:44 p.m.
    Status: Postmortem
    Update:
    **Dates:** Start Time: Tuesday, November 23, 2021, at 16:42 UTC. End Time: Wednesday, November 24, 2021, at 17:00 UTC. Duration: 24:18:00.
    **What happened:** Newly submitted logs were not immediately available for Alerting, Searching, Live Tail, Graphing, and Timelines. Some accounts (about 25%) were affected more than others. For all accounts, the ingestion of logs was not interrupted and no data was lost.
    **Why it happened:** Upon investigation, we discovered that the service which parses all incoming log lines was working very slowly. This service is upstream of all our other services, such as alerting, live tail, archiving, and searching; consequently, all those services were also delayed. We isolated the slow parsing to the specific content of certain log lines. These log lines exposed an inefficiency in our line parsing service which resulted in exponential growth in the time needed to parse those lines; this in turn created a bottleneck that delayed the parsing of other log lines. The inefficiency had been present for some time, but went undetected until one account started sending a large volume of these problematic lines.
    **How we fixed it:** The line parsing service was updated to use a new algorithm that avoids the worst-case behavior of the original and improves line parsing performance in general. From then on, the parsing service just needed time to process the backlog of logs sent to us by customers. Likewise, the downstream services – alerting, live tail, archiving, searching – needed time to process the logs now being sent to them by the parsing service. The recovery was quicker for about 75% of our customers and slower for the other 25%.
    **What we are doing to prevent it from happening again:** The new parsing methodology has improved our overall performance significantly. We are also actively pursuing further optimizations.
  • Time: Nov. 24, 2021, 5 p.m.
    Status: Resolved
    Update: This incident has been resolved. All services are fully operational.
  • Time: Nov. 24, 2021, 6:57 a.m.
    Status: Investigating
    Update: Our services are recovering and there may be some delays in Alerting, Searching, Live Tail, Graphing, and Timelines. We are monitoring.
  • Time: Nov. 23, 2021, 8:03 p.m.
    Status: Investigating
    Update: Delays are still being experienced by some customers. We continue to work towards a solution.
  • Time: Nov. 23, 2021, 4:42 p.m.
    Status: Investigating
    Update: Some customers are experiencing delays in Alerting, Searching, Live Tail, Graphing, and Timelines. We are investigating and working to mitigate the issue.

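The November postmortem above attributes the delays to specific log lines that made parsing time grow exponentially. A classic example of that failure mode is catastrophic backtracking in a regular expression; the postmortem does not say which parser construct was at fault, so the sketch below is only a hypothetical illustration of the general idea, contrasting a backtracking-prone pattern with a simpler linear-time check.

```python
import re
import time

# Hypothetical illustration of exponential worst-case parsing; not Mezmo's parser.
# The nested quantifier in SLOW_PATTERN backtracks catastrophically on inputs
# that almost match, so parse time explodes as the line gets longer.
SLOW_PATTERN = re.compile(r"^(a+)+$")

def parse_slow(line: str) -> bool:
    return SLOW_PATTERN.match(line) is not None

def parse_fast(line: str) -> bool:
    # Equivalent check without nested quantifiers: linear in the line length.
    return len(line) > 0 and all(ch == "a" for ch in line)

if __name__ == "__main__":
    # A line that *almost* matches triggers the worst case for the slow pattern.
    line = "a" * 25 + "!"
    for name, fn in (("fast", parse_fast), ("slow", parse_slow)):
        start = time.perf_counter()
        fn(line)
        print(f"{name}: {time.perf_counter() - start:.4f}s")
```
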
Check the status of similar companies and alternatives to Mezmo

Hudl: Systems Active
OutSystems: Systems Active
Postman: Systems Active
Mendix: Systems Active
DigitalOcean: Issues Detected
Bandwidth: Systems Active
DataRobot: Systems Active
Grafana Cloud: Systems Active
SmartBear Software: Systems Active
Test IO: Systems Active
Copado Solutions: Systems Active
CircleCI: Systems Active

Frequently Asked Questions - Mezmo

Is there a Mezmo outage?
The current status of Mezmo is: Systems Active
Where can I find the official status page of Mezmo?
The official status page for Mezmo is here
How can I get notified if Mezmo is down or experiencing an outage?
To get notified of any status changes to Mezmo, simply sign up for OutLogger's free monitoring service. OutLogger checks the official status of Mezmo every few minutes and will notify you of any changes. You can view the status of all your cloud vendors in one dashboard. Sign up here. (A rough sketch of this kind of status polling appears after this FAQ.)
What does Mezmo do?
Mezmo (formerly LogDNA) is a cloud-based log management and observability platform that lets teams ingest, search, alert on, and route log and telemetry data.
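
As noted in the FAQ above, OutLogger polls the vendor's official status page every few minutes and notifies subscribers of changes. The snippet below is a minimal, hypothetical sketch of that kind of poller, not OutLogger's actual implementation; the status URL, polling interval, and notification hook are placeholders.

```python
import time
import urllib.request

# Hypothetical sketch of a status poller; not OutLogger's implementation.
STATUS_URL = "https://status.mezmo.com/"  # placeholder; substitute the vendor's status page URL
CHECK_INTERVAL = 300  # seconds, i.e. "every few minutes"

def fetch_status_page() -> str:
    """Download the raw status page. A real service would parse the reported
    component statuses instead of diffing raw HTML."""
    with urllib.request.urlopen(STATUS_URL, timeout=10) as resp:
        return resp.read().decode("utf-8", errors="replace")

def notify(message: str) -> None:
    # Placeholder notification hook; a real service would send email, Slack, etc.
    print(f"NOTIFY: {message}")

def poll_forever() -> None:
    last_snapshot = None
    while True:
        try:
            snapshot = fetch_status_page()
            if last_snapshot is not None and snapshot != last_snapshot:
                notify("Mezmo status page changed; check for an incident.")
            last_snapshot = snapshot
        except OSError as err:
            notify(f"Could not reach the status page: {err}")
        time.sleep(CHECK_INTERVAL)

if __name__ == "__main__":
    poll_forever()
```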