Last checked: 9 minutes ago
Get notified about any outages, downtime, or incidents for Mezmo and 1800+ other cloud vendors. Monitor 10 companies for free.
Outage and incident data over the last 30 days for Mezmo.
Join OutLogger to be notified when any of your vendors or the components you use experience an outage. It's completely free and takes less than 2 minutes!
Sign Up Now
OutLogger tracks the status of these components for Mezmo:
Component | Status |
---|---|
Log Analysis | Active |
Alerting | Active |
Archiving | Active |
Livetail | Active |
Log Ingestion (Agent/REST API/Code Libraries) | Active |
Log Ingestion (Heroku) | Active |
Log Ingestion (Syslog) | Active |
Search | Active |
Web App | Active |
Pipeline | Active |
Destinations | Active |
Ingestion / Sources | Active |
Processors | Active |
Web App | Active |
View the latest incidents for Mezmo and check for official updates:
Description: **Dates:** Start Time: Tuesday, January 18, 2022, at 21:00:00 UTC. End Time: Wednesday, January 19, 2022, at 05:30:00 UTC. Duration: 8:30:00.

**What happened:** Our Web UI returned an error when customers tried to log in or load pages. The errors persisted for short intervals – about 1-2 minutes each – then usage returned to normal. There were about 20 such intervals over the course of 4+ hours. The ingestion of logs was also halted during these 1-2 minute intervals; all LogDNA agents running in customer environments quickly resent the logs. Alerting was halted for the duration of the incident, and new sessions of Live Tail could not be started.

**Why it happened:** We updated our parser service, which required scaling down all pods and restarting them. A new feature of the parser is to flush memory to our Redis database upon restart. The flushing worked as intended, but it also overwhelmed the database and made it unavailable to other services. This caused the pods running our Web UI and ingestion service to go into a "Not Ready" state; our API gateway then stopped sending traffic to these pods. When customers tried to load pages in the Web UI, the API gateway returned an error. When the Redis database became unresponsive, our alerting service stopped working and new sessions of Live Tail could not be started. Our monitoring of these services was inadequate, and we were not alerted.

**How we fixed it:** Restarting the parser service was unavoidable. We split the restart process into small segments to keep the intervals of unavailability as short as possible. In practice, there were 20 small restarts over 4+ hours, each causing 1-2 minutes of unavailability. The Web UI and the ingestion service were fully operational by January 19 at 01:21 UTC. On January 19 at 05:30 UTC we manually restarted the Alerting and Live Tail services, which then returned to normal operation.

**What we are doing to prevent it from happening again:** We've added code to slow down the shutdown process for the parser service and stagger its impact on our Redis database over time. Restarting the parser is uncommon; before any future parser updates in production, we intend to run load tests of restarts to confirm Redis is no longer affected by the new flushing behavior. We will also improve our monitoring to alert us when services like Live Tail and Alerting are not functioning.
Status: Postmortem
Impact: Minor | Started At: Jan. 18, 2022, 10:59 p.m.
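The postmortem above says the fix was to slow the parser's shutdown so its memory flush is staggered over time rather than hitting Redis in one burst. The actual Mezmo/LogDNA code is not public; the snippet below is a minimal sketch of that general technique, assuming a hypothetical in-memory buffer and the standard redis-py client.

```python
import time

import redis  # redis-py client; an assumption, the real stack is not documented here

# Hypothetical in-memory state the parser would need to flush on shutdown.
BUFFER = [(f"parser:line:{i}", f"payload-{i}") for i in range(10_000)]

BATCH_SIZE = 500       # keys written per pipeline round trip (illustrative)
PAUSE_SECONDS = 0.25   # idle time between batches so Redis can serve other callers


def staggered_flush(client: redis.Redis) -> None:
    """Flush the buffer in small, paced batches instead of one large burst."""
    for start in range(0, len(BUFFER), BATCH_SIZE):
        pipe = client.pipeline(transaction=False)
        for key, value in BUFFER[start:start + BATCH_SIZE]:
            pipe.set(key, value)
        pipe.execute()             # one round trip for the whole batch
        time.sleep(PAUSE_SECONDS)  # spread the write load over time


if __name__ == "__main__":
    staggered_flush(redis.Redis(host="localhost", port=6379))
```

The batch size and pause are illustrative knobs; the point is that each round trip writes a bounded number of keys and Redis gets breathing room in between, instead of absorbing the entire flush at once.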
Description: This incident has been resolved.
Status: Resolved
Impact: Minor | Started At: Jan. 3, 2022, 7:12 p.m.
Description: **Dates:** Start Time: Tuesday, November 23, 2021, at 16:42 UTC. End Time: Wednesday, November 24, 2021, at 17:00 UTC. Duration: 24:18:00.

**What happened:** Newly submitted logs were not immediately available for Alerting, Searching, Live Tail, Graphing, and Timelines. Some accounts (about 25%) were affected more than others. For all accounts, the ingestion of logs was not interrupted and no data was lost.

**Why it happened:** Upon investigation, we discovered that the service which parses all incoming log lines was working very slowly. This service is upstream of all our other services, such as alerting, live tail, archiving, and searching; consequently, all those services were also delayed. We isolated the slow parsing to the specific content of certain log lines. These lines exposed an inefficiency in our line parsing service that resulted in exponential growth in the time needed to parse them; this in turn created a bottleneck that delayed the parsing of other log lines. The inefficiency had been present for some time but went undetected until one account started sending a large volume of these problematic lines.

**How we fixed it:** The line parsing service was updated to use a new algorithm that avoids the worst-case behavior of the original and improves line-parsing performance in general. From then on, the parsing service just needed time to process the backlog of logs sent to us by customers. Likewise, the downstream services – alerting, live tail, archiving, searching – needed time to process the logs now being sent to them by the parsing service. Recovery was quicker for about 75% of our customers and slower for the other 25%.

**What we are doing to prevent it from happening again:** The new parsing methodology has improved our overall performance significantly. We are also actively pursuing further optimizations.
Status: Postmortem
Impact: Minor | Started At: Nov. 23, 2021, 4:42 p.m.
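The postmortem above attributes the slowdown to specific log-line content that triggered exponential parse times, but it does not say what the underlying inefficiency was. Catastrophic regex backtracking is one common way a parser shows exactly that symptom, so the sketch below uses it purely as an illustration; the patterns and input are hypothetical and not taken from Mezmo's parser.

```python
import re
import time

# Nested quantifiers make this pattern ambiguous: a run of letters can be split
# between the inner '+' and the outer '+' in roughly 2**n ways, and on a line
# that ultimately cannot match, the engine backtracks through all of them.
AMBIGUOUS = re.compile(r"^([A-Za-z0-9]+\s*)+$")

# A looser but unambiguous check: a single character class, one linear pass.
LINEAR = re.compile(r"^[A-Za-z0-9\s]*$")


def time_match(pattern: re.Pattern, line: str) -> float:
    start = time.perf_counter()
    pattern.match(line)
    return time.perf_counter() - start


if __name__ == "__main__":
    # A "problematic" line: letters followed by one character that breaks the match.
    bad_line = "a" * 22 + "!"
    print(f"ambiguous pattern: {time_match(AMBIGUOUS, bad_line):.3f}s")  # seconds; doubles per extra letter
    print(f"linear pattern:    {time_match(LINEAR, bad_line):.6f}s")     # effectively instant
```

Each extra letter in the failing line roughly doubles the ambiguous pattern's runtime, which is the "exponential growth" failure mode the postmortem describes; the rewritten pattern avoids nested quantifiers and completes in a single linear scan.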
Join OutLogger to be notified when any of your vendors or the components you use experience an outage or downtime. Join for free - no credit card required.