Is there a Mezmo outage?

Mezmo status: Systems Active

Last checked: 3 minutes ago

Get notified about any outages, downtime, or incidents for Mezmo and 1800+ other cloud vendors. Monitor up to 10 companies for free.

Subscribe for updates

Mezmo outages and incidents

Outage and incident data over the last 30 days for Mezmo.

There has been 1 outage or incident for Mezmo in the last 30 days.

Tired of searching for status updates?

Join OutLogger to be notified when any of your vendors or the components you use experience an outage. It's completely free and takes less than 2 minutes!

Sign Up Now

Components and Services Monitored for Mezmo

OutLogger tracks the status of these components for Mezmo:

Alerting Active
Archiving Active
Livetail Active
Log Ingestion (Agent/REST API/Code Libraries) Active
Log Ingestion (Heroku) Active
Log Ingestion (Syslog) Active
Search Active
Web App Active
Destinations Active
Ingestion / Sources Active
Processors Active
Web App Active

Latest Mezmo outages and incidents.

View the latest incidents for Mezmo and check for official updates:

Updates:

  • Time: Nov. 30, 2021, 8:41 p.m.
    Status: Postmortem
Update:
    **Dates:** Start Time: Monday, November 22, 2021, at 19:01 UTC; End Time: Tuesday, November 23, 2021, at 02:04 UTC; Duration: 7:03:00
    **What happened:** Newly submitted logs were not immediately available for Alerting, Searching, Live Tail, Graphing, and Timelines. Some accounts (about 25%) were affected more than others. For all accounts, the ingestion of logs was not interrupted and no data was lost.
    **Why it happened:** Upon investigation, we discovered that the service which parses all incoming log lines was working very slowly. This service is upstream to all our other services, such as alerting, live tail, archiving, and searching; consequently, all those services were also delayed. We isolated the slow parsing to the specific content of certain log lines. These log lines exposed an inefficiency in our line parsing service which resulted in exponential growth in the time needed to parse those lines; this in turn created a bottleneck that delayed the parsing of other log lines. The inefficiency has been present for some time, but went undetected until one account started sending a large volume of these problematic lines.
    **How we fixed it:** The line parsing service was updated to use a new algorithm that avoids the worst-case behaviors of the original, as well as improving performance for line parsing in general. From then on, the parsing service just needed time to process the backlog of logs sent to us by customers. Likewise, the downstream services – alerting, live tail, archiving, searching – needed time to process the logs now being sent to them by the parsing service. The recovery was quicker for about 75% of our customers and slower for the other 25%.
    **What we are doing to prevent it from happening again:** The new parsing methodology has improved our overall performance significantly. We are also actively pursuing further optimizations.
  • Time: Nov. 23, 2021, 2:04 a.m.
    Status: Resolved
    Update: This incident has been resolved. All services are fully operational.
  • Time: Nov. 23, 2021, 12:51 a.m.
    Status: Monitoring
    Update: Our services are mostly back to normal; we are monitoring.
  • Time: Nov. 22, 2021, 8:54 p.m.
    Status: Investigating
    Update: We are still actively investigating and working on a fix for the issue.
  • Time: Nov. 22, 2021, 7:53 p.m.
    Status: Investigating
    Update: There continue to be delays with processing newly sent log lines. Additionally, some alerts are not triggering.
  • Time: Nov. 22, 2021, 7:01 p.m.
    Status: Investigating
    Update: We are currently experiencing delays in searching for newly ingested log data, livetail, and alerts at this time. We are investigating and working quickly to mitigate the issue.
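
The failure mode in the postmortem above, where parse time grows exponentially for certain inputs, is worth illustrating. The postmortem does not say what the parsing service is built on, but catastrophic backtracking in a regular-expression engine is a common way to hit exactly this behavior. The Python sketch below is a hypothetical illustration, not Mezmo's actual parser:

```python
import re
import time

# Hypothetical illustration of exponential parse time (NOT Mezmo's parser):
# nested quantifiers force a backtracking regex engine to retry an
# exponential number of ways to split the input before giving up.
PATHOLOGICAL = re.compile(r"^(a+)+$")

def time_match(line: str) -> float:
    """Time a single match attempt against the pathological pattern."""
    start = time.perf_counter()
    PATHOLOGICAL.match(line)
    return time.perf_counter() - start

# The trailing "!" guarantees the match fails, so the engine backtracks
# through every possible split; each extra "a" roughly doubles the runtime.
for n in (18, 20, 22, 24):
    line = "a" * n + "!"
    print(f"length {n + 1}: {time_match(line):.3f}s")
```

This mirrors the postmortem's account: the inefficiency is invisible on ordinary lines and only surfaces when one account sends a large volume of worst-case lines, and the fix is an algorithm change (for regexes, a linear-time engine such as RE2) rather than added capacity.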

Updates:

  • Time: Nov. 10, 2021, 11:33 p.m.
    Status: Postmortem
Update:
    **Start Time:** Monday, November 8, 2021, at 23:28 UTC
    **End Time:** Tuesday, November 9, 2021, at 00:16 UTC
    **Duration:** 0:48:00
    **What happened:** Our Web UI returned the error message “This site can’t be reached” when users tried to login or load pages. The ingestion of logs was unaffected.
    **Why it happened:** The node our web service was running on had a failure with its network management software and became unreachable. Furthermore, the web service was only running on a single node, which is atypical – usually it runs on multiple nodes at once to improve performance and allow for redundancy. Both conditions were necessary for the Web UI to become unavailable.
    **How we fixed it:** We moved the web service to another node with functioning network management software, which made the Web UI available again. Later, we restarted the unreachable node, which restored it to normal usage.
    **What we are doing to prevent it from happening again:** We expect both necessary conditions – the failure of the network management software and that the web service was running on a single node – to be resolved by an already planned migration of our entire service to a new cloud-based environment. We are currently building monitoring of the availability of our Web UI so we can learn of any future failures as soon as possible.
  • Time: Nov. 9, 2021, 12:31 a.m.
    Status: Resolved
    Update: We experienced a temporary network issue that affected our web app at 00:00 UTC. Our team has taken remedial action to bring our services back to normal. All services are operational.

Updates:

  • Time: Nov. 3, 2021, 6:15 p.m.
    Status: Postmortem
Update:
    **Start Time:** Thursday, October 28, 2021, at 16:56:52 UTC
    **End Time:** Thursday, October 28, 2021, at 22:17:24 UTC
    **Duration:** 5:20:32
    **What happened:** Email notifications of all kinds, including from alerts, were delayed for about 5 hours. Notifications sent by Slack and Webhooks were not affected.
    **Why it happened:** Our email service provider (Sparkpost) experienced an incident that caused delays for all emails from the LogDNA service. We rely on this service to deliver email of all kinds, including notifications for alerts. Email messages were delayed and queued until our email service provider was able to recover. More information on the incident can be found at Sparkpost’s status page: https://status.sparkpost.com/incidents/bwl8dr6gwmts?u=ydzrh5x205pf
    **How we fixed it:** No remedial action was possible by LogDNA. We waited until the incident from Sparkpost, our email hosting provider, was resolved.
    **What we are doing to prevent it from happening again:** For this type of incident, LogDNA cannot take proactive preventive measures.
  • Time: Oct. 28, 2021, 11:22 p.m.
    Status: Resolved
    Update: Our email alerting feature has been restored to normal operation. All services are fully functional.
  • Time: Oct. 28, 2021, 9:02 p.m.
    Status: Identified
    Update: Our email provider reports that outbound message delivery has resumed but it is not yet fully operational. Our provider will keep unsent emails in their queue and continue to try to send them.
  • Time: Oct. 28, 2021, 6:13 p.m.
    Status: Investigating
    Update: Our email alerting feature is not working at the moment and customers are not receiving alerts by email in US region. This is due to an ongoing incident with our email provider; see https://status.sparkpost.com/incidents/bwl8dr6gwmts?u=ydzrh5x205pf for more detail. Other types of alerts, such as Slack and webhook, are still working. We are investigating.
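
A takeaway from this incident is that email, Slack, and webhook deliveries fail independently, which is why Slack and webhook alerts kept working while Sparkpost was down. A minimal fan-out sketch (the endpoints and addresses are placeholders; this is not LogDNA's alerting code):

```python
import json
import smtplib
from email.message import EmailMessage
from urllib.request import Request, urlopen

SLACK_WEBHOOK = "https://hooks.slack.com/services/T000/B000/XXXX"  # placeholder

def send_email(subject: str, body: str) -> None:
    msg = EmailMessage()
    msg["Subject"], msg["From"], msg["To"] = subject, "alerts@example.com", "oncall@example.com"
    msg.set_content(body)
    with smtplib.SMTP("smtp.example.com") as smtp:  # provider-backed relay (placeholder)
        smtp.send_message(msg)

def send_slack(text: str) -> None:
    req = Request(SLACK_WEBHOOK,
                  data=json.dumps({"text": text}).encode(),
                  headers={"Content-Type": "application/json"})
    urlopen(req, timeout=10)

def notify(subject: str, body: str) -> None:
    """Fan an alert out to every channel; one provider outage must not silence it."""
    delivered = False
    for send in (lambda: send_email(subject, body),
                 lambda: send_slack(f"{subject}: {body}")):
        try:
            send()
            delivered = True
        except Exception as exc:
            print(f"channel failed: {exc}")   # log the failure and keep going
    if not delivered:
        raise RuntimeError("all notification channels failed")
```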

Updates:

  • Time: Oct. 14, 2021, 5:40 p.m.
    Status: Postmortem
Update:
    **Start Time:** Thursday, October 7, 2021, at 17:52 UTC
    **End Time:** Thursday, October 7, 2021, at 18:46 UTC
    **Duration:** 0:54:00
    **What happened:** Our Web UI returned the error message “This site can’t be reached” when some users tried to login or load pages. The ingestion of logs was unaffected.
    **Why it happened:** The Telia carrier service in Europe experienced a major network routing outage caused by a faulty configuration update. The routing policy contained an error that impacted traffic to our service hosting provider, Equinix Metal. The Washington DC data center that houses our services was impacted. During this incident the app.logdna.com site was unreachable for some customers, depending on their location.
    **How we fixed it:** No remedial action was possible by LogDNA. We waited until the incident from Equinix Metal, our service hosting provider, was resolved.
    **What we are doing to prevent it from happening again:** For this type of incident, LogDNA cannot take proactive preventive measures.
  • Time: Oct. 7, 2021, 6:46 p.m.
    Status: Resolved
    Update: Logins are working for all customers. All services are operational.
  • Time: Oct. 7, 2021, 6:02 p.m.
    Status: Monitoring
    Update: Logins to our UI appear to be working again for all customers. We are monitoring for any further failures.
  • Time: Oct. 7, 2021, 5:52 p.m.
    Status: Identified
    Update: Some customers are not able to login to our UI. It appears this is due to an incident with our cloud provider Equinix. See their status page https://status.equinixmetal.com/incidents/wgg6kl862tl6.

Check the status of similar companies and alternatives to Mezmo

Hudl
Systems Active

OutSystems
Systems Active

Postman
Systems Active

Mendix
Systems Active

DigitalOcean
Issues Detected

Bandwidth
Systems Active

DataRobot
Systems Active

Grafana Cloud
Systems Active

SmartBear Software
Systems Active

Test IO
Systems Active

Copado Solutions
Systems Active

CircleCI
Systems Active

Frequently Asked Questions - Mezmo

Is there a Mezmo outage?
The current status of Mezmo is: Systems Active
Where can I find the official status page of Mezmo?
The official status page for Mezmo is here
How can I get notified if Mezmo is down or experiencing an outage?
To get notified of any status changes to Mezmo, simply sign up for OutLogger's free monitoring service. OutLogger checks the official status of Mezmo every few minutes and will notify you of any changes. You can view the status of all your cloud vendors in one dashboard. Sign up here
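
For reference, here is roughly what "checks the official status every few minutes" can look like in practice. This sketch assumes Mezmo's status page runs on Atlassian Statuspage, which publishes a public JSON summary at /api/v2/status.json; the hostname below is an assumption, so verify it against the official status page linked above.

```python
import time
import requests

# Assumed Statuspage-style endpoint; verify against Mezmo's official status page.
STATUS_URL = "https://status.mezmo.com/api/v2/status.json"

def current_status() -> str:
    """Fetch the page-level status description, e.g. 'All Systems Operational'."""
    data = requests.get(STATUS_URL, timeout=10).json()
    return data["status"]["description"]

def poll(interval_seconds: int = 180) -> None:
    last = None
    while True:
        try:
            status = current_status()
            if status != last:
                print(f"Mezmo status changed: {last!r} -> {status!r}")
                last = status
        except requests.RequestException as exc:
            print(f"status check failed: {exc}")
        time.sleep(interval_seconds)

if __name__ == "__main__":
    poll()
```
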
What does Mezmo do?
Mezmo, formerly known as LogDNA, is a cloud-based log management and observability platform that helps teams centralize, analyze, and route log data from their applications and infrastructure.