Last checked: 5 minutes ago
Outage and incident data over the last 30 days for RebelMouse.
OutLogger tracks the status of these components for RebelMouse:
Component | Status |
---|---|
AWS ec2-us-east-1 | Active |
AWS elb-us-east-1 | Active |
AWS RDS | Active |
AWS route53 | Active |
AWS s3-us-standard | Active |
AWS ses-us-east-1 | Active |
Braintree API | Active |
Braintree PayPal Processing | Active |
CDN | Active |
Celery | Active |
Content Delivery API | Active |
Discovery | Active |
EKS Cluster | Active |
 | Active |
Fastly Amsterdam (AMS) | Active |
Fastly Hong Kong (HKG) | Active |
Fastly London (LHR) | Active |
Fastly Los Angeles (LAX) | Active |
Fastly New York (JFK) | Active |
Fastly Sydney (SYD) | Active |
Full Platform | Active |
Google Apps Analytics | Active |
Logged In Users | Active |
Media | Active |
Mongo Cluster | Active |
Pharos | Active |
RabbitMQ | Active |
Redis Cluster | Active |
Sentry Dashboard | Active |
Stats | Active |
Talaria | Active |
 | Active |
WFE | Active |
View the latest incidents for RebelMouse and check for official updates:
Description:
# Chronology of the incident
At 16:27 UTC, we detected a significant load on our servers. By 16:43 UTC, we identified that the CoreDNS server was suffering performance degradation due to the scaling out of applications within our Kubernetes cluster. The situation was further complicated by performance degradation in our MongoDB database at 16:55 UTC, caused by an excessive number of open connections initiated by the scaling applications. An emergency meeting was convened at 17:04 UTC, and the source of the excessive load on the DNS servers was identified at 17:16 UTC. Measures were immediately taken to optimize DNS queries across the Kubernetes cluster by reducing the number of DNS clients, which mainly involved halting non-essential services. These measures led to an initial recovery of performance at 17:30 UTC, and a fix was subsequently developed for the CoreDNS configuration, identified as the root cause of the issues. Unfortunately, at 19:16 UTC, a restart of CoreDNS caused performance degradation on the editorial clusters and revealed that one of the MongoDB replica set instances was unavailable. The restart triggered a cache purge, which exposed the full extent of the MongoDB performance degradation. We identified that the MongoDB issue in turn significantly affected the performance of our CoreDNS systems, further complicating the situation. Recognizing the severity of the situation, we immediately launched a recovery process for the MongoDB replica set. As we progressed with damage control, a preliminary attempt was made to reinstate the affected services. Despite our efforts, the reactivation led to significant setbacks, notably impacting the overall performance of the editorial web platform. However, the websites for end users and crawlers maintained their functionality and continued to operate with no major degradation. To reinforce operational stability, we opted to keep the service offline pending a comprehensive investigation and resolution of the underlying issues with the MongoDB database. These measures facilitated a full recovery of the MongoDB system by 21:10 UTC. Post recovery, we continued to monitor the situation for a set period before cautiously reactivating services, which marked the end of the active incident.
# The impact of the incident
While the websites for end users and crawlers functioned without meaningful disruption, the incident resulted in partial performance degradation of the editorial clusters and of non-essential services such as automations and JavaScript runtimes.
# The underlying cause
The incident was triggered by a combination of factors: an aggressive web crawler, a surge in cache invalidations due to layout updates, and a suboptimal CoreDNS configuration.
# Actions taken & Preventive Measures
We reconfigured the CoreDNS setup, significantly increasing the service's capacity. As a preventive measure, we will update our in-house cache logic to spread the cache revalidation process over time and prevent request spikes to the origins. (A minimal sketch of this jittered-revalidation idea follows this record.)
Status: Postmortem
Impact: None | Started At: May 29, 2024, 5:09 p.m.
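The following is a minimal sketch of the preventive measure described above: spreading cache revalidation over time by jittering expiration timestamps so that entries cached at the same moment do not all hit the origin at once. RebelMouse's in-house cache logic is not public, so the constants and helper names below (`BASE_TTL_SECONDS`, `JITTER_FRACTION`, `expiry_timestamp`) are illustrative assumptions, not their actual implementation.

```python
import random
import time

# Illustrative sketch only: the real cache logic is not public, so these
# numbers are assumptions chosen for the example.
BASE_TTL_SECONDS = 300        # nominal cache lifetime (assumed)
JITTER_FRACTION = 0.2         # spread expirations across +/-20% of the TTL

def expiry_timestamp(now: float | None = None) -> float:
    """Return an expiration time with random jitter, so entries cached at the
    same moment (e.g. after a layout update) revalidate at different times
    instead of producing a synchronized spike of requests to the origin."""
    now = time.time() if now is None else now
    jitter = random.uniform(-JITTER_FRACTION, JITTER_FRACTION) * BASE_TTL_SECONDS
    return now + BASE_TTL_SECONDS + jitter

def is_stale(entry_expiry: float, now: float | None = None) -> bool:
    """True if the cached entry should be revalidated against the origin."""
    now = time.time() if now is None else now
    return now >= entry_expiry
```

Without the jitter, a mass cache invalidation (such as the layout-update surge named as a contributing cause) expires many entries simultaneously and concentrates the revalidation load into a single spike; the jitter spreads that load across the TTL window.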
Description:
**Chronology of the incident**
* Apr 25, 2024, 05:12 PM UTC: RebelMouse received an alert from internal monitoring systems about a significantly increased error rate.
* Apr 25, 2024, 05:12 PM UTC: The DevOps team started to check the systems.
* Apr 25, 2024, 05:23 PM UTC: RebelMouse published a status portal message about performance degradation.
* Apr 25, 2024, 05:26 PM UTC: The problem was identified as an overload of Talaria (Smart Cache Service).
* Apr 25, 2024, 05:42 PM UTC: Traffic was rerouted to bypass Talaria. This restored performance for end users.
* Apr 25, 2024, 06:00 PM UTC: Configuration changes were applied to increase the resources for Talaria.
* Apr 25, 2024, 06:09 PM UTC: Talaria was re-enabled.
* Apr 26, 2024, 01:06 PM UTC: The incident was marked as resolved.

**The impact of the incident**
The incident resulted in performance degradation, leading to periods of unavailability for public pages and delays in publishing content.

**The underlying cause**
An increased amount of traffic overloaded Talaria.

**Actions taken & Preventive Measures**
We reviewed the configuration of the Talaria service, added resources to it, and optimized the autoscaling rules. Our autoscaling system operates on preset rules designed to accommodate anticipated loads; as traffic patterns shift over time, those rules must be periodically reviewed and adjusted. (A sketch of such a scaling rule follows this record.)
Status: Postmortem
Impact: Minor | Started At: April 25, 2024, 5:23 p.m.
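As a rough illustration of the kind of preset autoscaling rule the postmortem says must be reviewed as traffic shifts, here is a minimal sketch. The per-replica capacity and replica bounds are invented for the example and are not Talaria's real configuration.

```python
import math

# Hypothetical constants: Talaria's actual autoscaling rules are not public.
REQUESTS_PER_REPLICA = 500   # assumed sustainable load per replica (req/s)
MIN_REPLICAS = 3
MAX_REPLICAS = 40

def desired_replicas(current_request_rate: float) -> int:
    """Scale the replica count proportionally to observed traffic, clamped to
    a configured range. If traffic patterns shift, the constants above must be
    revisited; that is the periodic review the postmortem describes."""
    wanted = math.ceil(current_request_rate / REQUESTS_PER_REPLICA)
    return max(MIN_REPLICAS, min(MAX_REPLICAS, wanted))
```

The failure mode in this incident is visible in the clamp: once traffic exceeds what `MAX_REPLICAS` (or the stale `REQUESTS_PER_REPLICA` estimate) can absorb, the rule stops adding capacity and the service overloads until the limits are raised.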
Description:
## Chronology of the incident
* Mar 27, 2024, 01:02 PM UTC: RebelMouse received an alert from internal monitoring systems about a slightly increased error rate.
* Mar 27, 2024, 01:07 PM UTC: The DevOps team checked the systems and noticed a short load spike; the error rate had already returned to normal.
* Mar 27, 2024, 02:12 PM UTC: RebelMouse received an alert from internal monitoring systems about a slightly increased error rate.
* Mar 27, 2024, 02:14 PM UTC: RebelMouse team members observed a temporary degradation in performance across certain services and promptly reported an incident.
* Mar 27, 2024, 02:22 PM UTC: A dedicated incident resolution team was assembled and began an investigation.
* Mar 27, 2024, 02:44 PM UTC: Significant traffic anomalies were identified, prompting the allocation of extra resources to the cluster handling that traffic.
* Mar 27, 2024, 02:53 PM UTC: The incident resolution team transitioned into monitoring mode.
* Mar 27, 2024, 04:27 PM UTC: RebelMouse received an alert from internal monitoring systems about a significantly increased error rate.
* Mar 27, 2024, 04:30 PM UTC: The incident resolution team decided to fully reroute the suspicious traffic to an independent cluster.
* Mar 27, 2024, 04:57 PM UTC: The suspicious traffic was isolated in the independent cluster.
* Mar 27, 2024, 05:03 PM UTC: RebelMouse published the status portal message.
* Mar 27, 2024, 05:04 PM UTC: The incident resolution team shifted into monitoring mode and began exploring potential enhancements in case the issue recurred.
* Mar 27, 2024, 05:48 PM UTC: RebelMouse received client reports of performance degradation, along with alerts from monitoring systems.
* Mar 27, 2024, 05:58 PM UTC: The root cause of the problem was identified.
* Mar 27, 2024, 06:05 PM UTC: The fix was implemented.
* Mar 28, 2024: An independent cluster was established specifically for editorial traffic to safeguard it from potential disruptions caused by other services.

## The impact of the incident
The incident resulted in intermittent performance degradation, leading to periods of unavailability for editorial tools.

## The underlying cause if known
The `Broken Links` service shared endpoints with critical editorial tools such as the `Entry Editor` and the `Posts Dashboard`. Periodically, this service generated long-running requests, causing health checks to fail and Kubernetes to deem the pods unhealthy. Kubernetes then terminated these pods and recreated them, making the affected services temporarily unavailable during the restart. (A simplified sketch of this failure mode follows this record.)

## Actions taken & Preventive Measures
An independent cluster was established specifically for editorial traffic to safeguard it from potential disruptions caused by other services.
Status: Postmortem
Impact: Minor | Started At: March 27, 2024, 5:02 p.m.
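The failure mode described in the underlying cause can be reproduced with a deliberately simplified, single-threaded HTTP server: while a long-running request is in flight, the health endpoint cannot answer, so a liveness probe with a short timeout would mark the pod unhealthy and trigger a restart. The endpoint paths and timings below are hypothetical, not RebelMouse's actual routes.

```python
import time
from http.server import BaseHTTPRequestHandler, HTTPServer

# Minimal single-threaded server illustrating the failure mode: while one
# long-running request (a hypothetical /broken-links scan) is being handled,
# the /healthz endpoint cannot respond, so a probe with a short timeout fails.

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/healthz":
            self.send_response(200)
            self.end_headers()
            self.wfile.write(b"ok")
        elif self.path == "/broken-links":
            time.sleep(30)  # simulate a long-running link scan
            self.send_response(200)
            self.end_headers()
            self.wfile.write(b"done")
        else:
            self.send_response(404)
            self.end_headers()

if __name__ == "__main__":
    # HTTPServer handles one request at a time, so /healthz calls queue behind
    # the slow /broken-links request, mirroring how shared endpoints led to
    # failed health checks and pod restarts.
    HTTPServer(("0.0.0.0", 8080), Handler).serve_forever()
```

Separating the slow workload onto its own deployment (the independent cluster described in the postmortem) keeps the health-check path responsive for the critical editorial tools.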
Description:
## **Chronology of the incident**
Feb 8, 2024, 4:20 PM EST – An increase in the error rate was observed.
Feb 8, 2024, 4:25 PM EST – Monitoring systems detected anomalies, prompting the RebelMouse team to initiate an investigation.
Feb 8, 2024, 5:00 PM EST – Error rates surged significantly.
Feb 8, 2024, 5:16 PM EST – The RebelMouse team officially categorized the incident as Major and communicated it through the Status Portal.
Feb 8, 2024, 5:30 PM EST – The root cause was pinpointed: new instances could not be launched within the EKS cluster.
Feb 8, 2024, 6:00 PM EST – The RebelMouse team mitigated the issue by updating the network configuration and manually launching the required instances to restore system performance.
Feb 8, 2024, 8:51 PM EST – RebelMouse initiated a support request regarding the AWS services outage.
Feb 8, 2024, 9:10 PM EST – Systems reconfiguration was completed, and the team entered monitoring mode.
Feb 8, 2024, 10:10 PM EST – The incident was officially resolved.
Feb 10, 2024, 2:30 AM EST – AWS confirmed an issue with the EKS service in the us-east-1 region during the specified period and that services had been restored.

## **The impact of the incident**
Multiple key services hosted in the AWS us-east-1 region for RebelMouse were impacted, leading to partial unavailability.

## **The underlying cause if known**
The root cause was identified as a networking issue within AWS, specifically affecting the EKS service in the us-east-1 region. AWS acknowledged the issue, and its team actively worked on resolving it.

## **Actions taken**
RebelMouse engineering teams were engaged as soon as the problem was identified. They worked diligently to resolve the issue as quickly as possible while keeping customers updated on the situation.

## **Preventive Measures**
We have recognized the importance of enhancing our strategies for handling potential networking issues. Going forward, we will look for opportunities to mitigate these challenges by implementing extensive caching systems and boosting our redundant caching capacity. (A sketch of a stale-on-error caching approach follows this record.)
Status: Postmortem
Impact: Minor | Started At: Feb. 8, 2024, 10:16 p.m.
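One common way to realize the "extensive caching" preventive measure is to serve stale cached responses when the origin is unreachable, so a regional networking issue degrades freshness rather than availability. The sketch below assumes a simple in-process cache; the function names, TTL, and error handling are illustrative assumptions, not RebelMouse's actual architecture.

```python
import time

# Illustrative stale-on-error cache: serve previously cached pages when the
# origin (e.g. a backend in an affected region) cannot be reached.
CACHE: dict[str, tuple[float, str]] = {}   # path -> (stored_at, body)
TTL_SECONDS = 60

def fetch_from_origin(path: str) -> str:
    """Placeholder for the real origin request; assumed to raise on failure."""
    raise ConnectionError("origin unreachable")

def get(path: str) -> str:
    cached = CACHE.get(path)
    if cached and time.time() - cached[0] < TTL_SECONDS:
        return cached[1]                    # fresh cache hit
    try:
        body = fetch_from_origin(path)
        CACHE[path] = (time.time(), body)
        return body
    except ConnectionError:
        if cached:
            return cached[1]                # serve stale rather than fail
        raise
```

Redundant caching capacity, as mentioned in the preventive measures, extends the same idea across regions so that cached copies remain reachable even when one region has networking problems.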