
Is there a RebelMouse outage?

RebelMouse status: Systems Active

Last checked: 5 minutes ago

Get notified about any outages, downtime, or incidents for RebelMouse and 1,800+ other cloud vendors. Monitor 10 companies for free.

Subscribe for updates

RebelMouse outages and incidents

Outage and incident data over the last 30 days for RebelMouse.

There have been 0 outages or incidents for RebelMouse in the last 30 days.

Tired of searching for status updates?

Join OutLogger to be notified when any of your vendors or the components you use experience an outage. It's completely free and takes less than 2 minutes!

Sign Up Now

Components and Services Monitored for RebelMouse

OutLogger tracks the status of these components for RebelMouse:

AWS ec2-us-east-1 Active
AWS elb-us-east-1 Active
AWS RDS Active
AWS route53 Active
AWS s3-us-standard Active
AWS ses-us-east-1 Active
Braintree API Active
Braintree PayPal Processing Active
CDN Active
Celery Active
Content Delivery API Active
Discovery Active
EKS Cluster Active
Facebook Active
Fastly Amsterdam (AMS) Active
Fastly Hong Kong (HKG) Active
Fastly London (LHR) Active
Fastly Los Angeles (LAX) Active
Fastly New York (JFK) Active
Fastly Sydney (SYD) Active
Full Platform Active
Google Apps Analytics Active
Logged In Users Active
Media Active
Mongo Cluster Active
Pharos Active
RabbitMQ Active
Redis Cluster Active
Sentry Dashboard Active
Stats Active
Talaria Active
Twitter Active
WFE Active

Latest RebelMouse outages and incidents

View the latest incidents for RebelMouse and check for official updates:

Updates:

  • Time: May 31, 2024, 12:28 p.m.
    Status: Postmortem
    Update:
    # Chronology of the incident
    At 16:27 UTC, we detected a significant load on our servers. By 16:43 UTC, we had identified that the CoreDNS server was suffering performance degradation caused by the scaling out of applications within our Kubernetes cluster. The situation was further complicated at 16:55 UTC by performance degradation in our MongoDB database, caused by an excessive number of open connections initiated by the scaling applications. An emergency meeting was convened at 17:04 UTC, and the source of the excessive load on the DNS servers was identified at 17:16 UTC. Measures were immediately taken to optimize DNS queries across the Kubernetes cluster by reducing the number of DNS clients, which mainly involved halting non-essential services. These measures led to an initial recovery of performance at 17:30 UTC, and a fix was subsequently developed for the CoreDNS configuration, which had been identified as the root cause of the issues.
    Unfortunately, at 19:16 UTC a restart of CoreDNS led to performance degradation on the editorial clusters and revealed that one of the MongoDB replica set instances was unavailable. The restart caused a cache purge, which exposed the full extent of the MongoDB performance degradation. We determined that the MongoDB issue in turn had a significant impact on the performance of our CoreDNS systems, further complicating the situation. Recognizing the severity of the situation, we immediately launched a recovery process for the MongoDB replica set. As part of damage control, a preliminary attempt was made to reinstate the affected services; the reactivation led to significant setbacks, notably impacting the overall performance of the editorial web platform. The websites for end users and crawlers, however, maintained their functionality and continued to operate as expected with no major degradation. To reinforce our commitment to operational stability, we opted to keep the service offline pending a comprehensive investigation and resolution of the underlying MongoDB issues. These measures enabled a full recovery of the MongoDB system by 21:10 UTC. After recovery, we continued to monitor the situation for a set period before cautiously reactivating services, which marked the end of the active incident.
    # The impact of the incident
    While the websites for end users and crawlers functioned without meaningful disruption, the incident resulted in partial performance degradation of the editorial clusters and of non-essential services such as automations and JavaScript runtimes.
    # The underlying cause
    The incident was triggered by a combination of factors: an aggressive web crawler, a surge in cache invalidations due to layout updates, and a suboptimal CoreDNS configuration.
    # Actions taken & Preventive Measures
    We reconfigured the CoreDNS setup, significantly increasing the service's capacity. As a preventive measure, we will update our in-house cache logic to spread the cache revalidation process over time and prevent request spikes to the origins (a sketch of this staggering approach follows this list of updates).
  • Time: May 29, 2024, 7:10 p.m.
    Status: Resolved
    Update: This incident has been resolved.
  • Time: May 29, 2024, 5:36 p.m.
    Status: Monitoring
    Update: The fix has been implemented and deployed. We are monitoring the situation.
  • Time: May 29, 2024, 5:20 p.m.
    Status: Identified
    Update: We have identified the root cause and are working on a fix now.
  • Time: May 29, 2024, 5:13 p.m.
    Status: Identified
    Update: The issue has been identified and a fix is being implemented.
  • Time: May 29, 2024, 5:09 p.m.
    Status: Investigating
    Update: We are experiencing performance degradation and are investigating the cause now.
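
The preventive measure in the May 31 postmortem, spreading cache revalidation over time so that a layout update does not make many entries revalidate at once, is a common pattern. The sketch below is only an illustration of that idea in Python, not RebelMouse's actual cache logic; the class and parameter names are hypothetical.

```python
import random
import time

# Hypothetical sketch of staggered cache revalidation: each entry is refreshed
# at a randomized moment before its hard TTL, so entries cached at the same
# time do not all revalidate (and hit the origin) at the same time.
class JitteredCache:
    def __init__(self, fetch, ttl=300, jitter=0.2):
        self.fetch = fetch    # callable that loads a value from the origin
        self.ttl = ttl        # hard TTL in seconds
        self.jitter = jitter  # fraction of the TTL used for random early refresh
        self.store = {}       # key -> (value, soft_deadline, hard_deadline)

    def get(self, key):
        now = time.time()
        entry = self.store.get(key)
        if entry:
            value, soft_deadline, hard_deadline = entry
            if now < soft_deadline:
                return value                    # fresh: serve from cache
            if now < hard_deadline:
                return self._refresh(key, now)  # soft-expired: refresh early, staggered per entry
        return self._refresh(key, now)          # missing or hard-expired: must refresh

    def _refresh(self, key, now):
        value = self.fetch(key)
        # Randomize the soft deadline so entries stored together revalidate apart.
        soft_deadline = now + self.ttl * (1 - random.uniform(0, self.jitter))
        self.store[key] = (value, soft_deadline, now + self.ttl)
        return value
```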

Updates:

  • Time: May 2, 2024, 3:20 p.m.
    Status: Postmortem
    Update:
    **Chronology of the incident**
    * Apr 25, 2024, 05:12 PM UTC: RebelMouse received an alert from internal monitoring systems about a significantly increased error rate.
    * Apr 25, 2024, 05:12 PM UTC: The DevOps team started to check the systems.
    * Apr 25, 2024, 05:23 PM UTC: RebelMouse published a status portal message about the performance degradation.
    * Apr 25, 2024, 05:26 PM UTC: The problem was identified as an overload of Talaria (Smart Cache Service).
    * Apr 25, 2024, 05:42 PM UTC: Traffic was rerouted to bypass Talaria. This restored performance for end users.
    * Apr 25, 2024, 06:00 PM UTC: Configuration changes were applied to increase the resources for Talaria.
    * Apr 25, 2024, 06:09 PM UTC: Talaria was re-enabled.
    * Apr 26, 2024, 01:06 PM UTC: The incident was marked as resolved.
    **The impact of the incident**
    The incident resulted in performance degradation, leading to periods of unavailability for public pages or delays in publishing content.
    **The underlying cause**
    An increased amount of traffic caused the overload of Talaria.
    **Actions taken & Preventive Measures**
    We have reviewed the configuration of the Talaria service, added resources to it, and optimized the autoscaling rules. Our autoscaling system operates on preset rules designed to accommodate anticipated loads; as traffic patterns shift over time, these rules need to be reviewed and adjusted periodically (a generic illustration of this ratio-based scaling rule follows this list of updates).
  • Time: April 26, 2024, 1:06 a.m.
    Status: Resolved
    Update: This incident has been resolved.
  • Time: April 25, 2024, 6:06 p.m.
    Status: Monitoring
    Update: A fix has been implemented and we are monitoring the results.
  • Time: April 25, 2024, 5:23 p.m.
    Status: Investigating
    Update: We are currently investigating this issue.
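
The autoscaling review described in the April 25 postmortem follows the usual rule-based model: desired capacity is derived from the ratio of an observed metric to a preset target, and those preset targets are what need revisiting as traffic patterns shift. The snippet below is a generic illustration of that calculation (the same shape as the Kubernetes HPA formula); the metric names, targets, and bounds are made up, not RebelMouse's configuration.

```python
import math

# Ratio-based scaling rule: scale replica count by observed load / target load,
# then clamp to preset bounds. The target and bounds are the "preset rules"
# that must be reviewed as traffic patterns change.
def desired_replicas(current_replicas, observed_rps_per_replica,
                     target_rps_per_replica, min_replicas=2, max_replicas=50):
    ratio = observed_rps_per_replica / target_rps_per_replica
    desired = math.ceil(current_replicas * ratio)
    return max(min_replicas, min(max_replicas, desired))

# Example: 6 replicas each handling 180 req/s against a 120 req/s target
# scale out to 9 replicas.
print(desired_replicas(6, 180, 120))  # -> 9
```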

Updates:

  • Time: March 29, 2024, 9:50 a.m.
    Status: Postmortem
    Update:
    ## Chronology of the incident
    * Mar 27, 2024, 01:02 PM UTC: RebelMouse received an alert from internal monitoring systems about a slightly increased error rate.
    * Mar 27, 2024, 01:07 PM UTC: The DevOps team checked the systems, noticed a short load spike, and found that the error rate had already returned to normal.
    * Mar 27, 2024, 02:12 PM UTC: RebelMouse received another alert about a slightly increased error rate.
    * Mar 27, 2024, 02:14 PM UTC: RebelMouse team members observed a degradation in performance across certain services and promptly reported an incident. The degradation was intermittent rather than constant.
    * Mar 27, 2024, 02:22 PM UTC: A dedicated incident resolution team was assembled and began the investigation.
    * Mar 27, 2024, 02:44 PM UTC: Significant traffic anomalies were identified, prompting the allocation of extra resources to the cluster handling that traffic.
    * Mar 27, 2024, 02:53 PM UTC: The incident resolution team transitioned into monitoring mode.
    * Mar 27, 2024, 04:27 PM UTC: RebelMouse received an alert from internal monitoring systems about a significantly increased error rate.
    * Mar 27, 2024, 04:30 PM UTC: The incident resolution team decided to fully reroute the suspicious traffic to an independent cluster.
    * Mar 27, 2024, 04:57 PM UTC: The suspicious traffic was isolated in the independent cluster.
    * Mar 27, 2024, 05:03 PM UTC: RebelMouse published the status portal message.
    * Mar 27, 2024, 05:04 PM UTC: The incident resolution team shifted into monitoring mode and concurrently began exploring potential improvements in case of any recurrence of the issue.
    * Mar 27, 2024, 05:48 PM UTC: RebelMouse received client reports about the performance degradation as well as alerts from monitoring systems.
    * Mar 27, 2024, 05:58 PM UTC: The root cause of the problem was identified.
    * Mar 27, 2024, 06:05 PM UTC: The fix was implemented.
    * Mar 28, 2024: An independent cluster was established specifically for editorial traffic to safeguard its functionality from potential disruptions caused by other services.
    ## The impact of the incident
    The incident resulted in intermittent performance degradation, leading to periods of unavailability for editorial tools.
    ## The underlying cause
    The `Broken Links` service shared endpoints with critical editorial tools such as the `Entry Editor` and the `Posts Dashboard`. Periodically, this service generated long-running requests, causing health checks to fail and Kubernetes to deem the pods unhealthy. Kubernetes then terminated these pods and recreated them, which made the affected services temporarily unavailable during the restart (a generic sketch of keeping health checks responsive alongside slow requests follows this list of updates).
    ## Actions taken & Preventive Measures
    An independent cluster was established specifically for editorial traffic to safeguard its functionality from potential disruptions caused by other services.
  • Time: March 27, 2024, 7:51 p.m.
    Status: Resolved
    Update: This incident has been resolved.
  • Time: March 27, 2024, 6:08 p.m.
    Status: Monitoring
    Update: A fix has been implemented and we are monitoring the results.
  • Time: March 27, 2024, 5:50 p.m.
    Status: Investigating
    Update: We are currently investigating this issue.
  • Time: March 27, 2024, 5:31 p.m.
    Status: Monitoring
    Update: There was an isolated surge of highly suspicious traffic. While we aren't certain of its origin or source, we have isolated it away from the production clusters into its own environment. This means all production systems should have returned to normal, and we believe the problem is under control. We don't yet fully understand why this happened, so we will update this with more details soon.
  • Time: March 27, 2024, 5:02 p.m.
    Status: Investigating
    Update: We are experiencing performance degradation for logged-in users.
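
The root cause in the March 27 postmortem, long-running requests on a shared endpoint causing health checks to fail and Kubernetes to restart otherwise healthy pods, is a classic probe-starvation problem. RebelMouse's fix was to isolate the editorial traffic in its own cluster; at the application level, the same class of problem is usually mitigated by keeping the probe endpoint trivially cheap and serving it concurrently with slow handlers. The sketch below is a generic Python illustration under those assumptions, not the actual service; the paths and port are hypothetical.

```python
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
import time

# Keep the health-check path trivial and handle requests concurrently, so one
# slow endpoint cannot make the probe time out and trigger a pod restart.
class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/healthz":
            body = b"ok"                 # cheap, dependency-free liveness answer
        elif self.path == "/slow-report":
            time.sleep(30)               # stands in for a long-running request
            body = b"report done"
        else:
            self.send_error(404)
            return
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    # ThreadingHTTPServer serves each request in its own thread, so /healthz
    # stays responsive while /slow-report is busy.
    ThreadingHTTPServer(("0.0.0.0", 8080), Handler).serve_forever()
```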

Updates:

  • Time: Feb. 12, 2024, 7:41 p.m.
    Status: Postmortem
    Update:
    ## **Chronology of the incident**
    * Feb 8, 2024, 4:20 PM EST: An increase in the error rate was observed.
    * Feb 8, 2024, 4:25 PM EST: Monitoring systems detected anomalies, prompting the RebelMouse team to initiate an investigation.
    * Feb 8, 2024, 5:00 PM EST: Error rates surged significantly.
    * Feb 8, 2024, 5:16 PM EST: The RebelMouse team officially categorized the incident as Major and communicated it through the Status Portal.
    * Feb 8, 2024, 5:30 PM EST: The root cause was pinpointed: new instances could not be launched within the EKS cluster.
    * Feb 8, 2024, 6:00 PM EST: The RebelMouse team rectified the issue by updating the network configuration and manually launching the required instances to restore system performance.
    * Feb 8, 2024, 8:51 PM EST: RebelMouse initiated a support request regarding the AWS services outage.
    * Feb 8, 2024, 9:10 PM EST: The systems reconfiguration was completed, and the team entered monitoring mode.
    * Feb 8, 2024, 10:10 PM EST: The incident was officially resolved.
    * Feb 10, 2024, 2:30 AM EST: AWS confirmed an issue with the EKS service in the us-east-1 region during the specified period; services have been restored.
    ## **The impact of the incident**
    Multiple key services hosted in the AWS us-east-1 region for RebelMouse were impacted, leading to partial unavailability.
    ## **The underlying cause**
    The root cause was identified as a networking issue within AWS, specifically affecting the EKS service in the us-east-1 region. AWS acknowledged the issue, and its team actively worked on resolving it.
    ## **Actions taken**
    RebelMouse engineering teams were engaged as soon as the problem was identified. They worked diligently to resolve the issue as quickly as possible while keeping customers updated about the situation.
    ## **Preventive Measures**
    We have recognized the importance of enhancing our strategies for handling potential networking issues. Going forward, we will seek opportunities to mitigate these challenges by implementing extensive caching systems and boosting our redundant caching capacity (a rough illustration of serving stale cached content when the origin is impaired follows this list of updates).
  • Time: Feb. 9, 2024, 3:10 a.m.
    Status: Resolved
    Update: This incident has been resolved.
  • Time: Feb. 9, 2024, 2:12 a.m.
    Status: Monitoring
    Update: We have identified the root cause and deployed a fix, and we are now monitoring application performance.
  • Time: Feb. 8, 2024, 11:47 p.m.
    Status: Identified
    Update: We have replaced the last servers and expect performance to return to normal in a couple of minutes.
  • Time: Feb. 8, 2024, 11:05 p.m.
    Status: Identified
    Update: Newly added servers are functioning correctly and we are seeing an improvement in performance. We are continuing to add new servers and manually remove old ones that have issues.
  • Time: Feb. 8, 2024, 10:52 p.m.
    Status: Identified
    Update: We are manually adding new servers to increase capacity and resolve the performance degradation.
  • Time: Feb. 8, 2024, 10:30 p.m.
    Status: Identified
    Update: We have identified that the issue is caused by the Kubernetes cluster being unable to launch new instances. We are working on a fix right now.
  • Time: Feb. 8, 2024, 10:16 p.m.
    Status: Investigating
    Update: We are experiencing performance degradation and are investigating the root cause right now.
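
The preventive measure in the February postmortem, extensive caching with redundant capacity, generally means that previously cached responses can keep being served while origin capacity is impaired. The sketch below is a loose Python illustration of a stale-on-error fallback, not RebelMouse's implementation; the class name and TTL values are hypothetical.

```python
import time

# Stale-on-error cache: prefer fresh data, but fall back to a stale copy
# instead of failing when the origin is unavailable.
class StaleOnErrorCache:
    def __init__(self, fetch, ttl=60, max_stale=3600):
        self.fetch = fetch          # callable that loads a value from the origin
        self.ttl = ttl              # seconds a value counts as fresh
        self.max_stale = max_stale  # how long a stale value may still be served
        self.store = {}             # key -> (value, stored_at)

    def get(self, key):
        now = time.time()
        cached = self.store.get(key)
        if cached and now - cached[1] < self.ttl:
            return cached[0]              # fresh hit
        try:
            value = self.fetch(key)       # try the origin first
            self.store[key] = (value, now)
            return value
        except Exception:
            # Origin unavailable: serve the stale copy if it is not too old.
            if cached and now - cached[1] < self.max_stale:
                return cached[0]
            raise
```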

Check the status of similar companies and alternatives to RebelMouse

Anaplan: Systems Active
SpotOn: Systems Active
impact.com: Systems Active
Replicon: Systems Active
Adjust: Systems Active
Quantcast: Systems Active
Acoustic: Systems Active
Pantheon Operations: Systems Active
DealerSocket: Systems Active
Mixpanel: Systems Active
Invoca: Systems Active
Ceros: Systems Active

Frequently Asked Questions - RebelMouse

Is there a RebelMouse outage?
The current status of RebelMouse is: Systems Active
Where can I find the official status page of RebelMouse?
The official status page for RebelMouse is here
How can I get notified if RebelMouse is down or experiencing an outage?
To get notified of any status changes to RebelMouse, simply sign up for OutLogger's free monitoring service. OutLogger checks the official status of RebelMouse every few minutes and will notify you of any changes. You can view the status of all your cloud vendors in one dashboard. Sign up here
What does RebelMouse do?
RebelMouse is a publishing platform and creative agency that combines product and strategy to drive organic traffic, user growth, loyalty, and revenue success.