Is there a Firstup outage?

Firstup status: Systems Active

Last checked: 7 minutes ago


Firstup outages and incidents

Outage and incident data over the last 30 days for Firstup.

There have been 2 outages or incidents for Firstup in the last 30 days.



Latest Firstup outages and incidents.

View the latest incidents for Firstup and check for official updates:

Updates:

  • Time: May 13, 2024, 10:13 p.m.
    Status: Postmortem
    Update:
      Summary: On April 22nd, 2024, at 6:15 AM PT (13:15 UTC) we began receiving reports of scheduled campaigns that were delayed or had not been published at all. Two sources of the delays were identified and subsequently addressed in two separate hotfixes.
      Impact: Impact was most visible in campaign reporting delivery metrics, which showed that campaigns had either not gone out at the expected time or that email deliveries arrived well after the scheduled time. Not all campaigns were affected, and actual delays ranged from several minutes up to an hour or longer in a small number of instances.
      Root Cause: The root cause was determined to be a scheduled database upgrade performed on April 19th, which degraded the performance of the scheduling service. There were two underlying observable symptoms:
        1. On April 22nd, the actual delivery of some emails was slower than expected because several database queries were not optimized for the new database software version deployed on April 19th. After the upgrade, these queries ran slower under load levels higher than those initially tested against.
        2. The number of scheduled campaigns not executing at the precise scheduled time increased dramatically, also following the database upgrade, as a result of several newly uncovered bugs in the scheduling service itself.
      Mitigation: A number of mitigation measures were put into place over the course of several days to address different aspects of this platform incident.
        • The database query optimizations were deployed in a hotfix on April 22nd at 4:30 PM PT (23:30 UTC). This was specifically aimed at addressing the email delivery slowness issue.
        • For customers who opened support tickets about specific delayed scheduled campaigns, those campaigns were manually published as part of the individual support tickets. In addition, a separate query was run on an as-needed basis to proactively identify other campaigns in a similar state and manually publish those as well.
        • A second hotfix was deployed on April 24th at 11:30 AM PT (18:30 UTC) to add an automated backstop measure that catches and publishes any campaigns that had been scheduled for an earlier time but had not actually started.
      Recurrence Prevention: The following actions have been committed to in order to fully resolve the incident and eliminate reliance on the mitigation measure currently in place.
        • Create improved platform alerting for campaign delivery times to identify and address a degraded state earlier.
        • Fix the remaining 3 bugs uncovered during the incident investigation and make the scheduler service itself more robust.
  • Time: May 13, 2024, 10:12 p.m.
    Status: Resolved
    Update: Marking incident as resolved and all components fully operational. The automated mitigation measure has proven effective while the remaining recurrence prevention items are completed.
  • Time: April 24, 2024, 5:29 p.m.
    Status: Monitoring
    Update: We continue to monitor the services that were impacted for any further issues.
  • Time: April 23, 2024, 2:06 a.m.
    Status: Monitoring
    Update: We have deployed a hotfix to address the database performance issue as it relates to the campaign email delivery queue. All affected services remain fully stable and available. We will be placing these services under monitoring for now.
  • Time: April 22, 2024, 7:38 p.m.
    Status: Identified
    Update: We have identified a database performance issue, and are working to address it. The email delivery pipeline queue was backed up, resulting in campaign email deliveries being delayed. This queue has since caught up, and campaign emails are now being delivered as expected. We will provide another update as soon as more information is made available.
  • Time: April 22, 2024, 7:01 p.m.
    Status: Investigating
    Update: We continue to investigate the delays in campaign email deliveries. Another update in 1 hour.
  • Time: April 22, 2024, 5:50 p.m.
    Status: Investigating
    Update: We continue to investigate the delays in campaign email deliveries. We have observed that the campaigns are publishing as expected, so there is no need to republish them. Another update within 1 hour.
  • Time: April 22, 2024, 4:53 p.m.
    Status: Investigating
    Update: We are currently investigating reports of delayed campaign email deliveries and associated reporting. We will provide you with an update within 1 hour.
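
The automated backstop described in the postmortem above (catch campaigns scheduled for an earlier time that never started, then publish them) could be sketched roughly as follows. This is an illustration only, not Firstup's implementation: the Campaign record, its field names, the five-minute grace period, and the publish callback are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Callable, Iterable, List, Optional


@dataclass
class Campaign:
    # Hypothetical record; field names are illustrative, not Firstup's schema.
    campaign_id: str
    scheduled_at: datetime
    started_at: Optional[datetime] = None


def find_overdue(campaigns: Iterable[Campaign], now: datetime,
                 grace: timedelta = timedelta(minutes=5)) -> List[Campaign]:
    """Return campaigns scheduled more than `grace` ago that never started."""
    return [c for c in campaigns
            if c.started_at is None and now - c.scheduled_at > grace]


def backstop_sweep(campaigns: Iterable[Campaign], now: datetime,
                   publish: Callable[[Campaign], None]) -> List[str]:
    """Publish every overdue campaign and return the ids that were published."""
    published = []
    for campaign in find_overdue(campaigns, now):
        publish(campaign)
        campaign.started_at = now  # mark as started so a later sweep skips it
        published.append(campaign.campaign_id)
    return published
```

Run periodically, a sweep like this is idempotent: a campaign published by one pass is marked as started and ignored by the next. The grace period (an assumed value here) keeps the sweep from racing the normal scheduler.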

Updates:

  • Time: June 7, 2024, 8:08 p.m.
    Status: Postmortem
    Update:
      Summary: On Tuesday, April 16th, 2024, from approximately 9:54 AM UTC to 11:09 AM UTC, EU Studio experienced multiple service disruptions, including general slowness when loading Studio functions, login issues, and HTTP 500 system error messages. It was identified that a number of backend services were experiencing TCP (Transmission Control Protocol) networking issues that manifested as a variety of user-visible errors and unpredictable product interactions.
      Impact: Affected users were unable to log in to Studio, and experienced general slowness and system error messages such as “504 Gateway Timeout” or “502 Bad Gateway” due to the backend services having network errors.
      Root Cause: The root cause was determined to be an unexpected spike in traffic, which caused the number of nodes (worker machines) to increase rapidly to handle the additional workload. This led to DNS (Domain Name System) request timeouts, because the added nodes pushed inbound DNS traffic beyond the overall available capacity.
      Mitigation: The immediate problem was mitigated by increasing DNS capacity within the EU infrastructure and restarting the affected services, restoring system services and performance by 11:09 AM UTC.
      Recurrence Prevention: The following changes have been implemented to prevent unexpected loss of DNS service capacity.
        • An alert will now fire within the EU infrastructure any time the internal DNS capacity drops below the minimal viable threshold determined by Site Reliability Engineering.
        • Load testing has been performed to ensure scalability and an appropriate buffer for potential spikes and organic growth in DNS request volume.
  • Time: May 2, 2024, 3:56 p.m.
    Status: Resolved
    Update: Studio has remained fully accessible for EU communities following the applied fix. This platform service degradation is now resolved, and an RCA will be provided once a full incident postmortem has been completed.
  • Time: April 16, 2024, 11:11 a.m.
    Status: Monitoring
    Update: We have applied a fix for the issue affecting Studio on EU communities. We are continuing to monitor and will update once we have confirmed that the platform is stable.
  • Time: April 16, 2024, 10:51 a.m.
    Status: Investigating
    Update: We are continuing to investigate this issue affecting Studio for EU communities and working to restore service. We'll provide another update within the next 30 minutes.
  • Time: April 16, 2024, 10:04 a.m.
    Status: Investigating
    Update: We are investigating a service disruption affecting Studio for EU communities. These appear to be intermittent issues causing some users to be unable to log in to Studio, or to experience slowness or timeouts. Our next update will be in 30 minutes.
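
The root cause in the postmortem above is a sizing problem: DNS request volume grew with the node count while DNS capacity did not. A back-of-the-envelope version of that arithmetic is sketched below. Every number and parameter name here (queries per node, queries per replica, headroom, minimum replica count) is a hypothetical input chosen for illustration, not a figure from Firstup.

```python
import math


def required_dns_replicas(nodes: int, qps_per_node: float,
                          qps_per_replica: float, headroom: float = 0.5,
                          minimum: int = 2) -> int:
    """Estimate how many DNS replicas are needed for a given node count.

    Demand is nodes * qps_per_node, inflated by `headroom` (0.5 = 50% buffer
    for spikes), divided by what one replica can serve, rounded up.
    """
    if qps_per_replica <= 0:
        raise ValueError("qps_per_replica must be positive")
    demand = nodes * qps_per_node * (1.0 + headroom)
    return max(minimum, math.ceil(demand / qps_per_replica))
```

With these assumed inputs, 100 nodes at 50 queries per second each, a 50% buffer, and 1000 queries per second per replica work out to 8 replicas; doubling the node count during a traffic spike doubles the demand, which is why capacity that was adequate before the spike can be exceeded.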

Updates:

  • Time: April 10, 2024, 7:46 p.m.
    Status: Postmortem
    Update:
      Summary: On March 15th, 2024, we started receiving reports that scheduled campaigns were delayed in publishing or did not publish at all at the scheduled time.
      Impact: The impact was restricted to campaigns on the Firstup platform scheduled to publish on March 15th, 2024, between 1:00 AM ET (05:00 UTC) and 8:04 PM ET (March 16th, 2024 - 00:04 UTC).
      Root Cause: The root cause was determined to be a regression in a software change to the “scheduled campaign callback service” that was deployed during our scheduled software release window the previous day. The regression caused a callback to the “scheduling service” (to publish a scheduled campaign at the scheduled time) to fail.
      Mitigation: A hotfix was deployed by 8:04 PM ET (March 16th, 2024 - 00:04 UTC) to address the software regression introduced in the campaign scheduling software. Any delayed scheduled campaigns were also manually published by the same time.
      Recurrence Prevention: The Incident Response Team has taken the following actions in an effort to prevent a recurrence of this incident:
        • Implemented additional pre-release regression testing around the “scheduling service”.
        • Documented the SQL rake task used to identify any failed/delayed scheduled campaigns in a runbook to aid in quickly mitigating any future similar incidents.
        • Created monitors to alert us on the first instance of a failed/delayed scheduled campaign, so that we can proactively get ahead of any campaign scheduling issue(s) and prevent similar platform-wide incidents.
  • Time: March 19, 2024, 7:58 p.m.
    Status: Resolved
    Update: This incident has been fully resolved and all components remain fully operational.
  • Time: March 16, 2024, 12:15 a.m.
    Status: Monitoring
    Update: A fix has been developed and deployed to mitigate this service degradation. We have also manually published any impacted scheduled campaigns to this point, if they were not duplicated or manually published by the customer. We will place the affected services under monitoring for now.
  • Time: March 15, 2024, 11:23 p.m.
    Status: Identified
    Update: We continue to work on a solution to the potential root cause of this service degradation. We have also manually published any impacted scheduled campaigns to this point, if they were not duplicated or manually published by the customer. Another update will be provided within 1 hour.
  • Time: March 15, 2024, 10:02 p.m.
    Status: Identified
    Update: We continue to work on a solution to the potential root cause of this service degradation. We have also manually published any impacted scheduled campaigns to this point, if they were not duplicated or manually published by the customer. Another update will be provided within 1 hour.
  • Time: March 15, 2024, 9:22 p.m.
    Status: Identified
    Update: We have identified a potential backend issue that may be the root cause of this service degradation, and are working to resolve it. Another update will be provided within 1 hour.
  • Time: March 15, 2024, 9:15 p.m.
    Status: Investigating
    Update: We continue to investigate the cause of this service degradation, and will provide another update within 1 hour.
  • Time: March 15, 2024, 8:19 p.m.
    Status: Investigating
    Update: We are currently investigating reports where some scheduled campaigns did not publish at the scheduled time or were delayed in publishing. We will provide an update within 1 hour.

Updates:

  • Time: March 21, 2024, 4:09 p.m.
    Status: Postmortem
    Update:
      Summary: On February 28th, 2024, starting at around 1:11 PM PT (18:11 UTC), we started receiving reports that some users had not received an email from a scheduled campaign, and subsequently additional reports on February 29th, 2024, where some scheduled campaigns were still showing in the scheduled folder in Studio past their scheduled publish time.
      Impact: Impact was primarily related to campaigns that were scheduled to publish between 02.28.2024 at 11:16 AM ET and 02.29.2024 at 1:06 PM ET.
      Root Cause: The root cause was determined to be memory exhaustion in our core database on 02.28.2024 at 11:16 AM ET, which triggered an automatic database failover by AWS infrastructure failure service. Post-failover, dependent services that manage scheduled campaigns did not automatically reconnect to the failover database, and therefore could not initiate a “publish” event for scheduled campaigns at the scheduled time.
      Mitigation: The immediate problem was mitigated by querying the database for past-due scheduled campaigns and manually publishing them. Additionally, the services responsible for scheduled campaigns were manually restarted to establish connections to the failover database, in effect allowing them to initiate “publish” events for scheduled campaigns as expected.
      Recurrence Prevention: An incident response team post-mortem meeting revealed the following as recurrence prevention measures to be taken:
        • Removal of SQL comments to reduce database memory consumption.
        • Increase database instance size by upgrading the Postgres version.
        • Improve monitoring and alerting on database connections and memory usage using dedicated dashboards that include links to runbooks and mitigation instructions.
        • Fix failover and error handling in the affected services.
  • Time: March 7, 2024, 5:35 p.m.
    Status: Resolved
    Update: This service degradation is now considered resolved, and all impacted services have remained available and stable.
  • Time: Feb. 29, 2024, 6:08 p.m.
    Status: Monitoring
    Update: We have identified a potential issue that caused some scheduled campaigns not to publish at the scheduled time. This only affected campaigns that were scheduled at a specific moment in time, and those campaigns have been manually published. Any campaigns scheduled to publish from now on should not experience any issues and should publish at the scheduled time. We will provide additional details in our postmortem for this service degradation. This incident is now considered mitigated.
  • Time: Feb. 29, 2024, 5:31 p.m.
    Status: Investigating
    Update: We continue to investigate the cause of this service degradation, and will provide another update within 1 hour.
  • Time: Feb. 29, 2024, 4:34 p.m.
    Status: Investigating
    Update: We have manually published any scheduled campaigns if they were scheduled on or after 2/28/2024, but did not publish at the expected time. We continue to investigate the cause of this service degradation, and will provide another update within 1 hour.
  • Time: Feb. 29, 2024, 4:11 p.m.
    Status: Investigating
    Update: We are currently investigating reports where some scheduled campaigns did not publish at the scheduled time. We will provide an update within 1 hour.
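
The failure mode in the postmortem above, services that kept using a dead connection after the database failed over, is commonly handled by opening a fresh connection when an operation fails. The sketch below shows that general pattern under stated assumptions: connect and operation are hypothetical callables supplied by the caller, and ConnectionError stands in for whatever exception a real database driver raises. It is not a description of the fix Firstup shipped.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def run_with_reconnect(connect: Callable[[], object],
                       operation: Callable[[object], T],
                       attempts: int = 3,
                       delay_seconds: float = 0.0) -> T:
    """Run `operation` on a connection, reconnecting after a ConnectionError.

    Calling `connect` again resolves the database endpoint afresh, so after a
    failover the retry reaches the new primary instead of the old one.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    conn = connect()
    last_error: Optional[ConnectionError] = None
    for attempt in range(attempts):
        try:
            return operation(conn)
        except ConnectionError as exc:
            last_error = exc
            if attempt < attempts - 1:
                time.sleep(delay_seconds)
                conn = connect()  # discard the stale connection
    assert last_error is not None
    raise last_error
```

The design choice is to reconnect inside the service rather than rely on a manual restart, which is what the mitigation in this incident required.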

Updates:

  • Time: March 15, 2024, 5:41 p.m.
    Status: Postmortem
    Update:
      Summary: On February 15th, 2024, beginning at approximately 5:50 AM PT (13:50 UTC), we started receiving reports of several platform services being unavailable, including Microapps and Partner APIs. Errors persisted intermittently for just over an hour, primarily for these two services as well as any new user requests that required IP address resolution through an authoritative DNS (Domain Name System) server.
      Impact: The impact was primarily related to services that have very low TTL (time to live) thresholds for DNS, and to new end-user requests that required a new DNS lookup first. Observed error conditions included request timeouts and HTTP 500 gateway errors. Multiple services were in scope of the platform incident, and availability would have depended on whether the service IP had been cached locally or whether the DNS request could be serviced within the lower level of available capacity.
      Root Cause: The root cause was determined to be an unexpected drop in overall DNS service capacity. An earlier planned maintenance regressed a prior performance improvement, which reduced the number of Core DNS services running in production and thus limited the overall available capacity for inbound DNS traffic.
      Mitigation: The immediate problem was mitigated by restoring Core DNS capacity as soon as the discrepancy was discovered by the incident response team at 6:30 AM PT (14:30 UTC). Remaining error rates began to improve markedly by 6:45 AM PT (14:45 UTC), and all services were confirmed to be fully stabilized by 7:15 AM PT (15:15 UTC).
      Recurrence Prevention: A technical team postmortem meeting reviewed the change management process that allowed an errant default setting for the number of DNS nodes to be pushed to production, how to improve platform alert visibility of this condition in the future, and how to prevent unexpected loss of DNS service capacity. The following changes have since been instituted:
        • An alert will now fire any time the core DNS capacity drops below the minimal viable threshold determined by Site Reliability Engineering.
        • All core service nodes will now launch with an attached DNS service component automatically.
        • Load testing has been performed to ensure scalability and an appropriate buffer for potential spikes and organic growth in DNS request volume.
        • Updated infrastructure change management to ensure that any future configuration changes persist following service restarts.
  • Time: Feb. 20, 2024, 5:47 p.m.
    Status: Resolved
    Update: This incident has been resolved.
  • Time: Feb. 15, 2024, 3:21 p.m.
    Status: Monitoring
    Update: All affected services have now been restored and are confirmed as available. We will now be placing these services under monitoring.
  • Time: Feb. 15, 2024, 2:48 p.m.
    Status: Identified
    Update: We identified a potential issue with the capacity of our core DNS. We have increased the number of pods to serve the traffic level, and are seeing indications that performance is recovering. Another update will be provided within 1 hour.
  • Time: Feb. 15, 2024, 2:28 p.m.
    Status: Investigating
    Update: We are investigating reports of Microapps and Partner APIs being unavailable. We will provide an update within 1 hour.
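
The first recurrence-prevention item in the postmortem above, an alert that fires when core DNS capacity drops below a minimal viable threshold, amounts to a simple comparison. The sketch below is a hedged illustration: the DnsCapacity record, its field names, and the message format are assumptions, and a real deployment would read the replica count from its orchestration or monitoring system rather than from a hand-built object.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DnsCapacity:
    # Hypothetical snapshot of the DNS tier; not an actual Firstup metric.
    ready_replicas: int
    min_viable_replicas: int


def capacity_alert(capacity: DnsCapacity) -> Optional[str]:
    """Return an alert message if ready replicas are below the threshold."""
    if capacity.ready_replicas < capacity.min_viable_replicas:
        return (f"DNS capacity low: {capacity.ready_replicas} ready, "
                f"need at least {capacity.min_viable_replicas}")
    return None
```

A check like this would have flagged the errant default replica count as soon as the maintenance change took effect, rather than when user-facing errors were reported.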

Check the status of similar companies and alternatives to Firstup

  • Akamai: Systems Active
  • Nutanix: Systems Active
  • MongoDB: Systems Active
  • LogicMonitor: Systems Active
  • Acquia: Systems Active
  • Granicus System: Systems Active
  • CareCloud: Systems Active
  • Redis: Systems Active
  • integrator.io: Systems Active
  • NinjaOne Trust: Systems Active
  • Pantheon Operations: Systems Active
  • Securiti US: Systems Active

Frequently Asked Questions - Firstup

Is there a Firstup outage?
The current status of Firstup is: Systems Active
Where can I find the official status page of Firstup?
The official status page for Firstup is here
How can I get notified if Firstup is down or experiencing an outage?
To get notified of any status changes to Firstup, simply sign up to OutLogger's free monitoring service. OutLogger checks the official status of Firstup every few minutes and will notify you of any changes. You can view the status of all your cloud vendors in one dashboard.