Is there a Firstup outage?

Firstup status: Systems Active

Last checked: 4 minutes ago

Get notified about any outages, downtime or incidents for Firstup and 1800+ other cloud vendors. Monitor 10 companies, for free.


Firstup outages and incidents

Outage and incident data over the last 30 days for Firstup.

There have been 0 outages or incidents for Firstup in the last 30 days.


Tired of searching for status updates?

Join OutLogger to be notified when any of your vendors or the components you use experience an outage. It's completely free and takes less than 2 minutes!


Components and Services Monitored for Firstup

OutLogger tracks the status of these components for Firstup:

Component Status

Latest Firstup outages and incidents.

View the latest incidents for Firstup and check for official updates:

Updates:

  • Time: Oct. 9, 2024, 2:47 p.m.
    Status: Postmortem
    Update:
      Summary: On September 30th, 2024, beginning at approximately 1:24 PM PDT (20:24 UTC), we started receiving reports of Shortcuts intermittently being unavailable and the Assistant returning an error in the Employee Experience. A platform incident was declared at 2:36 PM PDT (21:36 UTC) after initial investigations revealed the issue to be platform-wide.
      Severity: Sev2
      Scope: Any user on the US platform accessing the Web or Mobile Experiences intermittently experienced missing Shortcuts and/or received an error message while accessing the Assistant. A refresh of the Employee Experience page occasionally restored these endpoints. All other services in the Employee Experience remained available and functional.
      Impact: Shortcuts and the Assistant endpoints in the Employee Experience were intermittently unavailable during the incident.
      Root Cause: The root cause was determined to be an uncharacteristically high number of new user integrations introduced within a short period of time, which exacerbated a newly uncovered non-optimized content caching behavior. This caused downstream latency and increased error rates served by the web service responsible for rendering shortcuts and the assistant notification page.
      Mitigation: The immediate impact was mitigated by restarting the Employee Experience integrations API, and services were restored by 2:42 PM PDT (21:42 UTC). While investigations into the root cause continued, the incident recurred the following day, October 1st, 2024, at 12:54 PM PDT (19:54 UTC). The Employee Experience integrations API and the dependent Employee Experience user-integrations request processing service (Pythia) were restarted, restoring Shortcuts and the Assistant endpoints by 1:46 PM PDT (20:46 UTC). Cache resources for Pythia were increased to mitigate the observed latency.
      Recurrence Prevention: To prevent this incident from recurring, our engineering incident response team:
        * Has developed a fix to optimize how user-integrations requests use the cache to reduce memory consumption and eliminate latency.
        * This fix will be released during our scheduled Software Release maintenance window on October 15th, 2024.
        * Will be adding a monitoring and alerting dashboard for the Employee Experience user-integrations requests processing service (Pythia).
  • Time: Oct. 9, 2024, 2:47 p.m.
    Status: Resolved
    Update: Employee Experience Shortcuts and the Assistant have remained available and fully functional throughout the monitoring phase of this incident. This incident is now resolved.
  • Time: Oct. 1, 2024, 9:19 p.m.
    Status: Monitoring
    Update: We have restarted the offending backend service to restore the affected functionality. Shortcuts and the Assistant are now available. We will place these services back under monitoring for now.
  • Time: Oct. 1, 2024, 8:28 p.m.
    Status: Investigating
    Update: We are currently investigating a recurrence of this issue. We will provide you with an update in 1 hour.
  • Time: Sept. 30, 2024, 9:55 p.m.
    Status: Monitoring
    Update: We have identified and restarted the offending backend services to restore the affected services. Shortcuts and the Assistant are now available. We will place these services under monitoring for now.
  • Time: Sept. 30, 2024, 9:36 p.m.
    Status: Investigating
    Update: We are currently investigating reports of shortcuts in the Employee Experience intermittently being unavailable, as well as an error message being returned while trying to access the assistant. We will provide you with an update in 1 hour.
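
The postmortem for this incident mentions adding monitoring and alerting for the user-integrations request processing service. As a rough illustration only, a sliding-window error-rate check could look like the sketch below. The class name, window length, threshold and minimum sample size are all hypothetical and are not taken from Firstup's tooling.

```python
from collections import deque


class ErrorRateMonitor:
    """Tracks request outcomes in a sliding time window (hypothetical sketch)."""

    def __init__(self, window_seconds=60.0, threshold=0.05, min_requests=20):
        self.window_seconds = window_seconds  # how far back to look
        self.threshold = threshold            # alert at or above this error fraction
        self.min_requests = min_requests      # ignore windows with too few samples
        self._events = deque()                # (timestamp, is_error), oldest first

    def record(self, timestamp, is_error):
        """Store one request outcome and drop events older than the window."""
        self._events.append((timestamp, is_error))
        self._evict(timestamp)

    def _evict(self, now):
        cutoff = now - self.window_seconds
        while self._events and self._events[0][0] < cutoff:
            self._events.popleft()

    def error_rate(self, now):
        """Fraction of requests in the current window that failed."""
        self._evict(now)
        if not self._events:
            return 0.0
        errors = sum(1 for _, is_error in self._events if is_error)
        return errors / len(self._events)

    def should_alert(self, now):
        """True when there are enough samples and the error rate is too high."""
        self._evict(now)
        if len(self._events) < self.min_requests:
            return False
        return self.error_rate(now) >= self.threshold
```

Timestamps are passed in explicitly so the check can be exercised without waiting on a real clock. The minimum sample size keeps a single failed request during a quiet period from triggering an alert.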

Updates:

  • Time: Sept. 18, 2024, 12:43 a.m.
    Status: Postmortem
    Update:
      Summary: On September 16th, 2024, starting at around 11:00 AM PDT, we started receiving customer reports stating that the Web and Mobile Experiences endpoints were unavailable. Following a correlation of these reports and system monitors, a platform incident was declared at 11:14 AM PDT.
      Severity: Sev1
      Scope: Any user on the US platform attempting to access the Web and Mobile Experiences intermittently received an error message, and the Employee Experience failed to load.
      Impact: The core Web and Mobile Experiences platform endpoints were intermittently unavailable for the duration of the incident (1hr 38mins).
      Root Cause: The root cause was determined to be an exhaustion of the available database connections due to a sudden burst of user engagement activity that correlated to a small number of high-visibility campaigns. At 10:50 AM PDT, a dependent back-end service entered a crash loop back-off state because its database connection requests were being refused, and it returned the error message to end users.
      Mitigation: The immediate problem was mitigated by fully redeploying the Employee Experience microservice after initial attempts at more surgical, standardized mitigation maneuvers proved ineffective. Earlier maneuvers focused on reducing database load by temporarily disabling platform features and functionality that make heavy use of database transactions, which reduced error rates overall but did not eliminate customer impact. Web and Mobile Experience availability was restored by 12:28 PM PDT.
      Recurrence Prevention: To prevent this incident from recurring, our engineering incident response team has:
        * Increased the available database connections by 40% to account for any unforeseen spikes in platform traffic.
        * Added circuit breakers that would intercept abnormal increases in platform traffic, thereby maintaining platform endpoints availability.
        * Added an additional incident mitigation maneuver to disable campaign reactions such that a full-service redeploy would not be required to restore platform availability.
  • Time: Sept. 18, 2024, 12:43 a.m.
    Status: Resolved
    Update: All affected endpoints have remained stable and available. This incident is now resolved.
  • Time: Sept. 17, 2024, 6:09 p.m.
    Status: Monitoring
    Update: We are continuing to monitor for any further issues.
  • Time: Sept. 16, 2024, 9:38 p.m.
    Status: Monitoring
    Update: The unplanned performance enhancement maintenance to the Firstup cloud infrastructure is now completed. All services are now available and fully functional. Please notify our Customer Support team if you experience any issues with Firstup services following this notice.
  • Time: Sept. 16, 2024, 8:56 p.m.
    Status: Monitoring
    Update: Today at 2:30 PM PT / 9:30 PM UTC we will be performing unplanned maintenance to shore up Firstup cloud infrastructure as a preventative measure based on technical troubleshooting done since the incident was initially mitigated earlier today. This change may result in a service disruption lasting from a few seconds to several minutes as the changes take effect. We expect to be in a much more stable state as root cause troubleshooting continues following the completion of the maintenance.
  • Time: Sept. 16, 2024, 7:41 p.m.
    Status: Monitoring
    Update: Web and Mobile Experiences have now been restored. We will be placing the offending services under monitoring for now.
  • Time: Sept. 16, 2024, 7:27 p.m.
    Status: Identified
    Update: We are continuing to work on a fix for this issue.
  • Time: Sept. 16, 2024, 7:26 p.m.
    Status: Identified
    Update: We continue to work on relieving the pressure on database resources, and the current user experience is intermittent and partial access to the Employee Experience (on both desktop and mobile EE). Another update in 30 minutes.
  • Time: Sept. 16, 2024, 6:55 p.m.
    Status: Identified
    Update: We are working on relieving pressure on database resources to restore services. Another update in 30 minutes.
  • Time: Sept. 16, 2024, 6:31 p.m.
    Status: Identified
    Update: We have identified a potential cause of this service outage, and are working to restore services. Another update in 30 minutes.
  • Time: Sept. 16, 2024, 6:14 p.m.
    Status: Investigating
    Update: We are currently investigating reports of the US Web Experience being unavailable. Studio remains available.
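
The recurrence prevention for this incident mentions circuit breakers. The postmortem does not describe how they are implemented, so the following is only a minimal sketch of the general pattern: after repeated failures, calls are rejected immediately for a cooling-off period instead of piling more load onto a struggling dependency. `CircuitBreaker`, `CircuitOpenError` and the threshold values are hypothetical names and numbers chosen for illustration.

```python
import time


class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    """Minimal circuit breaker sketch; not Firstup's implementation."""

    def __init__(self, failure_threshold=5, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold  # consecutive failures before opening
        self.reset_timeout = reset_timeout          # seconds to stay open
        self._clock = clock
        self._failures = 0
        self._opened_at = None                      # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_timeout:
                # Fail fast without touching the protected dependency.
                raise CircuitOpenError("circuit open; request rejected")
            # Timeout elapsed: half-open, let one trial call through.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._failures += 1
            # A failed trial call, or too many failures in a row, (re)opens the circuit.
            if self._opened_at is not None or self._failures >= self.failure_threshold:
                self._opened_at = self._clock()
            raise
        # Success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

A caller would wrap each database-backed request in `breaker.call(...)` and handle `CircuitOpenError` by returning a degraded response, which keeps refused connections from cascading into a crash loop.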

Updates:

  • Time: Sept. 9, 2024, 9:38 p.m.
    Status: Postmortem
    Update:
      Summary: On September 4th, 2024, starting at 4:27 AM EDT, reports of Studio users unable to view or edit campaigns in Studio were received. Following a correlation of customer reports and initial troubleshooting, a platform service degradation incident was declared at 9:21 AM EDT, and published on our Status Page at 9:41 AM EDT.
      Scope: The scope of this service degradation was restricted to Studio users with multiple audiences assigned to them.
      Impact: Studio users who had multiple audiences assigned to them were unable to view or edit campaigns for the duration of this incident (12hrs 46mins). No scheduled campaigns were affected, and campaign viewing, editing, and publishing processes were not inhibited for users with no audiences or a single audience assigned.
      Root Cause: The root cause of this incident was determined to be a regression introduced by a misconfigured platform enhancement policy change intended to improve the efficiency of how user-assigned audiences were queried, which had been released at 12:07 AM EDT as part of the scheduled software release maintenance the same day.
      Mitigation: A rollback of the offending policy change was performed and completed by 12:53 PM EDT to restore access to Studio campaigns.
      Recurrence Prevention: To prevent this incident from recurring, we will perform the following actions before releasing the platform enhancement policy change in the future:
        * Review and correct any misconfiguration on the platform enhancement policy change code.
        * Add more unit test cases to cover multiple audiences on the modified queries.
        * Add more regression test cases to cover users with multiple audiences.
  • Time: Sept. 9, 2024, 9:38 p.m.
    Status: Resolved
    Update: This incident has been resolved.
  • Time: Sept. 4, 2024, 5:01 p.m.
    Status: Monitoring
    Update: The proposed fix has been successfully deployed in the production environment. Please notify our Customer Support team if any issues persist. We will now place the affected services under monitoring.
  • Time: Sept. 4, 2024, 4:09 p.m.
    Status: Identified
    Update: We are currently deploying the proposed fix in the production environment and will provide another update once this is completed.
  • Time: Sept. 4, 2024, 3:10 p.m.
    Status: Identified
    Update: We are currently validating the proposed fix for this issue in a staging environment and will deploy it in the production environment upon successful testing. Another update within 1 hour.
  • Time: Sept. 4, 2024, 2:21 p.m.
    Status: Identified
    Update: We have identified a potential cause of the service disruption, and are working on mitigating this issue. Another update within 1 hour.
  • Time: Sept. 4, 2024, 1:41 p.m.
    Status: Investigating
    Update: We are currently investigating reports where some users are unable to view or edit campaigns they have access to. We will provide you with an update within 1 hour.

Updates:

  • Time: Sept. 5, 2024, 10:38 p.m.
    Status: Postmortem
    Update:
      Summary: On August 28th, 2024, at 1:03 PM EDT, system monitors alerted us to failing database health checks, and our engineering team immediately started investigating these alerts. Customer reports of core platform endpoints being unresponsive and/or returning error messages were received beginning at 1:12 PM EDT, and a platform incident was declared at 1:21 PM EDT.
      Scope: Any user on the US platform attempting to access or navigate through the Web and Mobile Experience, as well as Studio, was impacted by this incident.
      Impact: Core US platform endpoints such as Web and Mobile Experiences, as well as Studio, were slow to load or became intermittently unavailable for the duration of the incident (48 minutes).
      Root Cause: The root cause was determined to be a slow-running query for “user unread posts” that saw a huge spike in traffic following a campaign that was published to a large audience. As a result, the database CPU spiked and the database stopped taking new connection requests, causing new Web and Mobile Experience requests, as well as Studio requests, to fail and the system to appear unresponsive.
      Mitigation: The immediate problem was mitigated by halving the number of pods submitting requests to the database to alleviate the load on it, which restored database responsiveness and platform endpoints availability by 1:51 PM EDT.
      Recurrence Prevention: To prevent this incident from recurring, our engineering incident response team has optimized the offending “slow-running” query to perform 2x faster, thereby reducing the required database CPU resources. We are also working on implementing circuit breakers on the offending downstream services from the database, to prevent database CPU overutilization and ensure platform endpoints availability.
  • Time: Sept. 5, 2024, 10:37 p.m.
    Status: Resolved
    Update: This incident is now resolved.
  • Time: Aug. 28, 2024, 8:06 p.m.
    Status: Monitoring
    Update: Moving the platform incident into a monitoring state. There has been no further recurrence of the service disruption to the web experience endpoint. A software hot fix has been deployed and verified. This fix is intended to address the suspected root cause of a non-optimal database query that resulted in unresponsiveness and 500 error responses observed by users prior to the incident being mitigated. All components remain fully operational.
  • Time: Aug. 28, 2024, 6:18 p.m.
    Status: Identified
    Update: We have identified the cause of this service outage, and are working on a fix. However, Web Experience and Studio continue to be available. Another update will be provided as more information is made available.
  • Time: Aug. 28, 2024, 5:51 p.m.
    Status: Investigating
    Update: As we continue investigating this incident, we have relieved some pressure on back-end resources and services to mitigate the issue, and Web Experience and Studio are now available. Another update in 30 minutes.
  • Time: Aug. 28, 2024, 5:21 p.m.
    Status: Investigating
    Update: We are currently investigating reports of the US Web Experience being unavailable. We will provide you with an update in 30 minutes.
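
The mitigation for this incident worked by cutting the number of pods sending requests to the database. An in-process analogue of the same idea, limiting how many requests may hit the database at once and shedding the rest, can be sketched as follows. This is an illustration of the general technique rather than what Firstup deployed; `ConcurrencyLimiter`, `OverloadedError` and the limit values are hypothetical.

```python
import threading
from contextlib import contextmanager


class OverloadedError(RuntimeError):
    """Raised when no database slot is free."""


class ConcurrencyLimiter:
    """Caps the number of in-flight database requests (hypothetical sketch)."""

    def __init__(self, max_in_flight):
        self._slots = threading.BoundedSemaphore(max_in_flight)

    @contextmanager
    def slot(self, timeout=0.0):
        """Hold one slot for the duration of a `with` block.

        With timeout == 0 the request is shed immediately when all slots
        are busy; otherwise the caller waits up to `timeout` seconds.
        """
        if timeout > 0:
            acquired = self._slots.acquire(timeout=timeout)
        else:
            acquired = self._slots.acquire(blocking=False)
        if not acquired:
            raise OverloadedError("too many in-flight database requests")
        try:
            yield
        finally:
            self._slots.release()
```

Wrapping each query in `with limiter.slot():` means a traffic spike turns into fast, explicit rejections instead of a growing queue of slow queries competing for database CPU.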

Updates:

  • Time: Aug. 30, 2024, 7:26 p.m.
    Status: Postmortem
    Update:
      Summary: From approximately 11:08 am - 11:38 am PT (18:08 - 18:38 UTC), Thursday August 22nd, both Studio and Web Experience were unavailable due to the release of Version 2 of Personalized Fields (PFV2), a new feature with the Q3 quarterly update that was more resource intensive than initially planned. This caused high CPU usage, increased query latency and database connection pool exhaustion.
      Impact: This incident primarily affected users who attempted to access Studio services and Web Experience between 11:08 am - 11:38 am PT. The issue manifested as the following errors on the frontend of the platform:
        * We’re sorry, but something went wrong.
        * 502 Bad Gateway.
        * There was an error processing your request. Please try again.
      Root Cause: The root cause was determined to be the release of Version 2 of Personalized Fields (PFV2), a new feature with the Q3 quarterly update that was more resource intensive than initially planned. The feature caused a significant increase in CPU usage and query latency on the shared database cluster, as well as database connection pool exhaustion. This resulted in the Studio/Web Experience service unavailability and error messages observed by impacted users.
      Mitigation: The immediate impact was mitigated by temporarily disabling the newly released feature that was causing excessive resource consumption. The cache Time-To-Live (TTL) was also changed from 1 minute to 3 hours to reduce load and stabilize performance. After service was restored, we conducted platform tuning and scaled up infrastructure outside business hours to accommodate the increased load with the introduction of this new feature.
      Recurrence Prevention: To prevent a recurrence of this incident, the following actions have been or are being implemented:
        * Load Testing and Analysis - More rigorous load testing and analysis to detect N+1 calls or latency spikes before a feature goes live.
        * Infrastructure Planning and Caching Strategy - Refactor the caching for the affected feature, including pre-warming caches in batches to prevent cache-miss cascades, and optimizing the infrastructure to handle increased load efficiently whilst only caching what is needed.
        * Remove custom attributes for blocked users who have been inactive for a specific period to reduce table size and improve query performance.
        * Feature Flagging and Gradual Rollouts - Future high-risk changes will be rolled out gradually, and improved resource and performance monitoring will be carried out before full deployment.
  • Time: Aug. 23, 2024, 3:35 p.m.
    Status: Resolved
    Update: A fix was put in place and the service disruption to the platform has been resolved. Thank you for your patience whilst we carried out our investigation. Please contact Support at support.firstup.io should you encounter any further issues.
  • Time: Aug. 22, 2024, 6:39 p.m.
    Status: Identified
    Update: We are continuing to work on a fix for this issue.
  • Time: Aug. 22, 2024, 6:39 p.m.
    Status: Identified
    Update: Both Studio and Web Experience are now available. The component which is causing the issue has been identified and steps have been taken to mitigate. Further updates to follow.
  • Time: Aug. 22, 2024, 6:27 p.m.
    Status: Investigating
    Update: We are urgently investigating an issue with intermittent unavailability of the platform being experienced for both Studio and Web Experience. Updates to follow asap.
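
The postmortem for this incident refers to two caching measures: raising the cache Time-To-Live from 1 minute to 3 hours, and pre-warming caches in batches to avoid cache-miss cascades. The sketch below shows how those two ideas fit together in general terms. It assumes a simple in-memory cache; `TTLCache`, its `loader` callback and the batch parameters are hypothetical and do not describe Firstup's actual cache.

```python
import time


class TTLCache:
    """In-memory cache whose entries expire after ttl_seconds (hypothetical sketch)."""

    def __init__(self, ttl_seconds, loader, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self._loader = loader      # called on a miss, e.g. a database query
        self._clock = clock
        self._entries = {}         # key -> (expires_at, value)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]        # fresh hit: no load on the backing store
        value = self._loader(key)  # miss or expired: reload and restamp
        self._entries[key] = (now + self.ttl_seconds, value)
        return value

    def prewarm(self, keys, batch_size=100, pause_seconds=0.0):
        """Load keys in small batches so warming does not itself cause a spike."""
        keys = list(keys)
        for start in range(0, len(keys), batch_size):
            for key in keys[start:start + batch_size]:
                self.get(key)
            if pause_seconds and start + batch_size < len(keys):
                time.sleep(pause_seconds)


# A longer TTL means far fewer reloads per key: 3 hours instead of 1 minute.
cache = TTLCache(ttl_seconds=3 * 60 * 60, loader=lambda key: key.upper())
cache.prewarm(["alpha", "beta", "gamma"], batch_size=2)
```

The trade-off of a long TTL is staleness: values can lag the source by up to the TTL, so it suits data that changes rarely.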

Check the status of similar companies and alternatives to Firstup

Akamai

Systems Active

Nutanix

Systems Active

MongoDB

Systems Active

LogicMonitor

Issues Detected

Acquia

Systems Active

Granicus System

Systems Active

CareCloud

Systems Active

Redis

Systems Active

integrator.io

Systems Active

NinjaOne Trust

Systems Active

Pantheon Operations

Systems Active

Securiti US

Systems Active

Frequently Asked Questions - Firstup

Is there a Firstup outage?
The current status of Firstup is: Systems Active
Where can I find the official status page of Firstup?
The official status page for Firstup is here
How can I get notified if Firstup is down or experiencing an outage?
To get notified of any status changes to Firstup, simply sign up to OutLogger's free monitoring service. OutLogger checks the official status of Firstup every few minutes and will notify you of any changes. You can view the status of all your cloud vendors in one dashboard. Sign up here