Get notified about any outages, downtime or incidents for Firstup and 1800+ other cloud vendors. Monitor 10 companies, for free.
Outage and incident data over the last 30 days for Firstup.
View the latest incidents for Firstup and check for official updates:
Description: On 06/14/2023, from 8:00 AM PT to 3:15 PM PT, campaigns queued in the system, and our Firstup alert monitoring system promptly notified our engineering team. The team identified a spike in our campaign engine planner caused by a data task aimed at improving content rendering, which inadvertently led to delays across the platform. To minimize future impact, we have implemented safeguards and policies, including running these tasks during off-peak hours. Regular monitoring, testing, and documentation will be conducted to prevent similar issues and ensure optimal platform performance.
Status: Postmortem
Impact: None | Started At: June 14, 2023, 5 p.m.
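The off-peak safeguard mentioned in this postmortem can be pictured as a simple time gate in front of heavy data tasks. The sketch below is only an illustration: the window boundaries (`OFF_PEAK_START`, `OFF_PEAK_END`) and the helper names are hypothetical, since the postmortem does not state the actual hours or how the policy is enforced.

```python
from datetime import datetime, time

# Hypothetical off-peak window; the real hours are not given in the postmortem.
OFF_PEAK_START = time(22, 0)  # 10:00 PM local time
OFF_PEAK_END = time(5, 0)     # 5:00 AM local time


def is_off_peak(now: datetime, start: time = OFF_PEAK_START, end: time = OFF_PEAK_END) -> bool:
    """Return True if `now` (assumed to be platform-local time) is inside the window."""
    current = now.time()
    if start <= end:
        # Window does not cross midnight, e.g. 01:00 to 05:00.
        return start <= current < end
    # Window crosses midnight, e.g. 22:00 to 05:00.
    return current >= start or current < end


def run_if_off_peak(task, now: datetime) -> bool:
    """Run `task` only inside the off-peak window; report whether it ran."""
    if not is_off_peak(now):
        return False
    task()
    return True
```

A scheduler could call `run_if_off_peak` on each tick and leave the task pending until the window opens.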
Description: Our upstream vendor, AWS, has provided the following root cause analysis for the event on their platform that caused multiple dependent services to become degraded or unavailable on the Firstup platform on Tuesday, June 13th 2023. More specifically, the Firstup platform leverages the AWS Lambda service to perform routine functions that include:
* Scanning attachments for viruses
* Finding program logins using community codes
* Calculating user activity metrics
* Processing user sync files

Additionally, we observed that the AWS MediaConvert service was impaired during this time, which impacted Firstup customers' ability to upload video files. This vendor-specific RCA is included inline below for reference purposes only:

_“Between 11:49 AM PDT and 3:37 PM PDT, we experienced increased error rates and latencies for multiple AWS Services in the US-EAST-1 Region. Our engineering teams were immediately engaged and began investigating. We quickly narrowed down the root cause to be an issue with a subsystem responsible for capacity management for AWS Lambda, which caused errors directly for customers (including through API Gateway) and indirectly through the use of other AWS services. Additionally, customers may have experienced authentication or sign-in errors when using the AWS Management Console, or authenticating through Cognito or IAM STS. Customers may also have experienced issues when attempting to initiate a Call or Chat to AWS Support. As of 2:47 PM PDT, the issue initiating calls and chats to AWS Support was resolved. By 1:41 PM PDT, the underlying issue with the subsystem responsible for AWS Lambda was resolved. At that time, we began processing the backlog of asynchronous Lambda invocations that accumulated during the event, including invocations from other AWS services. As of 3:37 PM PDT, the backlog was fully processed. The issue has been resolved and all AWS Services are operating normally.”_ Please see [here](https://health.aws.amazon.com/health/status) for more information.
Status: Postmortem
Impact: None | Started At: June 13, 2023, 8:12 p.m.
Description: On 6/1/2023 at 10:55 AM PST, the Firstup platform saw an increase in commenting on posts, which put stress on our database. The sudden influx of comments automatically triggered a safety mechanism that blocked additional comments from being posted. This mechanism was put in place after the platform incident on 2/2/2023 to reduce extraneous load on the database caching service and to prevent impact from spreading to other (non-commenting-related) microservices. We have since implemented a performance enhancement that removes a bottleneck in the comment submission process.
Status: Postmortem
Impact: None | Started At: June 1, 2023, 6:06 p.m.
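A safety mechanism of the kind described in this postmortem is often built as a rate limiter that rejects new writes once recent volume crosses a threshold. The sketch below uses a sliding-window counter as one possible shape. It is not Firstup's implementation: the class name `CommentLoadGuard`, the threshold, and the window length are all hypothetical.

```python
import time
from collections import deque


class CommentLoadGuard:
    """Sliding-window limiter: accept at most `max_comments` per `window_seconds`.

    Hypothetical illustration only; names and limits are assumptions.
    """

    def __init__(self, max_comments: int, window_seconds: float, clock=time.monotonic):
        self.max_comments = max_comments
        self.window_seconds = window_seconds
        self._clock = clock
        self._accepted = deque()  # timestamps of recently accepted comments

    def try_accept(self) -> bool:
        """Return True if a new comment may be posted now, False if it is blocked."""
        now = self._clock()
        # Forget comments that have aged out of the window.
        while self._accepted and now - self._accepted[0] >= self.window_seconds:
            self._accepted.popleft()
        if len(self._accepted) >= self.max_comments:
            return False  # too many recent comments: shed load
        self._accepted.append(now)
        return True
```

For example, `CommentLoadGuard(max_comments=500, window_seconds=1.0)` would start rejecting comments once 500 have been accepted within the last second, and would accept them again as older entries age out.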
Description: On May 31st 2023, at 10:31 AM Pacific Time, a configuration change intended to add flexibility in video display scaling changed the naming convention of the stored video assets, which made prior videos unplayable for Safari and iOS users. On June 1st 2023, at 1:21 PM Pacific Time, a fix was developed and deployed, and prior videos were backfilled by 2:10 PM Pacific Time to use the correct naming convention, allowing all videos to resume playing as expected. Our QA testing has been revamped to include testing video assets and their use cases in the platform before any configuration change is deployed, in an effort to prevent this scenario from happening again.
Status: Postmortem
Impact: None | Started At: June 1, 2023, 4:32 p.m.
Description: On May 22nd 2023, one of our caching servers received an unexpected amount of load. This impacted the services that depend on it, and the impact cascaded to other services as well. As a mitigation step, we added a circuit breaker to limit the scope of impact on dependent services, and we increased the memory in our caching server to promptly restore services. As a long-term preventative measure, we have increased the time-to-live for cached data, which reduces the frequency of calls to the caching server and hence the load on the server.
Status: Postmortem
Impact: None | Started At: May 22, 2023, 4:35 p.m.
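The two measures named in this postmortem, a circuit breaker and a longer time-to-live, can be sketched in a few lines each. The code below is a minimal illustration under stated assumptions, not Firstup's implementation: `CircuitBreaker`, `TTLCache`, the thresholds, and the `loader` callback (standing in for a call to the caching server) are all hypothetical.

```python
import time


class CircuitOpenError(Exception):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    """Stop calling a failing dependency for `reset_after` seconds (hypothetical sketch)."""

    def __init__(self, failure_threshold=3, reset_after=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after
        self._clock = clock
        self._failures = 0
        self._opened_at = None  # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_after:
                raise CircuitOpenError("dependency calls are temporarily blocked")
            # Cool-down elapsed: fall through and allow one trial call.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._failures += 1
            if self._failures >= self.failure_threshold:
                self._opened_at = self._clock()  # open (or re-open) the circuit
            raise
        # Success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result


class TTLCache:
    """Keep values for `ttl_seconds`; call `loader` only on a miss or expiry."""

    def __init__(self, ttl_seconds, loader, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self._loader = loader  # stands in for a request to the caching server
        self._clock = clock
        self._entries = {}     # key -> (expires_at, value)
        self.loads = 0         # number of loader calls, for illustration

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now < entry[0]:
            return entry[1]
        value = self._loader(key)
        self.loads += 1
        self._entries[key] = (now + self.ttl_seconds, value)
        return value
```

With one read per second over a minute, a 10-second TTL in this sketch triggers six loader calls while a 30-second TTL triggers two, which is the load reduction the postmortem describes. The trade-off is that data can be stale for longer.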