Outage and incident data over the last 30 days for Fathom Analytics.
OutLogger tracks the status of these components for Fathom Analytics:
| Component | Status |
|---|---|
| Analytics collection (ingest) | Active |
| API | Active |
| Application/dashboard | Active |
| Marketing website | Active |
View the latest incidents for Fathom Analytics and check for official updates:
Description: This incident has been resolved.
Status: Resolved
Impact: Minor | Started At: Oct. 23, 2023, 1:19 p.m.
Description: Thanks for your patience. Ingest remained online throughout (of course), and the dashboard and API are now back online. We'll leave a maintenance message up on the dashboard while we copy over a missing chunk of data to finalize the migration.
Status: Resolved
Impact: Maintenance | Started At: July 24, 2023, 4:46 a.m.
Description: Amazon Web Services' US-EAST-1 region was offline for around 1.5 hours today. Things were completely out of our hands and pageviews were lost. We were unable to do anything during the outage and had to wait for Amazon to fix things, because our us-east infrastructure is the core of everything we do (our EU isolation proxy is separate, but after it removes personal data it still hits us-east-1).

We run Fathom from multiple availability zones, and we pay a premium to do that, because we care about availability. Outside of a DDoS attack back in 2020, we've never seen downtime like this where everything (even pageview collection) was taken down. In this scenario, despite us having multiple availability zones set up, the entire region's Lambda (our compute) collapsed.

We've also identified inadequacies in our status page: we need to move towards automating updates the minute something goes offline, so you know that we're aware. In addition, this has brought service availability (something we obsess about) to the top of our priority list, and we will be making changes. We had planned to move Fathom's ingest to multiple regions later this year, and we've now bumped up that priority (a rough sketch of the idea follows this entry). We're so sorry for the outage, and we'll continue to invest in making our service even more resilient.
Status: Resolved
Impact: None | Started At: June 13, 2023, 4:44 p.m.
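As a rough, hypothetical sketch of the multi-region ingest idea mentioned above (this is not Fathom's actual code, and the endpoint URLs are made up): a forwarding layer can prefer a primary ingest region and fail over to a secondary one when the primary is unreachable, so a single-region outage is less likely to drop pageviews.

```typescript
// Hypothetical illustration only: endpoint names and timeout are assumptions.
const INGEST_ENDPOINTS = [
  "https://ingest-us-east-1.example.com/collect", // primary region (assumed)
  "https://ingest-eu-west-1.example.com/collect", // fallback region (assumed)
];

async function forwardPageview(payload: Record<string, unknown>): Promise<void> {
  let lastError: unknown;

  for (const endpoint of INGEST_ENDPOINTS) {
    try {
      const res = await fetch(endpoint, {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify(payload),
        // Fail fast so the fallback region is tried quickly.
        signal: AbortSignal.timeout(2_000),
      });
      if (res.ok) return;                                // accepted by this region
      lastError = new Error(`HTTP ${res.status} from ${endpoint}`);
    } catch (err) {
      lastError = err;                                   // timeout or network error: try next region
    }
  }

  // In a real system the payload would be queued for retry rather than lost.
  throw lastError;
}
```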
Description: Today's attack was unique because it was completely unintentional. There was a problem with a customer's site: they had programmed an infinite loop into their event tracking code. A visitor would load their page, and then an event would fire itself at a constantly high rate until the page was closed. (Making things worse: the page played a popular and very fantastic song that's 3:08 long, so the page was left open for quite a while by most people.)

Now, we've hardened our security a lot since we were first DDoS'ed last year, and our firewall routinely blocks similar attacks every week. However, the issue with this incident is that our security was focused on pageview collection, not event collection. As of now, we've put additional security in front of event collection to prevent this from happening again (a simple sketch of this kind of check follows this entry). Fathom did not go offline, but it did create a backlog. Once we isolated and blocked the offending customer's event (and had them remove the code from their site), our backlog cleared in less than five minutes.

How will this be avoided in the future? We're migrating to a new database (finished March 12, 2021) that can easily handle things like this, and it will process backlogs like the above much faster. We've also added security checks to event collection. If a similar event happened in the future, our software would automatically block offenders (even if their music tastes are quite acceptable). Let us know if you have any questions. We're always just an email away.
Status: Resolved
Impact: None | Started At: June 7, 2023, 5 p.m.
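As a hypothetical sketch of the kind of check described above (not Fathom's actual implementation; the window length and per-site ceiling are assumed values): a per-site rate limiter at the ingest boundary can reject a runaway site's events before they build into a backlog.

```typescript
// Hypothetical illustration only: not Fathom's actual implementation.
const WINDOW_MS = 60_000;          // 1-minute window (assumed value)
const MAX_EVENTS_PER_WINDOW = 600; // per-site ceiling per window (assumed value)

const counters = new Map<string, { windowStart: number; count: number }>();

// Returns true if this site's event is within its rate budget.
function allowEvent(siteId: string, now = Date.now()): boolean {
  const entry = counters.get(siteId);

  // No counter yet, or the previous window has expired: start a fresh window.
  if (!entry || now - entry.windowStart >= WINDOW_MS) {
    counters.set(siteId, { windowStart: now, count: 1 });
    return true;
  }

  entry.count += 1;
  return entry.count <= MAX_EVENTS_PER_WINDOW;
}

// Usage at the ingest boundary (sketch):
// if (!allowEvent(payload.siteId)) return respond(429); // reject the runaway site's events
```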
Description: Last night we performed a migration of the way custom domains work. Historically, any IP blocking or domain allow-listing you did was implemented at a network level (on the custom domain itself). We moved this to be part of our ingest, as we wanted all customers to benefit from it, and we've dropped support for custom domains.

IP blocking was simple: we moved the IPs into an array, and those are checked during ingest. Where we faced a problem was with allowed domains. A handful of customers who had entered *.website.com, and not included website.com, would have been impacted here. Long story short, our custom domain provider would take *.website.com and allow website.com through. This was never expected behaviour, and the majority of customers had entries for both *.website.com and website.com, so the root was tracked. For customers who only had *.website.com, and not website.com too, your root pageviews weren't tracked between late last night and early this morning.

We thought hard about how to provide backwards compatibility for this behaviour, which was a bug to begin with (since *.website.com should never have matched website.com; it should only have matched abc.website.com, etc.), and we concluded that we'd add website.com to allowed domains whenever someone had used *.website.com. This way, we mimic the old buggy behaviour, but we don't have to keep doing this for newly allowed domains (a rough sketch of the matching logic and the migration shim follows this entry).

In addition, a bug we deployed that our tests didn't catch affected customers using multiple domains with unique-per-domain visitor tracking. This only impacted a tiny number of customers, but we include it here for transparency. From late last night to early this morning, multiple domains tracked just fine, but website visitors would have been treated as unique across all of those domains rather than unique per domain. We've now fixed the tests and deployed a fix.

We apologize to the handful of customers this impacted. The majority of you did have entries for both *.website.com and website.com when you wanted root tracking, but some of you didn't. Moving forward, honestly, we're focusing more on building functionality rather than outsourcing it. Here, we blocked IPs and allowed domains via an external provider (our custom domain provider), which meant we relied on their rules and couldn't see the code behind them. Moving forward, we will bring more things under our control so that we know how they work.
Status: Resolved
Impact: None | Started At: March 24, 2023, 4:45 p.m.
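As a hypothetical sketch of the matching logic and compatibility shim described above (not Fathom's actual code): strict wildcard matching that does not treat *.website.com as covering the bare root, plus a one-time migration step that adds the root whenever a wildcard entry is present.

```typescript
// Hypothetical illustration only: not Fathom's actual code.
// Strict wildcard matching: "*.website.com" matches subdomains such as
// "blog.website.com" but NOT the bare root "website.com".
function matchesAllowedDomain(hostname: string, pattern: string): boolean {
  if (pattern.startsWith("*.")) {
    const base = pattern.slice(2);                       // "*.website.com" -> "website.com"
    return hostname !== base && hostname.endsWith("." + base);
  }
  return hostname === pattern;                           // exact match otherwise
}

// One-time migration shim mimicking the old (buggy) provider behaviour:
// any allow list containing "*.website.com" also gets "website.com", so
// customers who relied on the old matching keep their root pageviews.
function addRootsForWildcards(allowed: string[]): string[] {
  const result = new Set(allowed);
  for (const pattern of allowed) {
    if (pattern.startsWith("*.")) result.add(pattern.slice(2));
  }
  return [...result];
}
```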