
Is there a Field Nation outage?

Field Nation status: Systems Active

Last checked: 2 minutes ago

Get notified about any outages, downtime, or incidents for Field Nation and 1,800+ other cloud vendors. Monitor 10 companies for free.

Subscribe for updates

Field Nation outages and incidents

Outage and incident data over the last 30 days for Field Nation.

There has been 1 outage or incident for Field Nation in the last 30 days.

Severity Breakdown:

Tired of searching for status updates?

Join OutLogger to be notified when any of your vendors or the components you use experience an outage. It's completely free and takes less than 2 minutes!

Sign Up Now

Components and Services Monitored for Field Nation

OutLogger tracks the status of these components for Field Nation:

Component Status
API Active
Marketing Website Active
Mobile App Active
Out of the box Integration connectors Active
Web App Active

Latest Field Nation outages and incidents.

View the latest incidents for Field Nation and check for official updates:

Updates:

  • Time: Jan. 17, 2023, 11:28 p.m.
    Status: Resolved
    Update: This incident has been resolved.
  • Time: Jan. 17, 2023, 11:06 p.m.
    Status: Monitoring
    Update: We've tracked down the performance issue and have mitigated the problem. We're working on a longer term solution to remediate the underlying issue. We will continue to monitor the platform to ensure our current solution is working as expected.
  • Time: Jan. 17, 2023, 10:33 p.m.
    Status: Identified
Update: We've identified a performance issue with some queries against our core database cluster. We're currently working to address this performance issue.
  • Time: Jan. 17, 2023, 9:58 p.m.
    Status: Investigating
    Update: We have received reports and are observing degraded performance and intermittent request failures in the platform. We're currently investigating the issue.

Updates:

  • Time: Jan. 12, 2023, 4:15 p.m.
    Status: Postmortem
    Update:
    Summary
    A failure of a central message queueing service, responsible for delivering event messages across the application services that make up our system, caused a widespread outage of the Field Nation platform. The outage lasted a significant amount of time: 3.5 hours. While our team believed they had mitigated the extent of the impact, they learned well into the incident that the mitigation was less effective than initially thought. We're sincerely sorry for the disruption and its duration. We understand the importance Field Nation plays for our service providers and buyers in how they get work done, and we work hard to ensure technical issues don't get in the way of that. The incident was caused by a single degraded node. We are implementing additional monitors to identify such a problem more quickly in the future, and we are improving our internal operating documentation to give better guidance for much quicker resolution of this kind of system failure.

    What Happened
    On Jan 9th at 09:17 we were alerted to a health issue with one of the three nodes that make up our central message queueing service. The alert notified us that the node did not reply to an automated health check and was likely down. This service enables the application services that make up our platform to be aware of events occurring within the platform and to queue work for background processing. Upon investigating the alert, the team was unable to find an issue, and the node appeared to be functioning with healthy metrics. The alert resolved an hour later and the team assumed it to be a false alarm.

    At 10:30 a routine deployment was made for a minor update to one of our application services. Fifteen minutes later, at 10:45, our team received an alarm about high memory usage on the same message queueing node that had earlier failed the health check. The team then observed that some key application services were reporting as unhealthy and that the platform website was no longer loading. At 11:00 the team decided to roll back the changes deployed at 10:30; although nothing in those changes related to the issue, the timing appeared to correlate. After rolling back the changes there was no sign of improvement. It was then observed that the message queueing service was blocking connections from our application services.

    At 11:15 we decided the most helpful mitigation would be to put the platform in a partially operational state by disabling the use of the message queueing service entirely. This would lose key functionality that requires the service, such as report generation, routing work to providers, integration updates, and notification delivery, but it would restore the majority of the platform's operation. The change was made at 11:20 and the team confirmed the website would load where it previously would not. After disabling the use of the message queueing service, we observed that it still listed connections from our application services even though those services were no longer connecting. The team then started work to close these orphaned connections in the hope that this would restore the health of the node. Due to the number of connections, the team had to spend time developing a script to perform this bulk connection-close operation.

    This effort encountered complications and could not be executed until 12:40, with the connection closing fully completing at 13:05. At 13:13 our support staff made our response team aware that, with message queueing disabled, we were actually in a significantly less functional state than we thought: no work order pages were loading. This was unexpected and meant the team had been operating on the assumption that the impact was more mitigated than it actually was. After closing the connections we re-enabled the use of the message queueing service at 13:37, suspecting we had resolved the issue with the bad node. Upon re-enabling, we unfortunately saw early indicators that the message service was still not functioning properly, and we re-disabled its use. At 14:20 we decided to restart a node in the three-node cluster, and at 14:25 we once again re-enabled the use of the message queueing service. This time it operated correctly, and we resolved the incident at 14:49, confident the issue was fixed.

    Future Prevention and Process Improvement
    After the restoration of services, the team worked to identify the root cause of the issue. While we have spent a lot of time researching metrics of the message queueing service around the time the incident started, we have so far been unable to find a correlation with any other metric or event. Our team is continuing to research, but believes this may have been a fluke occurrence on a single node of the cluster. We have established additional monitoring alerts that can clue us in earlier to signs of possible node degradation. Ultimately, when reviewing this incident, we realized that the final action taken to reach resolution could have been attempted much earlier. For quicker action, we are establishing standard operating procedures for safely dealing with an unhealthy node in the cluster. Had we had the confidence to take this action sooner, we would have significantly cut down on the length of this outage.
  • Time: Jan. 9, 2023, 8:49 p.m.
    Status: Resolved
    Update: This incident has been resolved.
  • Time: Jan. 9, 2023, 8:39 p.m.
    Status: Monitoring
    Update: We've been able to get the event messaging system in a healthy state and have been monitoring normal platform behavior. We will continue to monitor but as of now things are once again operational.
  • Time: Jan. 9, 2023, 8:26 p.m.
    Status: Identified
    Update: We are continuing to work through the issue. The core problem is an unhealthy event messaging system that our platform relies on. We have two parallel operations underway: 1) implement a temporary solution to enable more platform functionality without our event messaging system, and 2) bring the event messaging system back to a healthy state so we can reach complete resolution. We are hopeful we will soon have either full resolution or at least more restored functionality.
  • Time: Jan. 9, 2023, 7:29 p.m.
    Status: Identified
    Update: We are continuing to work on resolving an issue in a service dependency to restore functionality.
  • Time: Jan. 9, 2023, 5:58 p.m.
    Status: Identified
    Update: We've identified some of the impacted areas and are currently working through them to continue restoring functionality to the platform.
  • Time: Jan. 9, 2023, 5:28 p.m.
    Status: Investigating
    Update: We're continuing to investigate the issue and are working on deploying some mitigations to address some conditions we've observed.
  • Time: Jan. 9, 2023, 5:01 p.m.
    Status: Investigating
    Update: We are currently investigating the issue.
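
The postmortem above notes that the team had to write a script to bulk-close orphaned connections on the message queueing service. The report does not name the queueing technology, so the sketch below is purely illustrative: it assumes a RabbitMQ-style management HTTP API, and the host, credentials, and client addresses are placeholders rather than Field Nation's actual infrastructure.

```python
# Hypothetical sketch: bulk-closing orphaned client connections on a message
# broker via its management HTTP API. RabbitMQ's management API is assumed
# here purely for illustration; host and credentials are placeholders.
import requests
from urllib.parse import quote

BASE_URL = "http://broker.example.internal:15672/api"  # placeholder host
AUTH = ("admin", "changeme")                            # placeholder credentials

def close_orphaned_connections(stale_clients: set[str]) -> None:
    """Close every broker connection whose client address is in stale_clients."""
    resp = requests.get(f"{BASE_URL}/connections", auth=AUTH, timeout=10)
    resp.raise_for_status()

    for conn in resp.json():
        peer = f"{conn.get('peer_host')}:{conn.get('peer_port')}"
        if peer in stale_clients:
            # Connection names contain spaces and colons, so URL-encode them.
            name = quote(conn["name"], safe="")
            requests.delete(
                f"{BASE_URL}/connections/{name}",
                auth=AUTH,
                headers={"X-Reason": "closing orphaned connection"},
                timeout=10,
            ).raise_for_status()

if __name__ == "__main__":
    close_orphaned_connections({"10.0.3.17:53412"})  # example stale client address
```

In practice such a cleanup would only be run after confirming which connections are truly orphaned, since closing live connections would worsen the outage.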

Updates:

  • Time: Dec. 28, 2022, 5:26 a.m.
    Status: Resolved
    Update: A connection issue was found between our platform and a primary database. The connection has been restored and the issue is now resolved.
  • Time: Dec. 28, 2022, 5:20 a.m.
    Status: Investigating
    Update: We are investigating an issue causing the platform to not load correctly.

Updates:

  • Time: Oct. 26, 2022, 8:17 p.m.
    Status: Resolved
    Update: The replication lag event has recovered.
  • Time: Oct. 26, 2022, 7:59 p.m.
    Status: Investigating
    Update: We've observed slower than usual data replication across our databases. This can cause platform actions to not immediately appear and may lead to an increased number of error rates. We are working to identify the cause of this issue so we can take appropriate actions.
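
The update above attributes the degraded behavior to replication lag across databases. The database technology is not stated, so the following is only a rough sketch of how a monitor might measure streaming-replication lag, assuming PostgreSQL and a placeholder connection string.

```python
# Hypothetical sketch: checking streaming-replication lag from a monitoring job.
# PostgreSQL and a placeholder DSN are assumed purely for illustration.
import psycopg2

REPLICATION_LAG_QUERY = """
    SELECT application_name,
           pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) AS replay_lag_bytes
    FROM pg_stat_replication;
"""

def report_replication_lag(dsn: str, threshold_bytes: int = 64 * 1024 * 1024) -> None:
    """Print each standby's replay lag and flag any that exceed the threshold."""
    with psycopg2.connect(dsn) as conn, conn.cursor() as cur:
        cur.execute(REPLICATION_LAG_QUERY)
        for standby, lag_bytes in cur.fetchall():
            status = "LAGGING" if lag_bytes and lag_bytes > threshold_bytes else "ok"
            print(f"{standby}: {lag_bytes} bytes behind primary [{status}]")

if __name__ == "__main__":
    report_replication_lag("host=primary.db.example.internal dbname=postgres user=monitor")
```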

Updates:

  • Time: Oct. 26, 2022, 7:42 p.m.
    Status: Resolved
    Update: All data jobs have caught up and updates should process normally.
  • Time: Oct. 26, 2022, 7:40 p.m.
    Status: Monitoring
    Update: A fix has been implemented and we are monitoring the results.
  • Time: Oct. 26, 2022, 6:01 p.m.
    Status: Identified
    Update: We are experiencing higher than usual data processing loads, which is resulting in slower updates to external systems. We are working on a fix to scale and handle the load.

Check the status of similar companies and alternatives to Field Nation

NetSuite: Systems Active
ZoomInfo: Systems Active
SPS Commerce: Systems Active
Miro: Systems Active
Outreach: Systems Active
Own Company: Systems Active
Mindbody: Systems Active
TaskRabbit: Systems Active
Nextiva: Systems Active
6Sense: Systems Active
BigCommerce: Systems Active
WalkMe: Systems Active

Frequently Asked Questions - Field Nation

Is there a Field Nation outage?
The current status of Field Nation is: Systems Active
Where can I find the official status page of Field Nation?
The official status page for Field Nation is here
How can I get notified if Field Nation is down or experiencing an outage?
To get notified of any status changes to Field Nation, simply sign up for OutLogger's free monitoring service. OutLogger checks the official status of Field Nation every few minutes and will notify you of any changes. You can view the status of all your cloud vendors in one dashboard. Sign up here
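
As a rough illustration of the kind of polling described above, the sketch below watches a vendor status endpoint for changes. It assumes the status page exposes an Atlassian Statuspage-style /api/v2/status.json endpoint; the URL is a placeholder, not Field Nation's actual status page address.

```python
# Hypothetical sketch of the kind of polling a monitor like OutLogger performs.
# The endpoint format assumes an Atlassian Statuspage-hosted status page; the
# URL below is a placeholder.
import time
import requests

STATUS_URL = "https://status.example-vendor.com/api/v2/status.json"  # placeholder

def poll_status(interval_seconds: int = 300) -> None:
    """Poll the status endpoint and report whenever the indicator changes."""
    last_indicator = None
    while True:
        payload = requests.get(STATUS_URL, timeout=10).json()
        indicator = payload["status"]["indicator"]  # e.g. "none", "minor", "major"
        if indicator != last_indicator:
            print(f"Status changed: {payload['status']['description']} ({indicator})")
            last_indicator = indicator
        time.sleep(interval_seconds)

if __name__ == "__main__":
    poll_status()
```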