
Is there a PandaDoc outage?

PandaDoc status: Systems Active

Last checked: 31 seconds ago

Get notified about any outages, downtime, or incidents for PandaDoc and 1800+ other cloud vendors. Monitor 10 companies for free.

Subscribe for updates

PandaDoc outages and incidents

Outage and incident data over the last 30 days for PandaDoc.

There has been 1 outage or incident for PandaDoc in the last 30 days.

Severity Breakdown:

Tired of searching for status updates?

Join OutLogger to be notified when any of your vendors or the components you use experience an outage. It's completely free and takes less than 2 minutes!

Sign Up Now

Components and Services Monitored for PandaDoc

OutLogger tracks the status of these components for PandaDoc:

API Active
Creating and editing documents Active
CRMs & Integrations Active
Mobile application Active
Public (recipient) view Active
Sending and opening documents Active
Signup Active
Uploading and downloading documents Active
Web application Active
Webhooks Active
Website Active

Latest PandaDoc outages and incidents.

View the latest incidents for PandaDoc and check for official updates:

Updates:

  • Time: April 17, 2023, 3:43 p.m.
    Status: Postmortem
    Update: ## A summary of what happened

At **14:01 PDT Friday, April 7th** our monitoring indicated that our Public API request rate had dropped and health checks were failing. The situation deteriorated rapidly, and we noticed that some of our API endpoints had become unresponsive, which impacted the availability of the PandaDoc platform. We followed our protocol and immediately started our incident response procedure, rolled back recent updates, and involved engineers in multiple investigation paths. After we had dismissed some initial theories, we understood that the issue was at the infrastructure level and started investigating it together with our cloud provider (AWS). After a deep investigation that lasted several hours, we were able to track the issue down to network problems: several pods on a specific Kubernetes node were experiencing intermittent low-level network issues that caused connection leaks (repeatedly opening connections without closing them, or closing only some of them), which eventually led to increased latency and memory consumption and resulted in some of our core services entering a chain of crashes. As a consequence, the application and API were not available during the downtime. Once the root cause was identified, the broken machine was removed from the cluster and the system started operating normally. The issue was fully resolved by **01:23 PDT, April 08**.

## A deep dive - how we investigated the root cause

When the incident started, we noticed a spike in the number of connections in our database pool and many API calls waiting for connections to be released so they could process incoming requests. We quickly figured out that what was stopping connections from being released was a large series of uncommitted transactions sitting idle.
We then started analyzing database locks and deadlocks, since that is usually what leads to this behavior, and wrote a hotfix to one of our API endpoints to reduce the number of processed events, expecting this would release connections faster. Soon after, we understood that the database was not the bottleneck, although stalled transactions were still growing and connections in the pool were being taken and not released. A deeper analysis of API endpoint metrics revealed that external calls within the transactions could be the culprit. After more investigation, we found similarities in the API calls that were not responsive: they all interacted with our message queue (RabbitMQ HA cluster).

The RabbitMQ cluster had been working without any disruptions for the previous 1.5 years, and monitoring was not showing anything suspicious. It did not seem like a likely cause, since queues process messages independently in async mode (that is why they are used to offload tasks for later asynchronous execution), but we still decided to look into it more closely. After analyzing the machines in the cluster and connecting to them directly, we saw that they were shutting down and reloading periodically, although this was not visible in the cluster monitoring in our Grafana dashboards, nor did we get any alerts.

Because the message queue was unresponsive, API calls sat waiting for it, which blocked their transactions, which in turn kept connections checked out of the database connection pool, leaving other API requests waiting forever for a free connection - a loop that caused a chain of failures. We immediately started addressing the situation by scaling the cluster up vertically, upgrading the machines it runs on with more processing power and networking capability. After the upgrade, we added additional monitoring metrics to the cluster.
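The failure chain described above (an external call made while still holding a database connection, so a hung dependency drains the pool) can be sketched with a small simulation. This is illustrative only, not PandaDoc's code; the pool size, timeout values, and handler names are invented:

```python
import threading
import time

# Simulate a bounded DB connection pool. Each "API handler" borrows a
# connection, then makes an external call (e.g. to a message queue) while
# still holding it. If that external call hangs, the connection is never
# returned and the pool drains.
POOL_SIZE = 3
pool = threading.BoundedSemaphore(POOL_SIZE)

def external_call(hang: bool) -> None:
    if hang:
        time.sleep(0.5)  # stand-in for an unresponsive message queue

def handle_request(hang: bool, timeout: float) -> bool:
    # acquire() models "borrow a connection from the pool"; a timeout
    # models a request giving up after waiting too long for one.
    if not pool.acquire(timeout=timeout):
        return False  # pool exhausted: the request cannot proceed
    try:
        external_call(hang)  # external call while holding the connection
        return True
    finally:
        pool.release()

# Three handlers stuck on the hung dependency occupy the whole pool...
workers = [threading.Thread(target=handle_request, args=(True, 1.0))
           for _ in range(POOL_SIZE)]
for w in workers:
    w.start()
time.sleep(0.1)
# ...so a fourth, perfectly healthy request starves waiting for a connection.
starved = not handle_request(False, timeout=0.1)
for w in workers:
    w.join()
print("healthy request starved:", starved)
```

Note that the healthy request fails even though the database itself is fine, which matches the observation above that the database was not the bottleneck.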
In parallel, we were investigating a probable root cause: intermittent networking issues on a Kubernetes node that were causing pods on that node to repeatedly open connections without closing them. We concluded that an underlying networking issue was the most probable root cause after we observed and correlated several facts:

* We had randomly missing metrics in our Prometheus monitoring for several systems, coinciding in time with the degradation of the RabbitMQ cluster metrics (the number of sockets started growing linearly)
* We found that all pods on one particular Kubernetes node (added to the cluster on Friday morning) were having trouble connecting to other parts of the system (our NATS cluster). We also noticed error patterns in logs related to closed network connections or client timeouts, in numbers higher than normal. At the same time, the number of slow NATS consumers had been growing abnormally since the start of the incident
* Most of the connections to the RabbitMQ nodes during the incident period were coming from pods residing on the faulty node

Once the broken machine was removed from the cluster, the system started operating normally.

To sum up: we consider the main cause of the incident to be a problem with an AWS EC2 instance provisioned as part of our EKS (managed Kubernetes) cluster during the normal process of a release. Network-related errors caused a number of connection issues on the RabbitMQ cluster, leading to a chain failure.
## What we have done and will be doing next

As our investigation wraps up, we want to highlight our continuous improvement mindset and provide clarity on what we are doing to improve our systems:

* We have improved the robustness and scale of our RabbitMQ cluster to reduce the likelihood of failure under a growing number of network connections, and reviewed the HA RabbitMQ setup and its replication settings
* We have added additional logging and metrics to our RabbitMQ cluster, as well as early-detection alarms for any deviation in the cluster's network traffic patterns
* We have engaged AWS in the investigation and resolution of this outage; AWS support is running its own investigation into the issue
* We will make further improvements to our observability stack, reviewing which additional metrics we can add to improve detection of underlying problems in AWS-managed services (e.g. EKS), reduce alerting noise, and ensure certain alerts are highlighted (RabbitMQ / failing pods)
* As an additional step to prevent this in the future, we plan to review all the external calls in our API handlers and move them to a transactional outbox, to avoid blocking transactions if external services become unavailable
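The transactional outbox mentioned in the last item can be sketched as follows. This is a generic, hedged illustration of the pattern using SQLite; the table, column, and event names are invented, not PandaDoc's schema. The business write and the outbox write share one transaction, and a separate relay publishes events outside any request transaction, so an unavailable broker can delay delivery but can no longer block request handling or pin a pool connection:

```python
import json
import sqlite3

# In-memory DB standing in for the application's relational database.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE documents (id INTEGER PRIMARY KEY, title TEXT)")
db.execute("CREATE TABLE outbox (id INTEGER PRIMARY KEY, payload TEXT,"
           " published INTEGER DEFAULT 0)")

def create_document(title: str) -> None:
    # One atomic transaction: the business write and the outbox write
    # commit together. No broker call happens inside the transaction.
    with db:
        cur = db.execute("INSERT INTO documents (title) VALUES (?)", (title,))
        event = {"type": "document.created", "document_id": cur.lastrowid}
        db.execute("INSERT INTO outbox (payload) VALUES (?)",
                   (json.dumps(event),))

def relay_outbox(publish) -> int:
    # Runs in a background process, outside any request transaction:
    # read unpublished events, hand them to the broker, mark them done.
    rows = db.execute(
        "SELECT id, payload FROM outbox WHERE published = 0").fetchall()
    for row_id, payload in rows:
        publish(json.loads(payload))
        db.execute("UPDATE outbox SET published = 1 WHERE id = ?", (row_id,))
    db.commit()
    return len(rows)

create_document("Q2 proposal")
delivered = relay_outbox(publish=lambda event: None)  # broker stub
print("events relayed:", delivered)
```

If the broker is down, `create_document` still succeeds immediately; the relay simply retries later, delivering the queued events at least once.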
  • Time: April 8, 2023, 9:23 a.m.
    Status: Resolved
    Update: We're all set! If you continue to experience any issues with this, please reach out to us at [email protected]. Thank you so much for your patience and understanding!
  • Time: April 8, 2023, 8:23 a.m.
    Status: Monitoring
    Update: We have resolved the issue and PandaDoc should now be up and running! We're still monitoring the application performance and will post a final “ALL SET” message once we’ve confirmed the issue is fully resolved.
  • Time: April 8, 2023, 8:17 a.m.
    Status: Identified
    Update: We’ve identified the root cause and are already working on a fix. We appreciate your patience and will post updates here as soon as possible. Stay tuned!
  • Time: April 8, 2023, 7:57 a.m.
    Status: Investigating
    Update: Unfortunately, the services are still being impacted by the outage. Please rest assured technicians and engineers are digging into it in order to provide the fastest fix possible. Please check back here for updates, we appreciate your patience while we are getting this resolved.
  • Time: April 8, 2023, 4 a.m.
    Status: Investigating
    Update: Sadly, the website is still experiencing technical difficulties, the team carries on putting all the possible effort into the investigation and resolution. Thank you so much for your patience and understanding.
  • Time: April 8, 2023, 1:03 a.m.
    Status: Investigating
    Update: The development and engineering team is doing their best to resolve the issue. Please check back here for updates, and we appreciate your patience while we get this resolved.
  • Time: April 7, 2023, 11:37 p.m.
    Status: Investigating
    Update: We’re really sorry for holding you up! Please know our engineering and operations teams are working hard to get everything up and running.
  • Time: April 7, 2023, 10:40 p.m.
    Status: Investigating
    Update: We’re on it! Our team is doing its best to get you back on track as soon as possible. Please check back here for updates.
  • Time: April 7, 2023, 9:57 p.m.
    Status: Investigating
    Update: Thank you for your patience! Our development team is already hard at work solving this issue and determining next steps. Check back here for updates and we will get this back on track as soon as possible.
  • Time: April 7, 2023, 9:27 p.m.
    Status: Investigating
    Update: We are continuing the investigation and doing our best to get to a resolution as fast as possible. Please check back here for updates, and we appreciate your patience while we get this resolved.
  • Time: April 7, 2023, 9:01 p.m.
    Status: Investigating
    Update: We are actively investigating the outage of the public API. Please check back here for updates, and we appreciate your patience while we get this resolved.

Updates:

  • Time: March 31, 2023, 5:50 p.m.
    Status: Resolved
    Update: This incident has been resolved.
  • Time: March 31, 2023, 5:47 p.m.
    Status: Monitoring
    Update: We’ve identified that the issue is connected to unexpected changes introduced on the Pipedrive API side. The issue is resolved now and the functionality is up and running again.
  • Time: March 31, 2023, 4:38 p.m.
    Status: Investigating
    Update: Thank you for your patience! Our development team is already hard at work investigating this issue and determining next steps. Check back here for updates and we will get this back on track as soon as possible.
  • Time: March 31, 2023, 4:08 p.m.
    Status: Investigating
    Update: We are actively investigating why Pipedrive integration is not working. Please check back here for updates, and we appreciate your patience while we get this resolved.

Updates:

  • Time: March 20, 2023, 6:20 p.m.
    Status: Resolved
    Update: We're all set! If you continue to experience any issues with this, please reach out to us at [email protected].
  • Time: March 20, 2023, 5:52 p.m.
    Status: Monitoring
    Update: We have resolved the issue and PandaDoc users should now be able to download, create, and complete documents. We're still monitoring the application performance and will post a final “ALL SET” message once we’ve confirmed the issue is fully resolved.
  • Time: March 20, 2023, 5:37 p.m.
    Status: Identified
    Update: We’ve identified the issue and are already working on a fix. We appreciate your patience and will post updates here as soon as possible. Stay tuned!
  • Time: March 20, 2023, 5:33 p.m.
    Status: Investigating
    Update: Our dev team is actively working on resolving the issue. We really do apologize for the inconvenience caused, and we appreciate your patience while we get this resolved.
  • Time: March 20, 2023, 5:16 p.m.
    Status: Investigating
    Update: We are continuing to investigate this issue.
  • Time: March 20, 2023, 5:15 p.m.
    Status: Investigating
    Update: We are actively investigating issues with PandaDoc document creation, completion, and download. Please check back here for updates, and we appreciate your patience while we get this resolved.

Updates:

  • Time: March 16, 2023, 12:16 a.m.
    Status: Resolved
    Update: This incident has been resolved.
  • Time: March 16, 2023, 12:14 a.m.
    Status: Monitoring
    Update: We have resolved the issue and PandaDoc users should now be able to send a chat and email again! We're still monitoring the application performance and will post a final “ALL SET” message once we’ve confirmed the issue is fully resolved.
  • Time: March 16, 2023, 12:10 a.m.
    Status: Investigating
    Update: Zendesk, our chat and email system, is having an issue with its service. As a result, some customers are not able to start a chat with us. We'll continue monitoring and will post an update here once it's back up! In the meantime, you can reach out to us at [email protected].

Updates:

  • Time: Feb. 13, 2023, 5:39 p.m.
    Status: Resolved
    Update: We're all set! Users can now create documents via API. If you continue to experience any issues with this, please reach out to us at [email protected].
  • Time: Feb. 13, 2023, 5:32 p.m.
    Status: Monitoring
    Update: We have resolved the issue and PandaDoc users should now be able to create documents via API. We're still monitoring the application performance and will post a final “ALL SET” message once we’ve confirmed the issue is fully resolved.
  • Time: Feb. 13, 2023, 5:23 p.m.
    Status: Investigating
    Update: We are actively investigating the issue of creating documents via API. Please check back here for updates, and we appreciate your patience while we get this resolved.

Check the status of similar companies and alternatives to PandaDoc

Docusign

Systems Active

Foxit

Systems Active

Nitro Sign

Systems Active

Templafy

Systems Active

Documoto

Systems Active

SmartSuite

Systems Active

Frequently Asked Questions - PandaDoc

Is there a PandaDoc outage?
The current status of PandaDoc is: Systems Active
Where can I find the official status page of PandaDoc?
The official status page for PandaDoc is here
How can I get notified if PandaDoc is down or experiencing an outage?
To get notified of any status changes to PandaDoc, simply sign up to OutLogger's free monitoring service. OutLogger checks the official status of PandaDoc every few minutes and will notify you of any changes. You can view the status of all your cloud vendors in one dashboard. Sign up here
What does PandaDoc do?
PandaDoc simplifies business document workflows, including proposals and quotes, with compliance to SOC 2, HIPAA, and GDPR regulations. Trusted by over 50,000 clients.