Get notified about any outages, downtime or incidents for PubNub and 1800+ other cloud vendors. Monitor 10 companies, for free.
Outage and incident data over the last 30 days for PubNub.
Join OutLogger to be notified when any of your vendors or the components you use experience an outage. It's completely free and takes less than 2 minutes!
Sign Up Now

OutLogger tracks the status of these components for PubNub:
Component | Status |
---|---|
Functions | Active |
Functions Service | Active |
Key Value store | Active |
Scheduler Service | Active |
Vault | Active |
Points of Presence | Active |
Asia Pacific Points of Presence | Active |
European Points of Presence | Active |
North America Points of Presence | Active |
Southern Asia Points of Presence | Active |
Realtime Network | Active |
Access Manager Service | Active |
App Context Service | Active |
DNS Service | Active |
Mobile Push Gateway | Active |
MQTT Gateway | Active |
Presence Service | Active |
Publish/Subscribe Service | Active |
Realtime Analytics Service | Active |
Storage and Playback Service | Active |
Stream Controller Service | Active |
Website and Portals | Active |
Administration Portal | Active |
PubNub Support Portal | Active |
SDK Documentation | Active |
Website | Active |
View the latest incidents for PubNub and check for official updates:
Description:

### **Problem Description, Impact, and Resolution**

On Friday, August 11th at 23:45 UTC, we observed a delay in message delivery for subscribe requests using our Channel Groups service. After identifying the delay, we restarted the affected pods, and the issue was resolved at 01:44 UTC on Saturday, August 12th.

### **Mitigation Steps and Recommended Future Preventative Measures**

To prevent a similar issue from occurring in the future, we are improving Channel Groups service communication, and we are exploring enhanced error handling and retries together with improved monitoring and alerting.
Status: Postmortem
Impact: Minor | Started At: Aug. 12, 2023, 1:05 a.m.
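The mitigation above mentions enhanced error handling and retries. A minimal sketch of what a retry-with-backoff wrapper for a subscribe call could look like (hypothetical names and parameters; not PubNub's actual implementation):

```python
import random
import time

def subscribe_with_retry(request, max_attempts=4, base_delay=0.5):
    """Retry a failing call with exponential backoff and jitter.

    `request` is any callable that raises on transient failure; it is a
    hypothetical stand-in for a Channel Groups subscribe request.
    """
    for attempt in range(max_attempts):
        try:
            return request()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts; surface the error to the caller
            # Exponential backoff with jitter to avoid synchronized retries.
            time.sleep(base_delay * (2 ** attempt) * (0.5 + random.random() / 2))
```

Jittered backoff spreads retries out in time, which matters when many subscribers hit the same degraded service at once.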
Description:

### **Problem Description, Impact, and Resolution**

At 19:20 UTC on Tuesday, June 13th, 2023, we observed increased error rates and latency for our Authorization services at our US East facility. In response, we redirected authorization services from US East to US West, and the issue was mitigated at 20:59 UTC. We identified the root cause as a third-party service incident. After confirming the third-party incident was resolved, we rerouted Authorization traffic back to US East at 22:58 UTC on Tuesday, June 13th, 2023.

### **Mitigation Steps and Recommended Future Preventative Measures**

To prevent a similar issue from occurring in the future, we are developing a comprehensive failover plan to move services from one region to another more quickly. In the next few weeks we will implement new processes to allow mitigation of regional service issues.
Status: Postmortem
Impact: Major | Started At: June 13, 2023, 7:30 p.m.
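The failover described above amounts to a routing decision based on regional health. A minimal sketch of such a decision rule, under assumed region names and an assumed error-rate threshold (not PubNub's actual policy):

```python
def pick_region(health, primary="us-east", fallback="us-west"):
    """Route traffic to the primary region unless its error rate is too high.

    `health` maps region name -> recent error rate (0.0 to 1.0). Region
    names and the threshold are illustrative assumptions.
    """
    ERROR_THRESHOLD = 0.05  # assumed value; tune per service SLO
    if health.get(primary, 1.0) <= ERROR_THRESHOLD:
        return primary
    if health.get(fallback, 1.0) <= ERROR_THRESHOLD:
        return fallback
    # Both regions unhealthy: stay on primary rather than flap between them.
    return primary
```

Keeping traffic on the primary when both regions are unhealthy avoids oscillation, and routing back (as in the incident above) happens naturally once the primary's error rate drops below the threshold.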
Description:

### **Problem Description, Impact, and Resolution**

On June 1, 2023, we observed two subscribe latency spikes in our Europe Point of Presence: from 08:18 UTC through 09:06 UTC, and from 09:44 UTC through 09:54 UTC. During these windows, users in the Europe region may have experienced slower than normal responses on subscribe calls. The higher than normal latency affected one of multiple access zones in the region. Shortly after detecting the increase in latency, we identified the cause and deployed a fix, restoring the region to normal operational status by 09:54 UTC on June 1. The issue occurred when a code deployment to the region overwrote a previously deployed configuration, leaving the access zone short of resources.

### **Mitigation Steps and Recommended Future Preventative Measures**

To prevent a similar issue from occurring in the future, we are applying a fix to all clusters. We are also improving alerting around publish-to-subscribe latency so we are quickly notified if a similar issue recurs.
Status: Postmortem
Impact: None | Started At: June 1, 2023, 3:18 p.m.
Description:

### **Problem Description, Impact, and Resolution**

At 14:35 UTC on May 18, 2023, we observed some errors being served to subscribers globally. A large, unusual traffic pattern was putting memory pressure on parts of our infrastructure faster than our normal autoscaling could handle. We resolved the issue by manually adding capacity to cover the newly observed pattern; the issue was resolved at 16:15 UTC the same day. The issue occurred because the system was not prepared to scale quickly enough on the combination of factors unique to this traffic.

### **Mitigation Steps and Recommended Future Preventative Measures**

To prevent a similar issue from occurring in the future, we are adding new monitoring and alerting that can detect this scenario, and tuning scaling factors in our systems so our autoscaling can react more appropriately to it.
Status: Postmortem
Impact: None | Started At: May 18, 2023, 4:48 p.m.
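"Tuning scaling factors" in the mitigation above typically means adjusting how aggressively replica counts grow under load. A minimal sketch of a proportional scale-out rule driven by memory utilization (all names, targets, and step caps are illustrative assumptions, not PubNub's actual policy):

```python
import math

def desired_replicas(current, mem_utilization, target=0.6, max_step=4):
    """Grow the replica count toward a target memory utilization.

    Scales proportionally to how far utilization exceeds the target,
    capped at `max_step` new replicas per evaluation so bursts scale
    quickly without wild oscillation. Hypothetical sketch only.
    """
    if mem_utilization <= target:
        return current  # under target: no scale-out needed
    wanted = math.ceil(current * mem_utilization / target)
    return min(wanted, current + max_step)
```

Raising `max_step` (or lowering `target`) is the kind of tuning that lets autoscaling keep up with a sudden, memory-heavy traffic pattern like the one in this incident.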
Description:

### **Problem Description, Impact, and Resolution**

On Tuesday, May 2, 2023 at 21:23 UTC, we observed that Events and Actions messages were not being processed on our US PoP or EU Central PoP. Shortly after observing the issue, we redeployed a processing schema, and the issue was resolved on May 2, 2023 at 22:00 UTC.

### **Mitigation Steps and Recommended Future Preventative Measures**

To prevent a similar issue from occurring in the future, we added the needed monitoring to alert us when Events and Actions messages are not being processed properly, so we can redeploy the processing schema as needed.
Status: Postmortem
Impact: Minor | Started At: May 2, 2023, 10:01 p.m.