Is there a Kixie outage?

Kixie status: Minor Outage

Last checked: 3 minutes ago

Get notified about any outages, downtime, or incidents for Kixie and 1,800+ other cloud vendors. Monitor 10 companies for free.

Subscribe for updates

Kixie outages and incidents

Outage and incident data over the last 30 days for Kixie.

There has been 1 outage or incident for Kixie in the last 30 days.

Severity Breakdown:

Tired of searching for status updates?

Join OutLogger to be notified when any of your vendors or the components you use experience an outage. It's completely free and takes less than 2 minutes!

Sign Up Now

Components and Services Monitored for Kixie

OutLogger tracks the status of these components for Kixie:

Call & SMS Functionality Active
Event API Active
Cadence Functionality Active
Manage Cadences Active
Manage Dispositions Active
Hubspot C2C Active
Hubspot Call Logging Performance Issues
Hubspot SMS Logging Performance Issues
IVR Functionality Active
Manage IVRs Active
Pipedrive C2C Active
Pipedrive Call Logging Active
Pipedrive SMS Logging Active
Manage Powerlists Active
Powerlist Functionality Active
Manage Queues Active
Queue Functionality Active
Agent Reports Active
Agent Summary Active
Business Reports Active
Call History Active
Dispositions Reporting Active
Inbound Summary Active
Queues Reporting Active
SMS History Active
SMS Reports Active
Time Saved Metrics Active
Manage Ring Groups Active
Ring Group Functionality Active
Salesforce C2C Active
Salesforce Call Logging Active
Salesforce SMS Logging Active
Manage Teams Active
Zoho C2C Active
Zoho Call Logging Active
Zoho SMS Logging Active

Latest Kixie outages and incidents.

View the latest incidents for Kixie and check for official updates:

Updates:

  • Time: Oct. 12, 2020, 7:49 p.m.
    Status: Postmortem
    Update: We have identified the cause of the server issues as a mass automation from an account, and we have implemented processes and infrastructure improvements to throttle and prevent these issues in the future.
  • Time: Oct. 12, 2020, 7:11 p.m.
    Status: Resolved
    Update: This incident has been resolved.
  • Time: Oct. 12, 2020, 7:09 p.m.
    Status: Investigating
    Update: We are back up to 100% speed.
  • Time: Oct. 12, 2020, 6:59 p.m.
    Status: Investigating
    Update: We are currently investigating this issue.

Updates:

  • Time: Oct. 13, 2020, 6:48 p.m.
    Status: Postmortem
    Update: 10/6/2020 Incident (AAR Owner: Keith Muenze | Incident: 10/06/2020 | Priority: P0 | Affected Services: All Services)
    Executive Summary: Outbound and inbound calling services on 10/06/2020 were interrupted due to an unusually high influx of inbound calls to a single telephone number operated by one of Kixie's clients. Inbound calls to this number were configured to retry a group every 6 seconds. The number of calls delivered to the number via an automated script eventually overloaded Kixie's servers.
    AAR Report:
    Leadup: Inbound calls can be routed to groups, which in turn can call themselves, creating a never-ending loop. This process has been replaced by our queuing system, but some clients still use the old groups process. We helped a client of ours use this process to launch an automation, which caused our system to be used in an unexpected manner.
    Fault: The automation caused a single groups process to be called 10k times per minute. This is well above the typical volume of executions for our servers, which caused our servers to overload and begin to queue activities.
    Impact: All services were unavailable.
    Detection: We detected the incident when a CPU alert from New Relic notified our team.
    Response: Keith Muenze responded to the emergency. He identified the problem after reviewing the New Relic, EC2, and RDS logs. The New Relic logs showed all servers were operating at 100% usage. He reviewed the Performance Insights logs in RDS to identify SQL that might be causing waits and CPU usage; no issues were discovered with the database. He then reviewed the New Relic transactions log to see if there was any specific increase in requests. After some research, he determined that the increase in volume originated with our inbound call processing, and later that the issue was with a specific business and group. At that time, inbound calls to the group were immediately cancelled and service returned to normal.
    Recovery: Recovery could have been accelerated if New Relic provided reporting that shows the velocity increase of executions by function or script. Kixie could also write its own reporting to cover some of this using CloudWatch logs from the server. That level of reporting would help identify the root cause quickly. New Relic has this data but does not show it in an immediately digestible report.
    Blameless root cause: The root cause of the outage was a lack of governance for inbound calls, per business, to a given group. We added thresholds for inbound group calls to prevent this type of system abuse in the future. (An illustrative throttling sketch follows this update list.)
  • Time: Oct. 6, 2020, 6:57 p.m.
    Status: Resolved
    Update: This incident has been resolved.
  • Time: Oct. 6, 2020, 6:20 p.m.
    Status: Monitoring
    Update: We are continuing to monitor for any further issues.
  • Time: Oct. 6, 2020, 6:09 p.m.
    Status: Monitoring
    Update: A fix has been implemented and we are monitoring the results.
  • Time: Oct. 6, 2020, 5:28 p.m.
    Status: Investigating
    Update: Calls and the Dashboard are taking 15+ seconds to connect/load. Currently investigating.
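
The mitigation named in the postmortem above ("thresholds for inbound group calls") can be pictured as a per-business, per-group rate limit. The sketch below is only an illustration of that idea, not Kixie's actual implementation; the class name, method names, and the 60-calls-per-minute limit are all hypothetical.

```python
import time
from collections import defaultdict, deque

class InboundThrottle:
    """Illustrative sliding-window cap on inbound calls per (business, group).

    Hypothetical sketch of the kind of threshold the postmortem describes;
    it is not Kixie's actual code.
    """

    def __init__(self, max_calls_per_minute=60):
        self.max_calls = max_calls_per_minute
        self.window = 60.0  # seconds
        self.calls = defaultdict(deque)  # (business_id, group_id) -> call timestamps

    def allow_call(self, business_id, group_id, now=None):
        now = time.monotonic() if now is None else now
        q = self.calls[(business_id, group_id)]
        # Drop timestamps that have fallen outside the sliding window.
        while q and now - q[0] > self.window:
            q.popleft()
        if len(q) >= self.max_calls:
            return False  # over threshold: reject rather than queue the call
        q.append(now)
        return True

# A runaway automation hammering one group is cut off at the threshold,
# while other businesses and groups are unaffected.
throttle = InboundThrottle(max_calls_per_minute=60)
accepted = sum(throttle.allow_call("biz-123", "group-A") for _ in range(10_000))
print(accepted)  # 60
```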

Updates:

  • Time: Oct. 6, 2020, 5:14 p.m.
    Status: Resolved
    Update: Incident has been resolved and all Kixie services are fully operational. We will continue to monitor the situation.
  • Time: Oct. 6, 2020, 4:14 p.m.
    Status: Identified
    Update: We are continuing to work on a fix for this issue.
  • Time: Oct. 6, 2020, 3:53 p.m.
    Status: Identified
    Update: Rebooting database server 2
  • Time: Oct. 6, 2020, 3:44 p.m.
    Status: Identified
    Update: Restarting database server 3
  • Time: Oct. 6, 2020, 3:31 p.m.
    Status: Investigating
    Update: We are currently investigating. Note: the call will dial out successfully after 15 seconds.

Check the status of similar companies and alternatives to Kixie

NetSuite

Systems Active

ZoomInfo

Systems Active

SPS Commerce

Systems Active

Miro

Systems Active

Field Nation

Systems Active

Outreach

Systems Active

Own Company

Systems Active

Mindbody

Systems Active

TaskRabbit

Systems Active

Nextiva

Systems Active

6Sense

Systems Active

BigCommerce

Systems Active

Frequently Asked Questions - Kixie

Is there a Kixie outage?
The current status of Kixie is: Minor Outage
Where can I find the official status page of Kixie?
The official status page for Kixie is here
How can I get notified if Kixie is down or experiencing an outage?
To get notified of any status changes to Kixie, simply sign up for OutLogger's free monitoring service. OutLogger checks the official status of Kixie every few minutes and will notify you of any changes. You can view the status of all your cloud vendors in one dashboard. Sign up here
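
The polling-and-notify idea behind this kind of monitoring is simple to sketch yourself: fetch the vendor's status page on an interval and raise an alert when the content changes. The snippet below is a minimal, hypothetical example; the URL, interval, and notify function are placeholders rather than OutLogger's or Kixie's actual API.

```python
import time
import urllib.request

STATUS_URL = "https://example-status-page.invalid/kixie"  # placeholder URL
CHECK_EVERY = 300  # seconds; roughly "every few minutes"

def fetch_status_page():
    # Fetch the raw page; a real monitor would parse a JSON or RSS status feed.
    with urllib.request.urlopen(STATUS_URL, timeout=10) as resp:
        return resp.read().decode("utf-8", errors="replace")

def notify(message):
    print(message)  # stand-in for an email, SMS, or webhook notification

last_snapshot = None
while True:
    try:
        snapshot = fetch_status_page()
        if last_snapshot is not None and snapshot != last_snapshot:
            notify("Kixie status page changed - check for an incident.")
        last_snapshot = snapshot
    except Exception as exc:
        notify(f"Could not reach the status page: {exc}")
    time.sleep(CHECK_EVERY)
```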
What does Kixie do?
Kixie is a sales engagement platform that enhances sales team productivity through dependable and automated calling and texting.