The power of context in root-cause analysis

The ability to quickly and accurately identify the root cause of IT incidents is paramount. According to EMA Research, more than 80% of IT professionals said a solution that could generate an accurate summary of alerts and incidents, including the likely root cause, would be transformational or high value. Respondents noted that such a solution would reduce mean time to resolution (MTTR) by 10 to 30 minutes.

Earlier this year, we shared strategies for mastering the identification of root-cause change with our foundational functionality. Since then, we’ve made significant improvements to BigPanda Root Cause Changes that leverage generative AI within our Advanced Insight module. This update transforms the approach to root-cause change correlation and causal analysis, making incident resolution faster and more precise than ever before.

The updated feature accurately correlates low-quality or vaguely described changes with active incidents. Using GenAI, BigPanda Root Cause Changes can normalize concepts, recognize intent, understand context, and derive meaning from change records and potentially related alerts. This enables high accuracy in correlating alerts with incidents and determining their root causes. It does so through three capabilities:

  • Normalization: Replacing manual approaches that take hours to complete, BigPanda normalizes and interprets various data inputs and concepts using GenAI. It intelligently processes information from change records and alerts to reveal likely causality.
  • Natural Language Processing: BigPanda leverages advanced NLP techniques to parse the contents of change records to extract critical insights and intent. Within seconds, operators can differentiate between planned maintenance and unexpected outages, correlate seemingly unrelated terms or phrases, recognize the purpose of specific changes, and understand their potential impact on the IT environment.
  • Contextual Correlation: BigPanda Root Cause Changes aligns the contents of change records with correlated events found inside BigPanda incidents. Cross-referencing data points can identify patterns and relationships that may not be immediately apparent to human analysts.

Together, these processes create a high-accuracy causal confidence score that operators see in their incident console. This score quantifies the likelihood of a particular change being the root cause of an incident. With this information at their fingertips, IT teams can prioritize their efforts to focus on the most probable causes and resolve incidents faster. Shifting from triage to resolution with greater confidence reduces downtime significantly and improves overall operational efficiency.
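
To make the idea of a causal confidence score concrete, here is a minimal Python sketch. It is not BigPanda's scoring method, which relies on GenAI and is not described in this article. The `Change` and `Incident` types, the three signals, the weights, and the two-hour window are all assumptions chosen purely for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Change:
    change_id: str
    host: str
    started_at: datetime
    concepts: frozenset  # normalized concepts, e.g. {"restart"}


@dataclass
class Incident:
    host: str
    started_at: datetime
    concepts: frozenset


def confidence(change: Change, incident: Incident,
               window: timedelta = timedelta(hours=2)) -> float:
    """Return a toy score in [0, 1]. Weights and window are arbitrary assumptions."""
    # Time signal: 1.0 when the change starts at incident time, decaying
    # linearly to 0.0 at `window` before it. Later changes score 0.0.
    lag = incident.started_at - change.started_at
    if timedelta(0) <= lag <= window:
        time_signal = 1.0 - lag / window
    else:
        time_signal = 0.0

    # Host signal: did the change touch the same host as the incident?
    host_signal = 1.0 if change.host == incident.host else 0.0

    # Concept signal: Jaccard overlap of the normalized concepts.
    union = change.concepts | incident.concepts
    overlap = change.concepts & incident.concepts
    concept_signal = len(overlap) / len(union) if union else 0.0

    return round(0.3 * time_signal + 0.3 * host_signal + 0.4 * concept_signal, 3)


def rank_changes(changes, incident):
    """Order candidate changes from most to least likely root cause."""
    return sorted(changes, key=lambda c: confidence(c, incident), reverse=True)


# Hypothetical records, invented for this example.
incident = Incident("web-01", datetime(2024, 1, 1, 12, 0), frozenset({"restart"}))
restart_change = Change("CHG-A", "web-01", datetime(2024, 1, 1, 11, 0), frozenset({"restart"}))
patch_change = Change("CHG-B", "db-02", datetime(2024, 1, 1, 9, 0), frozenset({"patch"}))
ranked = rank_changes([patch_change, restart_change], incident)
```

In this toy setup, the restart change on the same host an hour before the incident ranks first. A real system would derive the signals from the content of the records rather than from hand-set fields.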

See how it works

In this example, we can see how BigPanda Root Cause Changes understands context indirectly and derives meaning in a way that identifies causality — even without direct links to the active incident.

You’re handling an incident in BigPanda that includes an alert for a monitoring system that has been reset and a change record with an implementation plan calling for a “shutdown.”

Alert details showing that the system uptime has been reset

Root cause change record with a detailed implementation plan

Although the two records use different language (“system uptime has reset” and “shutdown /r”), BigPanda Root Cause Changes derives the situational context. It uses that context to analyze the incident and deduce that the reset aligns with the Windows shutdown command in the change record, where the /r option restarts the machine. This awareness of context improves the accuracy of root-cause change identification and confidence in its causality, and it shortens the time needed to triage and resolve the incident.
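
The matching in this example can be approximated with a small Python sketch. It uses a hand-written phrase table as a stand-in for the GenAI-based normalization, so it only recognizes the phrases listed and infers nothing on its own. The `CONCEPT_PATTERNS` table, the `normalize` function, and the wording of the change plan are hypothetical.

```python
import re

# Hypothetical phrase-to-concept table. A GenAI model would infer these
# relationships from context; here they are spelled out by hand.
CONCEPT_PATTERNS = {
    "restart": [
        r"shutdown\s+/r",                       # Windows: shut down and restart
        r"\breboot",
        r"\brestart",
        r"uptime\s+(has\s+)?(been\s+)?reset",   # uptime resets after a restart
    ],
    "power_off": [
        r"shutdown\s+/s",                       # Windows: shut down without restart
        r"\bpower(ed)?\s+off",
    ],
}


def normalize(text: str) -> set:
    """Map free text to the set of concepts whose patterns it matches."""
    lowered = text.lower()
    return {
        concept
        for concept, patterns in CONCEPT_PATTERNS.items()
        if any(re.search(pattern, lowered) for pattern in patterns)
    }


alert_text = "System uptime has reset"
change_plan = "Run shutdown /r on the server"  # hypothetical implementation-plan wording

# Concepts the two records have in common despite sharing no keywords.
shared = normalize(alert_text) & normalize(change_plan)
```

Here both records normalize to the same `restart` concept, which is the kind of indirect link the example describes.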

AI-powered root cause change analysis

GenAI-powered Root Cause Changes improves many aspects of incident analysis and resolution, including:

  • Root-cause understanding: Greater accuracy of incident causality gives operators a deeper understanding of what caused the incident as well as confidence in the potential resolution steps.
  • Triage and resolution: With faster root-cause identification, you can initiate resolution steps and bring in supporting teams more quickly, leading to MTTR improvements.
  • Efficiency: High-confidence causality frees operators and responders to focus on incident resolution instead of spending time on tedious, manual searches for potentially relevant information.

Empower ITOps and ITSM teams to resolve incidents more swiftly and accurately. BigPanda Root Cause Changes integrates advanced normalization, contextual understanding, and high-accuracy causal confidence scoring. The feature is available today. Check out a self-paced demo of the BigPanda Advanced Insight module to learn more.