The full story

“Our NOC team is excellent at what they do, but we could never hire enough engineers to investigate every alert manually, particularly on peak traffic days when the business relies on us most. We added BigPanda to our operational tools suite to help us find the right alert before any customers are impacted. We evaluated many products and selected BigPanda because of its modern user interface, tight integration with our ServiceNow ticketing system, and native SaaS architecture.”
– Vismay Thakkar, Gap Senior IT Director

Service excellence requires excellent people, processes, and tools

As the scope of services has increased and the complexity of Gap’s infrastructure has grown, the volume of monitoring alerts has steadily increased, and so has the impact of downtime. Peak periods of demand, like Cyber Monday, place significant load on infrastructure, which complicates capacity and problem management for engineering and ops teams. Unfortunately, what hasn’t increased is Gap’s IT headcount. To meet high service expectations, Gap needed to understand which alerts were actionable, triage operational incidents and identify their root causes, and spot trends to prevent the same issues from recurring. After evaluating many traditional and modern event management solutions, Gap selected BigPanda.

Fewer alerts and improved visibility into infrastructure health

Gap deployed BigPanda to aggregate and correlate monitoring alerts from systems management tools like Nagios, plus log analytics tools like Splunk. Operational incidents are synchronized with ServiceNow tickets so that they fit into Gap’s existing NOC collaboration process. Previously, alerts from Nagios created thousands of noisy ServiceNow tickets, making it difficult for NOC engineers to quickly identify critical issues. Using BigPanda’s native ServiceNow integration, Gap was able to dramatically reduce ticket volume in ServiceNow and keep the remaining tickets updated in real time.

Summarizing related issues for intelligent analysis

Gap used BigPanda to correlate alerts into meaningful incidents in ServiceNow. For example, if there is a problem on the network edge, BigPanda might look at dozens of alerts to determine that a single switch went offline. It then populates one ticket in ServiceNow that summarizes all related issues for that device. BigPanda keeps its incident records linked to the corresponding incidents in ServiceNow, so it knows when to update an existing incident rather than create a new one. This dramatically reduces the total number of incidents. It’s an example of the type of intelligence that BigPanda brings to alert monitoring, correlation, and incident analysis.
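
To make the update-versus-create idea concrete, here is a minimal Python sketch. It is not BigPanda's correlation logic or API, which this story does not describe. It assumes a deliberately simple, hypothetical rule (one open incident per device), and the names `Incident`, `Correlator`, and `ingest` are invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Incident:
    """Hypothetical incident record standing in for one linked ticket."""
    device: str
    alerts: list = field(default_factory=list)
    ticket_updates: int = 0  # number of times the linked ticket was touched


class Correlator:
    """Toy correlator, assuming one open incident per device."""

    def __init__(self):
        self.open_incidents = {}  # device name -> Incident

    def ingest(self, device, message):
        incident = self.open_incidents.get(device)
        if incident is None:
            # No open incident for this device yet: create one.
            incident = Incident(device=device)
            self.open_incidents[device] = incident
        # Otherwise the alert is folded into the existing incident.
        incident.alerts.append(message)
        incident.ticket_updates += 1
        return incident


correlator = Correlator()
for msg in ["ping failed", "port 1 down", "port 2 down"]:
    correlator.ingest("edge-switch-01", msg)
correlator.ingest("web-07", "high CPU")

for device, incident in correlator.open_incidents.items():
    print(device, len(incident.alerts))
```

In this sketch, four alerts produce two incidents instead of four tickets. A real correlation engine would weigh more than the device name, such as time windows and topology, but the story does not give those details.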


Tools integrated:
Nagios and ServiceNow, with Splunk and Chef integrations planned

Gap, operator of nearly 4,000 stores in 90 countries, is one of the world’s most respected retail brands. Delivering service excellence at Gap requires a team of NOC engineers and developers working closely to assure availability and performance across three key delivery channels: retail, e-commerce, and corporate IT.
