Improving Service Availability

Posts about availability problems such as downtime, outages, and missed SLAs.

How a Culture of Sharing Transforms IT Incident Management

January 22nd, 2015 | Blog

Earlier this month at BigPanda we released our new Sharing feature, which allows NOC teams to quickly share active and critical incidents with the right teams and subject-matter experts. BigPanda already gives NOC teams instant visibility into incoming related alerts, so they don’t have to sift through dozens of emails and web pages with every outage or disruption. They can also attach playbooks and time-series graphs directly to BigPanda, which means no more navigating around, combing through bookmarks, or hunting for the right wiki page for that memory issue or the right Graphite link for that misbehaving database host.

Golden Age of Developers = Nightmare for Ops

September 18th, 2014 | Blog

The last ten years have brought enormous changes to production environments, driven by a best-of-breed approach to production infrastructure enabled by open source and cloud. This has been a boon for developers in terms of flexibility and productivity, but it’s also placed a new set of challenges and expectations on Ops.

4 Ways to Combat Non-Actionable Alerts

April 23rd, 2014 | Blog

Many alerts place an unnecessary burden on Ops teams instead of helping them to solve issues. The main problem is that most alerts are not actionable enough:

  • They point to issues that don’t require a response
  • They lack critical information, forcing you to spend time searching for more insights in order to gauge their urgency

A Practical Guide to Anomaly Detection for DevOps

June 26th, 2014 | Blog

Anomaly detection for monitoring has been a trending topic in recent years. While the math behind it is fascinating, too much of the discussion has revolved around histograms, moving averages and standard deviations, and too little around practical applications. For that reason, this practical guide to anomaly detection will attempt to provide an actionable overview of current off-the-shelf anomaly detection tools.

New Relic and BigPanda = #Monitoringlove

July 8th, 2014 | Blog

Monitoring applications in production has never been easier. With only a few lines of code, you'll have New Relic installed and monitoring your application from nearly every angle. When something goes wrong, New Relic will start sending alerts. But then what? (Hint: New Relic and BigPanda together are the answer.)

Getting Started with BigPanda – Incident Analysis

October 15th, 2014 | Blog

BigPanda is an incident management platform for modern IT, NOC and DevOps teams. With BigPanda, you will prioritize and route your incidents better and faster, while vastly improving your team’s collaboration and processes. This is part 3 in a series on Getting Started with BigPanda. This product introduction will help you to get up and running quickly so you can get back to hunting fail-whales and 404 errors.

Getting Started with BigPanda – Assign Incidents

October 13th, 2014 | Blog

BigPanda is an incident management platform for modern Ops environments. With BigPanda, you will prioritize and assign your incidents better and faster, while vastly improving your team’s collaboration and processes. This is part 4 in a series on Getting Started with BigPanda. This guide will help you get up and running quickly and maximize the value you get out of the platform.

Getting Started with BigPanda – Incident Triage

October 17th, 2014 | Blog

BigPanda is an incident management platform for modern IT, Ops, and DevOps teams. With BigPanda, you will prioritize and route your incidents better and faster, while vastly improving your team's collaboration and processes. This is part 2 in a series on Getting Started with BigPanda. This guide will help you get up and running quickly and maximize the value you get out of the platform.