D-level and lower, TDM audiences.
Last week, Google announced several changes to its cloud platform. First, AppScale, the company that provides an open source implementation of Google’s application platform, Google App Engine, is receiving a direct investment from Google in order to accelerate the interoperability between AppScale and Google App Engine. This is a smart move, and it should help developers overcome the app portability issue that is ushering in a new era of vendor lock-in within public clouds.
I met Vlad in the bar in Vegas after a long day of telco NOC drudgery. He was enjoying his whisky and clearly didn’t want to be interrupted by me asking about his datacenter. I could tell he’d rather I had asked about anything else… Cat Stevens, Greek myths, Fabergé eggs. Anything. I interrupted him anyway and asked what’s required to go from the three nines he referenced in his keynote to the five nines his customers demand. He winced in pain. I thought he had swallowed an ice cube or his Johnnie Walker was laced with cyanide. Turns out he was deep in thought. He proceeded to share wisdom that inspired me… to drink whisky and grow facial hair.
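To put the gap between three nines and five nines in perspective, here is a minimal sketch of the arithmetic. It assumes that "N nines" means an availability of 1 - 10^-N measured over a 365-day year; the function name `allowed_downtime_minutes` is made up for illustration and is not something from Vlad's keynote.

```python
def allowed_downtime_minutes(nines: int, period_days: float = 365.0) -> float:
    """Minutes of downtime per period permitted at an availability of N nines.

    Assumption: N nines means availability = 1 - 10**-N
    (3 nines -> 99.9%, 5 nines -> 99.999%).
    """
    unavailability = 10 ** (-nines)            # fraction of time allowed down
    return period_days * 24 * 60 * unavailability


if __name__ == "__main__":
    for n in (3, 4, 5):
        minutes = allowed_downtime_minutes(n)
        print(f"{n} nines: about {minutes:.1f} minutes of downtime per year")
```

Under these assumptions, three nines leaves roughly 8.8 hours of downtime a year, while five nines leaves a little over 5 minutes, which is a hundredfold smaller budget.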
In my last post, I discussed how enterprise application sprawl, if left unchecked, puts organizations at risk. In this post, I’m going to discuss what to do about the problem. Today, any single department within even a mid-market enterprise will have more applications deployed than was standard organization-wide just a dozen or so years ago. These apps include everything from cloud-based CRM to social media tools to AWS workloads to various big data tools to collaboration suites, and on and on and on.
Whether we practice more traditional operations with a 24x7 NOC and well-documented processes, or we’re embracing a DevOps style with cross-functional teams and highly iterative methodologies, one problem we all face is the growing disconnect between our monitoring systems, the alerts they fire off, and the processes we’re using to handle operational issues. We log incidents in a ticket, but are the folks working on that ticket aware of the real-time status of the underlying incident?
Earlier this month at BigPanda we released our new Sharing feature, which allows NOC teams to quickly share active and critical incidents with the right teams and subject-matter experts. BigPanda already helps NOC teams today by giving them instant visibility into incoming related alerts so that they don’t have to sift through dozens of emails and web pages with every outage or disruption. They can also attach playbooks and timeseries graphs directly to BigPanda, which means no more navigating around, combing through bookmarks, trying to find the right wiki page for that memory issue, or the right Graphite link for that misbehaving database host.
We're excited to announce the release of a major new feature in BigPanda called Sharing! As you know, BigPanda intelligently clusters your noisy alerts into high-level incidents. With our new Sharing feature, it's now easy to notify and collaborate with anyone on your team about critical incidents.
Last week was an exciting week. BigPanda announced $7 million in funding from Sequoia Capital and Mayfield. We are super excited that these two firms share our vision for changing the way that IT and DevOps teams manage and respond to the thousands of IT issues they face every day. Last week, we also launched our offering into general availability. Check out some of the highlights from last week’s coverage of BigPanda from TechCrunch, GigaOm, Computerworld, 451 Research and more.
We engineers love measuring stuff. Whether because it helps us solve an immediate problem or gets us ready for a bad day, or just because most of us are information junkies, we love keeping track of metrics. The spectrum of what can be measured is very wide. It can include data from every part of our system: from technical metrics such as disk space or RPM, through UI metrics like page load times, to business KPIs such as revenue, conversion rates and so on. When choosing which metrics to collect, we usually start with the obvious ones: those that reflect the current state of the system (e.g., CPU, memory and load). There are quite a few articles and blog posts about these metrics, so I’m not going to discuss them here. Rather, I would like to focus on metrics that reflect the user experience.
Here are the four metrics that we at BigPanda see as the most important in this category: