Development & Operations. Posts related to both development and upkeep of IT ops, not one or the other.
Enterprise application and computing environments have changed radically over the past fifteen years. Anyone who has spent even a day in an IT role can tell you that. What gets less attention, however, is how those changes undermine the ability of operations teams to do their jobs. The problem is that while computing and application environments have changed dramatically, workflows and org charts have not.
BigPanda is attending its first ServiceNow Knowledge15 event, April 21-24, at the Mandalay Bay Convention Center in Las Vegas. We’re hoping, though, that the relationships we build in Vegas don’t stay in Vegas…
Earlier this month at BigPanda we released our new Sharing feature, which allows NOC teams to quickly share active and critical incidents with the right teams and subject-matter experts. BigPanda already helps NOC teams by giving them instant visibility into incoming related alerts, so that they don’t have to sift through dozens of emails and web pages with every outage or disruption. They can also attach playbooks and time-series graphs directly to BigPanda, which means no more navigating around, combing through bookmarks, trying to find the right wiki page for that memory issue or the right Graphite link for that misbehaving database host.
Ansible is a great automation tool. We use it for server provisioning, application deployments and running maintenance scripts. One problem it does have, however, is how (in)convenient it is to run playbooks as opposed to regular shell scripts. Write and run enough Ansible playbooks, and eventually you’ll get tired of the repetitive typing.
Service downtime is harmful to most technology businesses, especially those that require their services to be constantly available. Downtime has many causes, such as hardware failures and network issues. In today’s web-scale world, application deployment is one of the main causes. This is particularly common in organizations practicing Continuous Delivery, in which developers deploy their code at an unprecedented speed. Since there is always a good chance that new code contains errors, frequent application changes carry a high risk of service malfunction.