The surprising reason why incidents are where engineers are made
A couple of weeks ago, I scrolled past a LinkedIn post that was so eye-catching that I did something unusual. I scrolled back up.

The quote in question, “Incidents are where engineers are made” by Vanessa Huerta Granda (from this QCon talk that I unfortunately didn’t attend), immediately struck me as ‘on the money’, and kicked loose another question in my LinkedIn-addled grey matter. “How is dealing with incidents different from routine IT work?”
Serendipitously, this exact question was discussed in last week’s Uptime Labs workshop (Why Incident Management Breaks Down — and How the Process Should Flow) with Beth Adele Long and Laura Maguire, so I didn’t have to wait long for an answer. Check out the recording for the straight-from-the-source version, but I'll summarise my reflections here.
Also, if you stick around to the end of the post, I’ll reveal what incidents have in common with a certain Luke Skywalker family revelation, and the contents of Brad Pitt’s box in Se7en.

How is dealing with incidents different from routine IT work?
Anyone who’s been employed in the software domain for long enough will have experienced periods of time when routine work felt similar to an ongoing incident. Pressure and stress are by no means the exclusive preserve of incidents. Commercial urgency, unrealistic expectations, technical debt and complexity are merely a handful of the countless factors that can contribute towards routine work feeling like a perpetual incident. This perception can be particularly acute in the run-up to a deadline.
However, though the Venn diagram of incidents and routine work does overlap considerably, there are ways in which working on incidents differs from routine work, and these factors have important implications for how we prepare, learn, build tools and structure teams.
The cognitive demands are qualitatively different
Incidents aren't just more difficult versions of routine work; they impose a different kind of thinking. While uncertainty is a familiar presence in both types of work, incidents are often awash with equivocality (conflicting interpretations). It's common for incident responders to disagree about what’s actually happening, let alone what to do about it. Responders are required to anticipate how a situation might progress, and how that progression might change when interventions are applied. They’re continuously generating, testing, modifying and discarding hypotheses in parallel.
Response teams are frequently “a cast assembled for the occasion”
While routine work commonly occurs within well-established teams with well-established working relationships and practices, incident response teams are frequently ad hoc, composed of individuals selected for their specific expertise, area of concern, influence and availability rather than their shared familiarity. The cast may also include individuals with a wide variety of roles, e.g., senior leadership, legal, compliance and security folks. The combination of equivocality and the difficulty of coordinating between unfamiliar individuals with differing communication needs greatly increases the challenge of incident response.
No single responder’s mental model is complete, correct or shared
This is also true in a broad sense for routine work, but is especially acute in incidents. We humans can only make sense of what’s going on behind software interfaces through mental models, and none of us has privileged read access to anyone else’s. Querying other folks' mental models is achieved through a lossy, unreliable protocol called “communication”, which offers few guarantees of receipt, integrity or error correction. As a result, folks are frequently under the illusion that they’re 'on the same page' when in reality they’re reading different books entirely.
While this is merely a reality of being human, it’s critical during incidents, when multiple parties are grappling together in situations of uncertainty, ambiguity, time pressure and high stakes. The process of 'staying on the same page' is referred to as maintaining common ground: a co-active process that needs to be managed carefully during incidents to keep the cast on track.
Incidents can be astonishing
It's rare to be astonished in routine work, where mental models of work-in-progress tend to build gradually over time. Routine work tends to start with a degree of uncertainty, which is incrementally chipped away, revealing the sculpture beneath at progressively finer degrees of fidelity. Incidents, on the other hand, have a habit of forcing complete re-evaluations of everything you thought you knew about the system. Rather than a chisel tap revealing a fine detail, incidents can cause the entire sculpture to fracture or mutate, forcing a complete change to one's mental model.
As such, incidents are especially powerful opportunities for learning.
The Power of Surprise
Back to the original source of inspiration for this post, “Incidents are where engineers are made”. The complexities described so far that differentiate incidents from routine work are quite sufficient to render incidents especially fertile ground for growing accomplished engineers. However, the power of surprise is worth emphasising further.
To illustrate the point, consider the following questions:
Who was Luke Skywalker’s father?
What was the answer to Brad Pitt’s plea, “What’s in the box?” in the movie Se7en?

Or perhaps a more recent cultural reference: What was Jon Snow's true parentage in Game of Thrones?
If you know the answers to one or more of these questions, you really know the answer. The answers are probably imprinted so deeply in your brain that you’d sooner forget your name than these cultural references.
These iconic on-screen moments are iconic because they are surprising, astonishing or shocking. Few would remember Skywalker's heritage so vividly if the revelation had been mundane.
Surprise causes us to lean in. Surprise causes us to reconsider what we thought we knew; as such, it is a powerful catalyst for learning. If there’s one thing that separates incidents from routine work, it’s surprise. Surprise surprise… it’s not always DNS.
The other important aspect of surprise is its power to create great stories. Story-telling is a primary mechanism through which humans create and share knowledge. This reveals the potential to dramatically scale the learning associated with a surprising incident. While a movie audience learns vicariously from the experiences of characters depicted on screen, engineers can learn vicariously from the surprising stories of other engineers. The more surprising the better, and every incident contains the element of surprise.
Further reading
https://queue.acm.org/detail.cfm?id=3380779 - Laura Maguire.




