Matt Blair

I read that you learn more from a poor example than from a correct one. I don't believe this but that means my site will be a success.

Incident Review

Note: This was a series of talks I gave to leadership in my organization, so some of the comments here might not make sense in the context of a blog post.

This is not original work: I pulled heavily from these four books to put together these talks:


If I did not cite some of these works here, or pulled quotes directly from them, forgive me.

Define Strategic Intent

Develop and communicate a compelling vision for what the organization will become. Outline a strategy for achieving that vision.

Use the SCQA approach to lay out:

  • The situation with the team
  • The conflict the team is facing
  • The question that conflict naturally raises
  • The answer to the question
    • Your solution to get the team back on track
    • These are your A-List goals for the team

Your answers here should be:

  • Specific
    • We’re going to fix XXX
  • Measurable
    • Fixing XXX means we’ll see a YYY success rate
    • You want measurements to be paired
      • A 99% success rate means you’re measuring both success and failure
      • Response time is not a good metric if every response returns a 500
        • Pair your measure with a quality measurement
      • Any measurement is better than none, but the best measurements cover the output of the work, not the activity involved
  • Time-bound
    • This will take ZZZZZ months/sprints/whatever with VVVV engineers
  • Realistic
    • If your engineers say it will take ZZZ sprints, double that
    • Add in time for planning and testing
      • How to plan a project
    • State what you could do with more resources
      • If we increase the number of engineers by WWWW, we can reduce the time it takes to fix by UUUUU sprints
      • Also let folks know if increasing staffing won’t help the issue
    • Take into account holidays/vacations/on-call shifts when planning timelines
      • Build some slack into your scheduling!
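To make the "paired measurement" and "double it, then add slack" advice concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the function names, the `(status_code, latency_ms)` tuple shape, and the 0.8 availability factor standing in for holidays, vacations, and on-call shifts. Treat it as one way to do the arithmetic, not a planning tool.

```python
import math
from statistics import median


def paired_metric(responses):
    """Report latency together with a quality measure.

    `responses` is a hypothetical list of (status_code, latency_ms) tuples.
    Latency is computed over non-5xx responses only, so a service that
    fails fast cannot look good on response time alone.
    """
    ok_latencies = [ms for status, ms in responses if status < 500]
    success_rate = len(ok_latencies) / len(responses) if responses else 0.0
    median_latency = median(ok_latencies) if ok_latencies else None
    return success_rate, median_latency


def estimate_sprints(engineer_estimate, planning_and_testing=0, availability=0.8):
    """Pad an estimate the way the list above suggests.

    Doubles the engineers' estimate, adds planning and testing time, then
    stretches the total by `availability`: the assumed fraction of time the
    team is actually free to work on the project.
    """
    if not 0 < availability <= 1:
        raise ValueError("availability must be in (0, 1]")
    padded = 2 * engineer_estimate + planning_and_testing
    return math.ceil(padded / availability)


# Half the responses are 500s that return in 5 ms. Raw latency would look
# great; the paired view shows a 50% success rate next to it.
print(paired_metric([(200, 120), (200, 80), (500, 5), (500, 5)]))

# Engineers say 3 sprints; add 1 sprint of planning/testing at 80% availability.
print(estimate_sprints(3, planning_and_testing=1, availability=0.8))
```

The point of returning the success rate and the latency together is that neither number can be reported on its own.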

You need to plan like the fire department plans. You cannot anticipate where the next fire will be, but you can shape an energetic and efficient team that is capable of responding to the unanticipated as well as ordinary events.

Pitch your plan to your management. Get buy-in.

You have to negotiate timelines for diagnosis and action planning. You have to be firm when you’re asked to change your plans: spell out what taking on the additional work will cost your team. Clarify expectations for what you can achieve early and often. Tell management what will happen if you do something vs. if you don’t. Never let bad news come as a surprise. Get in front of problems.

Clarify what you’re doing over and over again. Repeat yourself. Let folks know what you’re doing in every medium possible. Assume no one will remember anything you say. Repeat yourself.

Now that you’ve outlined your vision, what next?

You have two options to meet your goals, especially if you’re struggling to deliver.

  1. Hire more people
    • Speaks for itself. In six months you’ll be able to do much more if you have more people
  2. Reduce Work in Progress
    • Start swarming on single tasks instead of having as many tasks in flight as you have people. Reduce concurrent work until you’re able to repay your technical debt. Tactically, the focus here is to help people transition from a personal view of productivity to a team view.
    • Announce that you’re stopping any and all non-essential activities until you meet one of your goals.
    • Either of these will narrow the focus of the team and allow you to deliver the tasks you’ve selected more quickly.

These fixes, no matter which you choose, are slow. You’re fighting months or years of inaction. On the upside, the same thing that makes these fixes slow makes them extremely durable once they’re in effect.
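One way to see why reducing work in progress speeds up individual tasks is Little's Law: in a steady state, average time to finish a task equals average work in progress divided by average throughput. The sketch below uses made-up numbers and assumes throughput stays the same when the team swarms, which is a simplification (less context switching may well improve it).

```python
def average_cycle_time(wip, throughput_per_sprint):
    """Little's Law: average cycle time = average WIP / average throughput.

    `wip` is the number of tasks in flight; `throughput_per_sprint` is how
    many tasks the team finishes per sprint. Both are long-run averages.
    """
    if throughput_per_sprint <= 0:
        raise ValueError("throughput must be positive")
    return wip / throughput_per_sprint


# Hypothetical team finishing 3 tasks per sprint.
one_task_each = average_cycle_time(wip=6, throughput_per_sprint=3)  # 6 tasks in flight
swarming = average_cycle_time(wip=2, throughput_per_sprint=3)       # 2 tasks in flight

print(f"One task per person: {one_task_each:.2f} sprints per task")
print(f"Swarming on two tasks: {swarming:.2f} sprints per task")
```

The team finishes the same number of tasks per sprint either way; what changes is how long any single task waits before it is done.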
