Improvement and imperfection

I had a minor crisis of confidence earlier today looking at some of the things I've done. Most of my impacts could be uncharitably described as, "you took something that was outrageously shit and made it merely kinda bad". But on second pass I don't think this is a problem with the work I do. I think the problem is that the people who do this kind of work don't write about it. This creates a spiral where the only stories are those with perfect happy endings, so anyone whose ending is muddy and somewhat less than ideal feels like they shouldn't share that lest they come across as a colossal fraud who has no idea what they're doing.

This isn't uniquely my viewpoint. I have friends who've talked about writing long and detailed posts about something they worked on, only to delete them because the outcome was, "it sorta worked, but not as well as we hoped."

Reality is messy. Even as a huge fan of Extreme Programming, I encourage reading the full history of the Chrysler C3 project in which XP was introduced, and not just the first year which went so well. (In fairness to Kent Beck and Ward Cunningham among others, their own materials and the c2 wiki are very forthcoming about what went wrong as well as what went right. It's mostly the breathless retellings which omit what happens when you're still trying to figure out what a product manager is.)

What this means is I don't think any of us should feel discouraged when we get from a state where releases are anywhere from 6 to 12 weeks late to one where they are a mere 1-2 weeks late. The hard and messy work needed for that improvement is just as valid and important as the work of people who had better contexts or fewer constraints and could get down to the 0 they were aiming for. More importantly, we need to start telling these stories that end with more still to do. Because the tales that end in perfection are often missing some constraints the rest of us have to face:

  • Team members who've had bad experiences with Scrum/DevOps/whatever and are heavily opposed to it.
  • Entrenched management structures you have no hope of changing and need to work around.
  • The one person who adopts all of your lingo and superficial behaviours, but rigidly sticks to the same core approach throughout.
  • That even if you have the perfect red team and are doing great things, mess is being created around you faster than you can clean it up.

As an industry, we need the stories that include these adverse starting points. The value of my experience introducing Scrum to teams at Priority Pass back in 2016 wasn't the textbook story of, "oh yes, we split our work into sprints and started having daily stand-ups" - it was working out how to interface with an organisation that at the time still wanted all of the classical Waterfall artifacts, how to avoid our internal team metrics being misinterpreted and used against us, how to get the freedom to innovate... hell, even how to get my project manager to trust me.

(And I realise I never properly wrote that story, not least out of shame that the end result was a software version of the classic Little Caesars joke: "It's hot and it's ready." / "Is it good?" / "It's HOT. And it's READY.")

This also helps the people who did have the perfect experience, and are now struggling in a murkier environment. I once worked with a wonderful guy who'd done the perfect agile transformation - with a team who all really wanted to do Scrum, and a senior management who were totally bought in to the idea of trying something different. When faced with a team who liked working in hopeless inefficiency and always having some process-related excuse as to why they hadn't done any real work, and senior management who didn't care, he had a very hard time and didn't have anywhere useful to get advice. Imperfect stories matter.

So this is a plea, and perhaps a little bit of a note to self. When you go through six months of gruelling work to reduce an MTTR metric from 4 days to 2-3 hours, don't think it's pointless to write about because the people currently writing about recovery times are mentioning timescales in the low minutes. A lot of those are coming from organisations that throw huge amounts of effort and resources at improving their MTTR. Your experience might be a lot more relevant for someone who finds themselves in a small team with everything on fire and a total budget for fixing the problem that wouldn't buy the team a BLT, let alone an SRE.

I'm just as bad at this, and most of the time when I talk about the reality and the mess it's in stories of outright failure, but I have managed to publish a couple of stories about taking something that was outrageously shit and making it merely kinda bad.

I should write more of these. You should write more of these. And none of us should feel unhappy that we've only improved things until they were kinda bad - the important part is we improved them.