Semi-anonymous Internet idiots like me love telling you all these things you should be doing about team structure, organisational approach and Really Exciting Technology but we're often a bit light on what happens when you actually do these things, and your self organising consensus driven team hits the switch for the automated Terraform apply that's gonna power your entire business on AWS Lambda.

Here's some balance. There's a few people talking about the idea that maybe we're chasing the wrong thing trying to chase rockstars. Maybe rather than "building" teams with the implied air quotes, we should really be building them: supporting people, mentoring them, helping them to build their skills and learn to value the things our organisation values. And well... I kinda did that a year or two ago. Here's how it went down.

Context

At the start of this amorphously-defined time period I was getting a bit of a reputation as a recruiting troubleshooter in the highly anonymous (and impossible to discover within five minutes of LinkedIn stalking) organisation I inhabited at the time. This is a roundabout way of saying I clearly didn't have much on and so therefore could always be relied on to make up the numbers for an interview panel.

Why was this unknown organisation hiring? Well, we had a big problem with our mobile app. It was getting hammered in app store reviews and generating a lot of customer complaints. Problem was, even if we built a fancy new one from scratch we'd still be connecting it to a horrific ball of mud that would take several minutes to decide whether you were getting an error, or 40MB of XML that'd take half the day and most of your battery to deserialise. People preferred the error.

Most of the team were busy fighting all the fires that tend to crop up when your services communicate by building the biggest ball of XML they can and throwing it at each other with no backup plan, so we needed to build a new team to create the APIs that would power our next generation of mobile apps. A team who understood networks, clouds, API development and a little bit about mobile on a visceral level. When we started, this was not something our hiring pipeline was ready for: it was geared up to provide people who'd done 15 years of purgatory in horrific enterprise environments and didn't complain too much when you told them they needed to use three different timesheet systems.

As I later wrote, I was getting fed up with interviewing people who had half a dozen "senior super awesome architect developer guru" positions on their CV and demands somewhere in the region of £70k/yr who went into a practical test and were like, "derp what's a for loop", "herp I'd just do this with a 300 line switch statement in real life" and "how dare you have the arrogance to demand that I, a programmer, should do some programming as part of my interview for a programming job". This was annoying me in a world where keen and highly skilled graduates find it hard to get a job because they don't have 16 years of experience in somehow never writing a program that contains a control flow structure.

What Happened Next

I found out that one of my regular co-interviewers had many of the same frustrations. Together we started a campaign of gradually beating down everyone involved in the pipeline to the point we were allowed to include people who didn't have decades of experience, but did have a lot of enthusiasm and willingness to learn.

Of course in the real world these things take more than a montage and a slice of early-'80s power pop soundtrack to happen. So we took the least-worst candidate (the only one who could actually program) and threw him in with a bunch of pirates and anarchists we'd managed to assemble from personal recommendations, contractors and people poached from elsewhere in the group of companies.

However, we did something interesting. We didn't go, "ooh, you're from a big enterprise C# code factory that spends all day churning out boilerplate code, you'll want all the boilerplatey code tasks." We went, "here are some AWS credentials, here's our Terraform repo, here are some bits from the last retro nobody's picked up, and by the way you're running next week's daily scrums". And you know what? The guy we hired, who'd spent most of his life toiling in thankless code galleys, absolutely crushed it. I've hardly ever seen someone uncover such a joy of learning so far into a career. We had some challenges, and I think (name redacted, who will clearly still be able to identify himself) had a hard time overcoming some really deep-seated assumptions about How Things Work - but we talked about these openly, and I tried my best to be empathetic in these discussions.

We were very lucky, though. One of the problems with our hiring pipeline was that it wasn't just delivering a certain type of candidate who was considered "fully formed": they were expecting us to consider them fully formed too. If we brought people on from this pipeline and asked them to join a team that had a very different set of values, it didn't work for anyone involved. So we went back to trying to beat everyone down into considering a different type of candidate. And we got another piece of luck: there were some new and enthusiastic people dealing with our recruiting, who thought what we were suggesting was batshit insane but they were so fed up with disastrous phone screens that they were willing to give it a try.

Team XP was about to be born. (Those who lived this story will know how close that supposedly-anonymous designation is to the actual team name. What can I say, there's a reason I end up with so little work on I can spend entire weeks interviewing people.)

Creating Team XP

Something which tends not to get mentioned in a lot of these blogs about how you should adopt Spleen or run all your applications through 555timr is the impact on the existing team, unless it's one of those high-minded articles on, "How To Introduce Northern Soul To Your Team Process In 7 Easy Steps". Changing the kind of people who are going to be rocking up to interview is pretty fundamental, and you need to have some conversations about it. Conversations where you might not get what you want.

Therefore, the first thing for me to do was have a rather humble talk with the guy who was going to lead Team XP. If he wasn't down for this, then it should not and would not happen. "Start with the good stuff," I said. "Use that to introduce the ideals of the Wigan Casino scene. Then introduce the classics that everyone should know."

Oh wait, wrong universe. What I actually said was this:

"We're thinking about bringing junior people in. What do you think? If we do this, it's going to blow up your team. It'll kill your productivity. You're going to have to learn how to onboard people who've never worked in an environment like ours, possibly haven't worked in any environment at all. It's hard work, but it's also going to give you a lot of new skills and help you understand the ones you already have."

He enthusiastically agreed, interviewed most of the candidates, then went on a long holiday and completely forgot about both the conversation and the interviews, demanding to know why there were suddenly three new junior devs in his team. Y'know, maybe every once in a while I question my faith in the "pirates and anarchists" approach to team formation...

About those interviews

Our first interviews with candidates from the "inexperienced and enthusiastic" pool were utter car crashes. We kind of shot ourselves in the foot by the fact our budget didn't stretch further than trying to dredge an interview laptop out of the Thames, so the first couple of candidates were trying to write C# on a shopping trolley or a discarded hire bike with a missing wheel, but even once we had a functioning laptop with working copies of Visual Studio and VS Code installed it was still pretty bad. After so many years of interviewing people with 10 years or more in the game, we simply didn't know the right questions to ask or what kind of responses we'd get.

We tried to formalise things with a score sheet, but in reality this was little more than a crutch to let us pretend we were being in the slightest bit competent and interviews were decided with statements such as:

"I liked the bit where they installed Visual Studio on the interview laptop we'd failed to install Visual Studio on. That was good."

"I'm not sure someone who responds that way to being given no mouse, then a broken mouse, then a wireless mouse which runs out of battery after 5 minutes is going to cope with our organisation."

"Did we all somehow forget to ask anything useful? Eh, they're keen on Python and have a GitHub, it'll probably be okay."

Three people got offers of employment before we had the slightest clue what we were supposed to be doing in the interview room, on the strength of their ability to work with us and not be too fazed by our apparent incompetence. I don't want to give too many spoilers for the rest of the article here or (worse) encourage anyone to turn shambolic disorganisation into their formal interview process but they were all awesome. As I may have said before, look for people you can work with.

The inevitable bit about managing millennials

Within a short space of time we'd achieved our aim of blowing up the team, stressing out our team lead and destroying all productivity. What was interesting is that while this felt like a lot of hard conversations, coaching and occasional commiserating in the nearest wine bar, it was a very short amount of time. The team was productive quickly for any newly-formed team, let alone one without much experience. A lot of this was down to how we managed it.

This is where the inevitable topic of the moment comes in: yes, most of the new team members were either late millennial or early Gen-Z, and I know there are about a billion slides and videos telling you how you're supposed to manage this cohort. So what special millennial-focused thing did we do to manage our team?

Absolutely nothing.

We didn't care how old people were (or weren't). If anything we had more of an anti-strategy than a strategy, but if I had to codify it then I'd go for something like this:

  • Assume people are functioning adults capable of independent thought
  • Explain the problems you're trying to solve
  • Openly and honestly share the constraints you're operating within
  • Provide a safe environment for exploration and experimentation

I think these would benefit from a bit more detail about how we approached them, because I quite frequently see people talk about these same things and then utterly fail to apply them.

Assume adulthood

Yes, your copy of Different Class may be older than some of the people in your team. But guess what? That's a 24 year old record. Someone born after Disco 2000 faded from the charts has still negotiated with a landlord at the sharp end of the housing market, budgeted their monthly finances, and prepared several thousand meals that didn't come from the parental kitchen, the school canteen or the neon-lit temple of Tennessee Chicken, Rib and Pizza. Sometimes without even setting off the smoke alarm.

What's your justification for telling these people they can't be trusted to solve coding and organisational problems by themselves? I know those of us getting at least some shelter from the umbrella of Generation X and beyond like to talk about life experience and knowing how organisations work, but a lot of that is less useful experience and more unhelpful false assumptions. Most of this is just basic reasoning skill. It doesn't need to be carried out by people who if left to their own devices will eventually listen to a Dire Straits record. (My excuse is that one of the dogs really likes So Far Away.)

Most of the received organisational wisdom that supposedly "junior" people can't solve problems is more that they never bothered to ask. Or didn't like getting a good answer.

Explain the problem

This is another thing which seems unexpectedly difficult. So let me be clear about this. A problem is not, "I need this thing programmed using the C# 5 async and await operators to allow it to scale beyond I/O limitations". A problem is, "this is going to see 300 requests per minute throughout the day" or "some of our customers will be in China behind a national firewall unable to access instances in eu-west-1".

I think this relates to the previous point, in that one of the things you do pick up from years of painful corporate experience is what someone actually means when they tell you to use a particular language feature or do things in a certain way, even though everything you've learnt so far tells you that's a stupid idea. But while it takes a while to learn how to decode a statement 2 or 3 steps removed from a problem into the actual problem, almost anyone can solve a problem if you take away the indirection.

In fact, this was one of the ways in which Team XP was better than a more battle-hardened team, because they hadn't built up the expectation that they'd have problems explained to them in terms of a potential (and likely quite wrong) solution.

Of course, things weren't always perfect and sometimes the team would come up with a solution that was about as appropriate as making a wedding reception playlist from the collected works of Grazhdanskaya Oborona. The important thing we learnt here was that getting into arguments about solutions didn't help anybody learn anything. Instead, asking questions like, "will our extended family dance to Долгая счастливая жизнь?" and letting the team work out the answer is "probably not" was more effective long-term, even if it was time-consuming initially.

(I still have flashbacks to the whiteboard session in which we worked out how we would extract small, shippable deltas from the 40MB XML endpoint of horror. But we got there eventually, and it gave us one hell of an interview question to ask.)

Share constraints

One of the areas where decades of experience are useful is understanding organisational constraints: that sense of what can or can't be done, and what of the things people say can't be done is actually malleable.

But do you know how many people with this experience you actually need on the team? One. Or hell, even none if they're interacting with a self-aware stakeholder who can provide this guidance. Because one of the wonderful things we've invented on our long journey from single-celled organisms is this idea of communication. You don't need everybody to have 20 years of painful memories about what you can and can't do in a company: you just need one person who can share the results of that pain.

With Team XP this was an extension of sharing the problem: because one of our wider goals was challenging organisational norms, part of our problem statement would inevitably be things like, "these are the things we can get Information Security to sign off on, these are the things we might be able to discuss, and these are the red lines that aren't worth even attempting to talk about" or, "how can we do this when our product organisation doesn't understand this on any conceptual level?"

I think it helped a lot that we talked about why the constraints were there, and how we'd inflicted them on ourselves as an organisation. Also that when we discussed the more people-focused constraints, we did it with a fair amount of empathy. We encouraged the team to think about why people did the things they did rather than write them off as idiots and incompetents. Well, mostly.

(Again, this was one of those things where lack of experience turned out to be useful, because the team hadn't formed a bunch of Dilbertesque prejudices about the workplace.)

Provide a safe environment

As with any team things could (and did) go wrong. Everything from the usual missed sprint goals and misunderstood tickets, to blowing up our entire back end because we connected something that scaled effortlessly to something that... didn't. However, we also knew that one of the wider problems in the organisation was teams who were stuck in ineffective patterns due to being unwilling to experiment, because they were the ones who got the blame for it.

What we needed were two types of safety:

  • Technical safety: When things went wrong, they needed to go wrong in an environment where it didn't matter.
  • Psychological safety: There would be no penalties for trying something new, and having it not work out.

The latter may sound like a recipe for disaster, but it wasn't, because our discipline came from a different place. While we were clear that failure was acceptable, we also made it clear that having something fail and then not learning from it was not. In other words if things broke and they did not result in documentation, tests, or changes to team behaviour then everyone was going to have a bad time.

Technical safety was simply the classic case of shifting left. It's safer for something to break in a test or develop environment than on production, and it's way safer for it to break in a unit test before the code has gone anywhere. This also meant faster feedback cycles, which are very useful when people are still learning about things.
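For anyone who wants that in concrete terms, here's a minimal sketch of the "break it in a unit test first" idea. To be clear, none of this is code we actually wrote: the split_into_batches helper, its parameter names and the numbers are all made up for illustration, and I'm assuming pytest-style test functions. The scenario is the one above, where something that scales effortlessly feeds something that doesn't, so the test pins down the limit before anything gets near a real environment.

```python
# Hypothetical example: the helper and its limits are invented for
# illustration and are not taken from any real codebase.

def split_into_batches(items, max_batch_size):
    """Split items into lists that never exceed max_batch_size.

    The idea is to stop a fast producer from handing a slow
    downstream service more than it can cope with in one call.
    """
    if max_batch_size < 1:
        raise ValueError("max_batch_size must be at least 1")
    return [
        items[start:start + max_batch_size]
        for start in range(0, len(items), max_batch_size)
    ]


def test_batches_never_exceed_the_downstream_limit():
    batches = split_into_batches(list(range(10)), max_batch_size=4)
    assert [len(batch) for batch in batches] == [4, 4, 2]


def test_nothing_is_lost_or_reordered():
    items = list(range(10))
    batches = split_into_batches(items, max_batch_size=4)
    assert [item for batch in batches for item in batch] == items


def test_a_nonsense_limit_fails_loudly():
    try:
        split_into_batches([1, 2, 3], max_batch_size=0)
    except ValueError:
        return
    raise AssertionError("expected a ValueError for a limit of zero")
```

If one of these fails it fails in seconds on someone's laptop, which is about the cheapest place for a scaling assumption to turn out wrong.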

Of course, this article is about what actually happened rather than Internet idealism. So in reality we found it hard to get everyone to care about testing to a level where we were catching problems early on, and we had a lot of problems with our dev and test environments not really being representative of our production one, leading to low levels of confidence that something would work once it was fully out the door. What helped a lot was late in the team's existence, when we got a very experienced tester who was very good at playing "grumpy old man" when testing didn't meet the standards it should have met.

Before that I don't think we ever cracked technical safety properly. It wasn't necessarily "good" even at the end, but at least we'd achieved a level of moderate okay-ness. But we still never let this (and the consequence of things going wrong because of it) affect our dedication to psychological safety. As a result of this, most of the major problems were identified and learnt about early on, and while change failure rate was never perfect it was a lot lower than typical for the rest of the organisation.

First Reckoning

So we did these things, and the team started establishing working norms and beginning to deliver things. Exactly how always seemed to be a popular topic, as any time you went near the team area it looked like uncontrolled chaos. I remember a quite senior colleague coming over to ask me a question while we had the whole team mob programming a problem, and later saying, "this is the only set of desks in the building where I have no idea what is supposed to be going on". Sometimes I'm not sure I did either, as there very rarely appeared to be any actual coding happening, but we seemed to be shipping APIs to clients, getting more and more of the mobile app working and committing many PRs to the central Terraform modules repo so I didn't investigate too deeply in case it all stopped working.

Unfortunately a lot of the people Team XP interacted with (and a fair amount of the team itself) were contractors. This had helped a lot in that it had given us a good scrum master, product owner, project manager and API architect who were willing to go along with the experiment and help refine it. However, this also meant we had an inbuilt time limit on their contribution. Once the mobile app was released, contracts started ending and not being renewed.

Even at the point we were celebrating positive reviews and great customer feedback, the foundations were falling apart. Within the space of a few months the team lost nearly everyone connecting it to the rest of the product organisation. Boards were stacked with tickets that weren't being worked on, sprint goals failed to be defined let alone achieved, and technical quality began to suffer.

What we'd failed to do was create a team that was resilient. It could operate effectively if all of the people it needed were in place, but it couldn't cope with the loss of them. Had we hit the limits of what could be done by giving support and training? Would we always need an expert scrum master to keep things on track?

I realised much later that we'd made a mistake. Sure, we'd carefully explained what a scrum master did, or what a product manager did. The team understood these things, and one of the reasons it fell apart so quickly was it knew when they weren't being done properly. (Or done at all.) They expected things in their process to happen, and when the things stopped happening so did the process.

The thing we'd unwittingly done wrong was failing to explain to the team why a scrum master did the things they did. Or why a product owner had the responsibilities they did. Without that knowledge, Team XP came down with a bad case of Enterprisitis: cargo-culting their way through rituals and getting the people they interacted with to perform ineffective facsimiles of them, without receiving any of the benefits. Worse, now those people were gone we didn't have the drive and the pressure for excellence to get the team to care about the why - attempting to bolt it on after the fact wasn't working. All we had was a team who knew things weren't being done properly, but couldn't figure out how to correct them.

Recovery

At the lowest point, we hit some luck. One of the contractors we'd worked with returned for another stint, and suddenly the team had a very good PM again. More importantly, it had a very demanding PM, who wasn't happy for performance to slip from where it had been during the good times. So one week we wheeled out a long-abandoned physical board, stuck every ticket to it (about 3 or 4 deep in places) and had the very blunt conversation:

"Every one of these cards is something our customers or clients want. Some of them have hard deadlines coming up. Some of them will put contracts under threat if not done. Our job is this: safeguard contracts, meet the promises we made to our clients, and make our customers happy. All the time this board isn't clear, we haven't done that."

In the company environment at the time, this was somewhat radical. Teams were generally shielded from blunt commercial reality by their arms-length interaction with "product", which they would then be further isolated from by their scrum masters. During the first iteration of Team XP we had been a lot more demanding about following good Scrum practice and getting regular feedback, but even so we'd treated stakeholders as an arbitrary group who just turned up and had opinions on software. We'd never given a team such a blunt picture of the commercial reality in which they operated.

This comes back to that point about "assume adulthood". Nobody is going to panic upon discovering that the company they work for needs to make money in order to survive, and sometimes it makes that money by making ambitious promises its competitors can't match. With Team XP, the opposite happened. Being able to deliver on those promises galvanised the team. It didn't need a skilled scrum master or product owner to be effective. All it needed was pressure to deliver products and features. (This will become important. Remember it.)

Did It Work?

The Disney version: yes. If you cut the story off somewhere around mid to late 2018 with a team that was crushing its way through the backlog faster than it could be populated, this was a colossal success story.

But I'm too honest to cut the story off there. And unfortunately, this happy non-ending came at the cost of a few things:

  • Building the team's skills was making them very employable in a competitive market.
  • Our focus on psychological safety and trust in the team meant we ended up with a high level of both technical diversity within the team's products and technical divergence from the rest of the organisation. This is an inevitable consequence of letting people trial things: if those things work, they get adopted.
  • The team's resilience model depended on experiencing continual pressure for features and product. This was a deliberate structure: technical and process choices were made in a framework of, "we do this so we can deliver things for our customers". It seemed a reasonable assumption that there would always be pressure to do new stuff, but it was not a safe one.
  • The way the team operated didn't make sense for most people in the organisation. PMs and product-focused people found it hard to go from command-and-control to explaining why things were being done and the context around them. More engineering-focused people struggled with losing the safety net of not being accountable for their decisions.

This last one was the cause of our first problem. We couldn't scale this model once existing employees were involved. The problem was that to get someone to a point where they could be trained and mentored, they first had to unlearn a whole load of existing hangups. The only reliable way to do this was to immerse them in the Team XP environment until they figured out how it worked, and started to internalise the standards required. But this was a slow process, as it required putting someone in the immediate Team XP ecosystem for several months, and you could only do that with one or two people at a time.

This meant that while we'd made the idea work in one isolated place, we hadn't embedded that operating model and culture across the organisation.

Maybe this would have been fine, but we had another external pressure. The Team XP experiment was operating in the shadow of another similar experiment to improve Totally Anonymous Organisation's software development process by creating a best-in-class team to redevelop a small piece of functionality, before spreading that culture to the rest of the organisation. And this far more expensive and higher-profile attempt than our grassroots team was failing. Very expensively, and at a high profile.

So the org embarked on Transformation Mark Three. The problem in the context of Team XP was that previous transformations were deemed to have failed for two reasons:

  1. Technical diversity creating "islands" of technology nobody else understood how to operate or maintain.
  2. Isolated team processes and cultures which could not be easily shifted to other teams.

Transformation MkIII would be underpinned by a high level of technical and process standardisation. This would introduce two mechanisms that were at direct odds to the way we worked in Team XP: mandating technology choices down to the internal structure of individual solutions, and a gradual but inexorable adoption of SAFe.

Surprisingly, the technical standardisation was not as big a problem as I expected. Team XP largely treated it as they had any other internal organisational constraint. It frustrated them, they wanted to pick at the edges and see what could be compromised on in the name of productivity, but ultimately they were willing to do it if someone would accept the hit to productivity of adapting and rewriting. (This would ultimately become the main sticking point: nobody seemed to want the impact of technical standardisation on a team's productivity measured and inspected over the long term. Strange, that.)

The real killer was SAFe. I know "find the customer on the SAFe chart" is a bit of a meme but seeing it in action was fascinating. Once the team lost their direct connection to someone who really cared about getting features to clients they... did okay. This really surprised me, but it turned out that shorn of external pressure to release features, the team would start to generate that same pressure internally. They would effectively productise internal improvements and logical extensions to functionality, prioritise them according to what brought the most value, and get them released. Sadly technical standardisation never seemed to win the "most value" competition. So stakeholder isolation may have hurt, and it may have shifted focus from external to internal, but it didn't kill the team. That honour went to the release train.

SAFe's release train is predicated on the idea that you want to release with a relatively low maximum frequency, and that items which ought to be part of the definition of done must take place outside of the sprint cycle, out of the team's hands if at all possible. And this is the thing which finally killed off the morale and dedication in Team XP: not only were they developing features isolated from any cultural pressure, those features weren't even getting released due to the train being derailed by other parts of the technical estate completely outside of their control. The pressure to do things was gone, and since that underpinned our resiliency model it wasn't long before everything else started collapsing in turn.

At which point we hit the first problem on that list. Team XP's members were now skilled, independent thinkers who were perfectly capable of getting jobs somewhere else. The company had failed to ensure their salaries kept pace with their personal development, so for most it was an easy decision.

A failure, then?

No. We built a great team, solved a lot of problems, turned a 2-star mobile app into a 4-star one and amassed a huge amount of operational knowledge using Terraform and AWS Lambda in production. We did it for a lot less money and a lot more effectively than most of the concurrent efforts in the same firm. On a personal level, I saw several people go from relatively dull "junior developer" roles to doing interesting things with interesting technology in some good companies - thanks in no small part to their time in Team XP.

If there's any lesson about things not to do, it's that SAFe is a shitshow - but you're here reading my writing so it's a 99% given you already know that.

I guess in terms of a conclusion that's a bit more intellectually rigorous than pointing and laughing at utterly cringeworthy train-based metaphors, it's that mentoring and training your team from humble origins does work, but as I keep banging on about, you need the whole environment and not just this one silver bullet if you want to actually be effective. And on a meta level remember that even people who write these authoritative-sounding articles and trade on making suspiciously cogent points at industry round table events make a load of mistakes and are still trying to figure it all out as we go along much like anyone else.