Avoiding growth by accretion

Matt Kimber

25 Apr 2018

Software has a tendency to grow by accretion - a gradual build-up of material causing what was a small and simple nucleus to become a large, complex object with many layers. It's the same mechanism by which planets form. A small clump of particles attracts other particles, and this new larger lump attracts even more particles, until eventually it has consumed all of the particles available within its sphere of influence.

If software is allowed to continually accrete, a similar situation arises. Every line of code or piece of infrastructure is a liability that takes time and effort to service. Left unchecked, a piece of software reaches a point where it will grow no further, because it has consumed all of the effort and resources available within its sphere of influence merely to keep it going.

Why accretion happens

Nobody sets out to do this. It's merely that as software developers, we tend to follow a certain path:

We've encountered problem 1… but we've come up with a really innovative way to solve it: Solution A.
We've now got problem 2… but if we add Solution B, that solves it in a really elegant way.
Now we have problem 3… but we've done something really clever in Solution C.

And so on until problem 26 is being solved by Solution Z, then onward beyond that.

What we're not so good at is going back some time around Solution Q and realising that, as well as solving problem 16, Solution P would also have solved problems 1, 2, 4, 9, 11 and 14; that in hindsight we could have adapted Solution B to cope with problems 12, 15 and 17; and that, since the market has changed, we no longer have problem 3 or problem 7 at all. Even when we do realise this, we conclude that unpicking the messy interaction between Solutions B, G and H is going to be too much work for delivering one small feature.

Avoiding accretion with iterative minimal design

What needs to happen is a team approach where you're always asking, "does this approach render any of our earlier approaches obsolete?"

Of course, it's easy to suggest that should happen. What's not so easy is how. Which is why I recommend that as a team, every once in a while you get together and answer the following question:

"If we were to build this system again with the aim of making it as small and simple as possible, what would we build?"

This is only one question, but it has a lot of interesting sub-questions:

"What would we throw out that we currently have?"
"What would we combine?"
"What would we solve with a different tool or approach entirely?"
"What requirements would we ask our product owner to push back on, because they add too much complexity for too little value?"

Answer those questions. Make a design for that simple system. Consider how it is different to what you currently have. Then… don't build it.

This is really important. I've been on teams who decided to abandon ship and build that minimal second system from scratch. It ended up accreting just as much mass as the original, only in different ways. Or it was minimalistic, but didn't do any of the things it needed to. Sometimes it would overcome both of these hurdles only to be consistently outperformed by the monster it was supposed to replace; years of performance optimisation driven by production knowledge is powerful stuff. But most often of all, it simply never got finished. Rebuilding from scratch takes time, usually time you don't have. Which means the solution isn't rebuilding, but refactoring.

See, your minimal design isn't a blueprint for construction, it's a blueprint for direction. Whenever you make a change to your software, you look at it and ask, "is my change taking me closer to that design, or further away from it?" For example, if my minimalistic design said that we could do everything in a single graph database, but my proposal is to add another NoSQL tool to the three I already have, I know I probably shouldn't do that. But if I'm proposing adding a graph DB to replace two of the existing data stores, that starts to feel like I'm on the right path.
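As a rough illustration of that "closer or further away?" check, here is a minimal Python sketch. It assumes a system can be described as a plain set of component names, which is a simplification. The component names and the two helper functions are hypothetical, invented for this example.

```python
from __future__ import annotations


def distance_from_design(components: set[str], minimal_design: set[str]) -> int:
    """Count components that are surplus to, or missing from, the minimal design."""
    return len(components ^ minimal_design)


def moves_closer(current: set[str], proposed: set[str], minimal_design: set[str]) -> bool:
    """True if the proposed system is nearer the minimal design than the current one."""
    return distance_from_design(proposed, minimal_design) < distance_from_design(
        current, minimal_design
    )


# Hypothetical names: the minimal design says one graph database is enough,
# while the current system runs three NoSQL stores.
minimal_design = {"graph-db"}
current = {"nosql-a", "nosql-b", "nosql-c"}

# Proposal 1: add a fourth NoSQL tool alongside the existing three.
add_fourth_store = current | {"nosql-d"}

# Proposal 2: add a graph DB that replaces two of the existing stores.
replace_two_with_graph = (current - {"nosql-a", "nosql-b"}) | {"graph-db"}

print(moves_closer(current, add_fourth_store, minimal_design))        # False
print(moves_closer(current, replace_two_with_graph, minimal_design))  # True
```

A real system won't reduce to a set difference this neatly, but even a crude count like this makes the direction of a change something a team can discuss concretely.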

The next point: this is an iterative process. Every few weeks you produce a new design for the minimal system. Odds are that the changes you made moving toward the first minimal design taught you some new things. Something you thought you could throw out turned out to perform a function nothing else can perform. Something you thought was critical turns out to be pointless. Minimal System Mark Two might look nothing like Minimal System Mark One, and that's fine. Products and their requirements change, so if the minimal representation of a product remained static we'd have to take that as a sign something has gone wrong.

Another benefit is that this design process is a learnable skill. The first time you design a minimal version of your system you'll spend all day arguing to remove two method calls and a logging framework. You may even be so unused to reductive thinking that your "minimal" system has more things in it and greater complexity than your existing one. It doesn't matter: you'll get better, especially if you challenge yourselves as a team to measure and continuously improve how quickly you can create the minimal design, how many things it manages to remove, and how few arguments you have along the way. This doesn't just benefit minimal designs; it builds skills you use in all design activity.

Using lightweight decision logs

As you adapt your system toward a minimal design, you'll often find yourself doing the following:

Removing something only to realise there was an obscure, non-obvious but very important reason why you had it.
Looking at something and thinking, "I have no idea why we have this or even what it's supposed to be doing".

When you let software grow by accretion, one of the pressures you avoid is needing to know why something is there. You simply assume that all the layers already present make sense in their own way, and add your layer on top. However, if you're regularly moving pieces around, you need to know which of them are structural.

There are many solutions to this but the one I like is to keep a lightweight decision record in your source code repository. Whenever you add (or remove) something, add a line to a DECISIONS.md file stating what you did, why you did it, and anything that may be of use to someone who has to revisit that decision. If new information comes to light then go back and edit that line. The bonus of doing this in source control is you don't need a massive document with tracked changes or half a dozen columns to show the last updated date - all that will happen automatically. For any given decision you can find what branch it came from, when it was taken, and if any more information has been added since then.

Thus, for a mature project you might have:

Added ElasticSearch to handle the more complex queries we introduced for STR-197.
Removed QuuxDB. ElasticSearch can also hold the data we've been replicating there.
Removed our hand-built caching layer. We only needed it for QuuxDB queries.

And so on. It only takes a few minutes at most to maintain, but it can save you hours lost to a bad refactoring decision.
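Editing the file by hand is perfectly adequate, but if you'd rather script it, a helper along these lines would do. This is only a sketch: the `record_decision` function is hypothetical, and the example writes to a temporary directory standing in for a repository so that it can run anywhere.

```python
import tempfile
from pathlib import Path


def record_decision(log_path: Path, what: str, why: str) -> str:
    """Append a one-line decision (what was done, then why) and return the line."""
    line = f"{what.strip()} {why.strip()}".strip()
    # Append mode creates the file on first use and never rewrites old entries.
    with log_path.open("a", encoding="utf-8") as log:
        log.write(line + "\n")
    return line


# A temporary directory stands in for the repository root in this example.
with tempfile.TemporaryDirectory() as repo:
    log_path = Path(repo) / "DECISIONS.md"
    record_decision(
        log_path,
        "Removed QuuxDB.",
        "ElasticSearch can also hold the data we've been replicating there.",
    )
    record_decision(
        log_path,
        "Removed our hand-built caching layer.",
        "We only needed it for QuuxDB queries.",
    )
    entries = log_path.read_text(encoding="utf-8").splitlines()

for entry in entries:
    print(entry)
```

Once the file is committed, `git blame DECISIONS.md` shows which commit last touched each line and when, and `git log -p DECISIONS.md` shows how the entries have changed over time.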

The end of accretion

This process of continuously creating designs you're not going to build, moving a little way toward them and then creating a completely new design before you've got there may seem odd. But it's a pattern replicated across many web-scale companies: Spotify, Netflix, Amazon and Google do not run on the same code they started with, but none of them stalled their business for a "stop the world" ground-up rewrite along the way.

It might also seem strange to put so much effort into building things only to later throw them away. But you can be proud of something without needing to keep it around in your codebase forever. Many of the developers I've most respected have been net deleters of code: as they worked with a codebase, they reduced the amount of code in it by identifying where things could be combined, made generic or removed. The benefit was that simplicity increased over time. It was preferable to open an old codebase, because it would be leaner and easier to understand than something we'd only just written.

That's the ultimate benefit of avoiding growth by accretion in your software. You avoid reaching the point where you've accumulated so much material, and with it so much liability in time and effort, that there's nothing more available in your sphere of influence and further development is impossibly painful.