If there is one thing I wish this industry would understand better, it’s the cost of adding complexity.
We train our analysts to describe what to build. We teach our managers to budget the cost of building features. We give them zero tools to understand the cost of maintaining the complexity those features add.
Software developers and architects are not doing much better. They are often the drivers of this incremental “maintenance creep”. Adding a custom caching layer. Going full-blown microservices.
Along the way, all of us keep making our systems more difficult and expensive to maintain.
We are used to estimating Story Points as a measure of “building cost”, but we have no language for operational costs. If we could come up with a system of “Maintenance Points”, we might make better decisions.
Let’s say we want to get a car. We’re mainly going to use it for our 40 km daily commute. But what if we have to move a lot of stuff? We would also like to use it to go on holiday with six people. Our analysts go to work and they decide that we need a van. It’s more expensive than a regular car, but hey, it’s an investment and it will satisfy all our demands.
Of course, it consumes more fuel. Of course, we can’t park it in our garage. Of course, maintenance is more expensive. Of course, we never go on that holiday with six people.
We understand this problem and we can all come up with a good solution: buy a small, cheaper car and rent a van when you want to go on holiday.
But that means saying No during the build phase, and saying No requires good arguments.
Let’s get a bit more technical. Let’s say we want to set up a cache. Let’s say we want it to be long-lived. Let’s say we also want it to invalidate in real-time.
This gives us a lot of added complexity. Caching strategies are notoriously difficult to get right, and you’ll end up with all kinds of edge cases that require specific features.
We put our analysts to work and they come back with a perfect, custom-designed, event-driven, real-time caching system. It will be expensive to build, but hey, it’s an investment and it will satisfy all our demands.
Of course, it will serve stale data. Of course, we will “forget” to implement it in the next part of the system. Of course, the events will get out of sync. Of course, we never need a real-time solution.
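To make the “small car” version of this concrete, here is a minimal sketch of a cache that simply lets entries expire after a fixed time instead of invalidating them through events. It is an illustration, not a recommendation for any particular system: the `TTLCache` class, its `get` method, and the `loader` callback are hypothetical names, and the sketch assumes that data which is stale for at most the time-to-live is acceptable.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class TTLCache:
    """Tiny time-based cache: entries expire on their own, no events involved."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        # The clock is injectable so expiry can be tested without sleeping.
        self._clock = clock
        # Maps key -> (expiry timestamp, cached value).
        self._entries: Dict[Hashable, Tuple[float, Any]] = {}

    def get(self, key: Hashable, loader: Callable[[Hashable], Any]) -> Any:
        """Return the cached value, calling loader(key) if missing or expired."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]
        value = loader(key)
        self._entries[key] = (now + self._ttl, value)
        return value
```

There is very little here that can drift out of sync. The price is that a value can be out of date for up to `ttl_seconds`, and whether that is acceptable is exactly the kind of question to settle before anyone designs the van.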
Keeping complexity down is a job for all stakeholders of a project, but it’s mainly a task for software developers. They are the experts; they know what makes a design convoluted. It’s up to them to explain why a feature will increase operational costs and to provide alternatives. It might feel weird to have developers think about total cost of ownership (TCO), but ask yourself this: Who else can do it? Who else has the technical insight to know how difficult maintaining a system will be?
It’s also a job for management to invite this feedback and take it to heart. Most project managers are hired to look only at building cost. There is no incentive for them to say No to a feature that’s quick to build but expensive to run. Yet it’s vital that they do push back.
Every project kick-off presentation has a big slide about Total Cost of Ownership. The numbers are pulled out of thin air, but at least the font is huge. TCO is a vital part of any software project, yet it’s treated as an afterthought.
We can do better.