Managing technical debt at an agile company

Well, this is a painful topic that haunts most companies in one way or another. Basically, technical debt is what happens when you try to balance what the business needs now, i.e. "We need this to start selling stuff and pay your salaries", against what developers will need in the future to properly maintain the software, i.e. "proper tests, proper CI/CD".

It's a hard thing to balance, and there's no silver bullet for avoiding it. In my opinion, the most important thing is that it needs to be a conscious choice. Technical debt should never arise out of recklessness. It should either:

Be a decision, with the full knowledge of the consequences
Arise after your knowledge of the domain gets better - you understand, in time, that you've written software in a way that needs to be refactored.

As an agile agency, we build different kinds of software, and we always keep in mind that software is a constantly evolving creature. Product owners always want new features to be added, infrastructure needs to scale, requirements change, people leave, and knowledge gets lost and needs refreshing.

This is normal. Each client has their own vision of the product, and the product needs to adapt to ever-changing market needs. One day you've built a web app that has email and password login with a single personalized page, and the next year the users want a mobile app with Apple Login and a tightly integrated loyalty system. Times change, users change, and the software needs to change too.

But what is often forgotten is that as time goes on, features get added, and bugs get fixed, the software becomes more complex. At some point, you will step back, take a look at the software, and realize that it is becoming harder and harder to maintain.


And as we are talking about real-world problems, not theoretical ones: sometimes the reason the software gets complex is not a good one. For example, a feature gets added because "the CEO asked for this personally and this needs to be added in the next week". It's bad, but it happens. Over time this will result in technical debt that will eventually need to be resolved. And of course, this will cost time and money.

Tips & Tricks for managing the debt

There are quite a few things you can do to deal with technical debt as a technical manager or as a team lead.

Take a step back and a hard look at your codebase

First of all, stop growing the debt and start looking for potential problems. Consciously step back and try to see which modules of the project are the most complex and which are actively developed. Visualization tools help here quite a lot. I would suggest CodeScene, an all-in-one solution for analyzing code health. Then you can select the right functions for refactoring. Do it in small steps; there's no need for a full-blown refactoring of the whole application. Refactoring one method a week will bring you results within a year.

Indicators of technical debt

If you feel overwhelmed and don't know where to start, follow these steps:

Check for warnings or outdated packages during the build - this is a hygiene factor that might put you on the path to finding some good candidates for refactoring.
Check for code with no documentation or high cyclomatic complexity. Those are good candidates to take a look at and improve.
Find out which module has too much responsibility and a high "Change Coupling" level, which means that if you change one part of the project, another module also needs to be updated. If there are such modules, it's a good idea to have a look at them.
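
To make the change coupling idea a bit more concrete, here is a minimal sketch that counts how often two files change in the same commit. It assumes you have a plain-text commit history where every commit starts with a marker line followed by the changed file paths (for example, the output of git log --name-only --pretty=format:@@@). The marker and the function names are assumptions for this example, and dedicated tools do a far more thorough analysis.

```python
from collections import Counter
from itertools import combinations


def parse_git_log(text, marker="@@@"):
    """Split log text into one list of file paths per commit.

    Assumes each commit starts with a line equal to `marker`,
    followed by the changed file paths, one per line.
    """
    commits = []
    for line in text.splitlines():
        line = line.strip()
        if line == marker:
            commits.append([])
        elif line and commits:
            commits[-1].append(line)
    return commits


def change_coupling(commits, min_shared=2):
    """Count how often each pair of files changes in the same commit.

    Returns (file_a, file_b, shared_commits) tuples, strongest coupling first.
    Pairs seen together fewer than `min_shared` times are dropped as noise.
    """
    pair_counts = Counter()
    for files in commits:
        for pair in combinations(sorted(set(files)), 2):
            pair_counts[pair] += 1
    coupled = [(a, b, n) for (a, b), n in pair_counts.items() if n >= min_shared]
    return sorted(coupled, key=lambda row: (-row[2], row[0], row[1]))
```

File pairs at the top of the result that live in different modules are the ones worth a closer look.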

A good story regarding outdated packages: we had a production service running smoothly for many months, and we needed to migrate it to a different data center. When we tried building it from scratch, it failed with weird errors. It turned out that some requirements used a greater-than-or-equal (>=) version comparison, which led to those packages being updated during the build. We spent about 2-3 days trying to figure out which packages were updated and how this affected our codebase.
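
A small check like the following could have flagged the risk earlier. It is a minimal sketch that assumes a pip-style requirements file and treats only exact "==" versions as pinned; the function name is made up for this example, and it deliberately ignores edge cases such as URL requirements.

```python
import re

# A requirement counts as pinned only if it uses an exact "==" version
# without wildcards. Everything else may resolve differently on a rebuild.
_PINNED = re.compile(r"^[A-Za-z0-9._\-\[\],]+\s*==\s*[^=\s*]+$")


def unpinned_requirements(text):
    """Return the requirement specs that are not pinned to one exact version."""
    loose = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments
        if not line or line.startswith("-"):  # skip blank lines and pip options
            continue
        spec = line.split(";", 1)[0].strip()  # ignore environment markers
        if not _PINNED.match(spec):
            loose.append(spec)
    return loose
```

Running this during the build and failing on a non-empty result keeps a fresh build from silently picking up newer package versions.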

Figure: CodeScene clusters. Each circle represents a module: the larger the circle, the more complex the module, and the deeper the color, the more developers work on it. Source: CodeScene.

Negotiation with the business

It's hard to explain why it matters not to cut corners, but nevertheless, we always need to make the business aware of the costs. If you can quantify it, you can argue that this short-term gain of $$ will result in $$$ of costs in the long run. For example, if your manager comes in and says "The client wants the app to support Social Login from Facebook by tomorrow", you understand that the request is unreasonable and that if you speed it up so much, you will probably not be able to extend the API in the future.

A good argument, in this case, would be "If we do it this fast - we skip tests, we skip architecting, we skip documentation, we skip QA - it will cost the client four times as many hours when they want to add other social logins, and there will be problems that will result in extra costs after release". It's then the responsibility of the manager to make sure the client understands the real costs of this implementation and sees the reason to do this the proper way.

Invest in the quality of development processes

Make sure your development process is top-notch. Even if you estimate everything properly and formulate the requirements in granular detail, you can still end up with architectural issues and untestable code if you don't have proper processes on the development side.

One good example is the broken windows theory. If someone starts slacking off, another developer starts doing the same, and in the end everyone is following bad practices. This should be avoided.

You should try to get everyone on the same page. Make sure you have systems in place that keep people accountable. It means knowing what people are working on and how they are working on it.

Old Code

I'm sure every one of us who has ever joined a new company has looked at some part of the codebase and thought "why is it written like this", "I'm sure I could do better", "This code is legacy code, so it's technical debt and needs to be rewritten". But that's a misconception that took me a few years to understand. Old code is good code. Why? Because only good code grows old - bad code is always being rewritten, improved, worked upon. But old code does exactly what's needed and it's been doing that for years.

Of course, you might think that your unfamiliarity with the code makes the code harder to understand. That's true, but it doesn't make the code bad. Something that your predecessor wrote a few years ago before leaving is not technical debt until requirements arise that make you work on that code. Let it run, and focus on other low-hanging fruit before you touch that old manuscript of bizarre functions.

Engineering mistakes

This one is tricky to explain: an engineering mistake IS part of technical debt, but technical debt IS NOT part of engineering mistakes. You should treat those separately. If a developer deliberately adds technical debt to the codebase, then he is a bad developer. If a developer makes a bad decision, like using an experimental library or writing code in some spaghetti paradigm, it needs to be discussed separately rather than as part of technical debt discussions.

Strict Definition of Done

This is one of my favorites, and I think it can help a lot in avoiding technical debt. So when is a feature done? A feature is done only when it behaves as the PO expects AND when a "release checklist" is finished. The "Release Checklist" can be a set of bullet points like "Is every function well documented?", "Does every function have at least X unit tests?", "Was the code reviewed by at least X developers?". So not only does the feature need to be done from the business perspective, it also needs to be done from the development perspective.
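
If you want to make such a checklist harder to skip, you can turn it into a small gate. The following is just a sketch: the FeatureStatus fields and the default thresholds are hypothetical, and the values for "X" are something your team has to agree on.

```python
from dataclasses import dataclass


@dataclass
class FeatureStatus:
    """Facts about a feature, gathered by hand or from your tooling."""
    accepted_by_po: bool             # behaves as the PO expects
    undocumented_functions: int      # functions still missing documentation
    fewest_tests_per_function: int   # test count of the least-tested function
    review_approvals: int            # developers who approved the code review


def unmet_criteria(status, min_tests=1, min_approvals=1):
    """Return the checklist items that still block "done" (empty list = done)."""
    unmet = []
    if not status.accepted_by_po:
        unmet.append("PO has not accepted the behavior")
    if status.undocumented_functions > 0:
        unmet.append("some functions are undocumented")
    if status.fewest_tests_per_function < min_tests:
        unmet.append(f"every function needs at least {min_tests} unit test(s)")
    if status.review_approvals < min_approvals:
        unmet.append(f"needs review by at least {min_approvals} developer(s)")
    return unmet
```

A feature counts as done only when the returned list is empty, which covers both the business side and the development side in one place.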

Low-hanging fruit like comments, tests, and documentation should not be postponed. Handle it immediately.

When a new developer joins the company, it's understandable that they bring habits from their previous job, along with a code style that might not fit into your codebase. Unless you want a zoo of different people writing code differently, you should have a style guide that brings the code to a common denominator and makes it easier to understand for everyone.

Finding time to work on the problems

It's clear that during sprint planning a PO's job is to squeeze as many features as possible into the sprint so that team velocity is high. What the PO needs to understand is that velocity will get slower and slower unless the technical debt is worked on constantly. So making sure that the old stuff gets improved is also in the interest of the PO. You can agree that no release will be made with ONLY features; some other improvements to the codebase need to ship along with it.

Allocate resources for solving problems like refactoring, writing tests, and documenting. I recommend creating a separate tag in your project management tool (I strongly recommend Linear.app) in order to keep these issues visible and top of mind for everyone. If the focus of the sprint lies on velocity alone, technical debt will keep increasing and slow down the whole project.

Track metrics over time

I wouldn't make it public that you're tracking some kind of metrics regarding the code, because the developers have the tendency to optimize for the constraints that they're given.

Something I take a look at:

How many new bugs are related to the New Features vs Old Features
Code Complexity / Code Coupling
Code Ownership - Who has the most ownership of the code
Transfer of knowledge / Bus Factor - Is the knowledge distributed properly across the team?
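
The last two metrics can be roughly estimated from your commit history. The sketch below assumes you already have a list of (author, path) pairs, one per commit that touched a file. The function names and the 50% threshold are assumptions for this example, and counting commits is only a crude stand-in for real knowledge of the code.

```python
from collections import Counter, defaultdict


def ownership(changes):
    """Map each path to (top_author, share_of_changes).

    `changes` is a list of (author, path) pairs,
    one per commit that touched the file.
    """
    per_file = defaultdict(Counter)
    for author, path in changes:
        per_file[path][author] += 1
    result = {}
    for path, authors in per_file.items():
        top_author, top_count = authors.most_common(1)[0]
        result[path] = (top_author, top_count / sum(authors.values()))
    return result


def bus_factor(changes, threshold=0.5):
    """Smallest number of authors who made more than `threshold` of all changes.

    This is a simple heuristic: a low number means that losing a few
    people would take most of the project knowledge with them.
    """
    counts = Counter(author for author, _ in changes)
    total = sum(counts.values())
    covered = 0
    for rank, (_, n) in enumerate(counts.most_common(), start=1):
        covered += n
        if covered > threshold * total:
            return rank
    return 0
```

Files where one person holds nearly all of the changes, and a bus factor of 1 for the whole project, are good prompts for pairing or a handover session.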

So is technical debt inherently bad?

No, it isn't. You need to take a look at the product as a whole to understand that. Let me visualize it for you with a fictional story that a friend of mine told me.

In the past, I worked for a very cool person, a developer. We had zero technical debt. We wrote the code beautifully, and we spent hours perfecting each line so that the architecture would be future-proof. We implemented Machine Learning instead of simple if statements, and we added Message Queues instead of doing synchronous requests, because we needed to be prepared for the billion clients that we were going to have.

We were working on this product for 3 years - new competition arose while we were perfecting every line. And then one day, we ran out of money - even before our perfect product hit the market. And even if we did release, who knows if the competition didn't outperform us already and if the client even wanted to use our product. We never showed it to anyone - why would we, it's not perfect yet.


And then there's the other side, where I worked for a business-oriented person instead of a code-oriented one. The code was bad, and some stuff was being done manually. After 3 months, he pushed us to release an alpha version to the clients - full of bugs but mostly working. People started using the software and giving feedback. We focused on the most important parts and fixed the crucial problems first. We became fully sustainable after 3 more months and expanded the development team, which is now focusing on improving all the other parts of the code, essentially fixing the technical debt.

Now if you take a look at the product, it's definitely worth starting with bad code, taking shortcuts, and releasing early, as long as you understand that you're taking on the responsibility to fix those problems in the future. In the real world, it's mostly irrelevant how bad the code is until it really becomes a problem.

