On Architecture and Technical Debt

My intention here is to discuss the difference between architecture and technical debt, and the inextricable link between them. Managing technical debt can be an extraordinarily difficult problem even for the best organizations, but I hope to lay out some methods for doing so through documentation, and through a commitment to reviewing and updating that documentation.

In a modern technology organization, engineers are tasked with building and managing many things. Those can be code that you ship; or the code that ships code; or the infrastructure on which you run the code. This can include third-party services–really, anything you use to help build, test, deploy, or monitor a product. Let’s call this collection of things architecture.

We want the software we build to last a long time, and carry with it a minimum of fuss. If we can accomplish this task, we will free ourselves to spend more time writing new code, and we can do so with confidence. But what happens when systems break down and pain points emerge?

We have a term to describe that situation–technical debt–and it’s influenced by several things: by the age of the code, and the things you’ve learned since it was written; by its design, and the assumptions made at the time of that design. Technical debt is a function of architecture, given time:

D = A(t)

That function, A, hides a lot of complexity. For any given unit of architecture, the largest factor in its complexity is its ability to scale to current needs. That ability to scale is influenced by that unit’s design, but is also influenced by the environment surrounding it: best practices can change; market conditions can change in unexpected ways; and so forth.

Consequently, the amount of observable debt can vary wildly from one area to another: in one situation, small changes may be acceptable; in another, only a rewrite will suffice. This variability is the principal reason that technical debt remains a difficult concept to grasp.

We know that debt is bad, but we don’t know how bad; we know it’s there, but we don’t know where. Leadership may view technical debt as a kind of bogeyman, lurking in the shadows; engineers often have much better knowledge of debt, because they work with the code behind it every day–but just as often, they lack the agency to do much about it.

If the architecture that engineers write is important, so is the knowledge of this architecture. What documentation of your architecture do you have or use? Would a new engineer know where to find it? How much of your documentation is communal knowledge shared directly from one engineer to another?

Written documentation can be read; it can be reasoned about. Everyone on the team can have equal access to it. You can organize it holistically, and trace dependencies to it and from it. You can use all of this knowledge to find problems within systems and, best of all, fix them.

There’s just one problem: Documentation is out of date the moment it’s written. Documentation is static, and the architecture it describes is changing daily as engineers commit code. Certainly, some documentation can be generated based on your source code–for example, where things are, in what files, and so forth–but only so much information can be conveyed by such generation, and often that information lacks context.
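As a rough illustration of what generated documentation can and cannot tell you, here is a minimal sketch using Python’s standard `ast` module. The function name `index_definitions`, the sample source, and the file name `deploy.py` are all hypothetical, chosen only for this example. It lists where top-level classes and functions live, which is exactly the "where things are" kind of information; note that nothing in its output explains why the code exists or what assumptions it was built on.

```python
import ast


def index_definitions(source: str, filename: str) -> list[tuple[str, str, int]]:
    """Return (kind, name, line) for each top-level function or class."""
    tree = ast.parse(source, filename=filename)
    entries = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            entries.append(("function", node.name, node.lineno))
        elif isinstance(node, ast.ClassDef):
            entries.append(("class", node.name, node.lineno))
    return entries


# Hypothetical source file contents, inlined so the sketch is self-contained.
SAMPLE = '''\
class Deployer:
    pass

def ship(build):
    return build
'''

for kind, name, line in index_definitions(SAMPLE, "deploy.py"):
    print(f"deploy.py:{line} {kind} {name}")
```

The index tells a reader that `Deployer` and `ship` exist and where to find them, but the context around them still has to be written by a person.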

To make documentation matter, you need to wrap it in some process to keep it updated, and to keep its knowledge spread amongst the team. In this way, your documentation becomes a living part of your organization, and a reliable source of knowledge for both new and existing team members. The degree to which it falls out of date is limited only by the cadence you choose for its review.
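One way such a process might be supported, sketched here under assumptions of my own: each document records the date it was last reviewed, and the team picks a cadence in days. The `DocRecord` type, the `overdue_for_review` function, the file paths, and the 90-day cadence are all hypothetical, not a prescribed tool.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class DocRecord:
    path: str            # where the document lives (hypothetical paths below)
    last_reviewed: date  # date the team last reviewed it


def overdue_for_review(docs: list[DocRecord], cadence_days: int, today: date) -> list[str]:
    """Return paths of documents last reviewed more than cadence_days ago."""
    cutoff = today - timedelta(days=cadence_days)
    return [doc.path for doc in docs if doc.last_reviewed < cutoff]


docs = [
    DocRecord("docs/deploy-pipeline.md", date(2024, 1, 10)),
    DocRecord("docs/billing-service.md", date(2024, 5, 2)),
]

# With a 90-day cadence, only the document reviewed in January is flagged.
stale = overdue_for_review(docs, cadence_days=90, today=date(2024, 6, 1))
print(stale)  # ['docs/deploy-pipeline.md']
```

A shorter cadence flags more documents more often; the right value is a trade-off between the team’s time and how far the documentation is allowed to drift.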

A review process also gives the team chances not only to discuss problem areas, but also to discover them. Team members can call out pain points, voice concerns, and produce action items. This offers two things you previously lacked: agency on the team to resolve debt, and concrete evidence of that debt for leadership, who then have visibility into technical debt and the ability to direct fixes for it.

Documentation, and a process to regularly update it, are not silver bullets. If your formerly-popular frontend framework is archived and left in read-only mode tomorrow, you may find yourself in need of a rewrite despite your best efforts. Poor design choices made in the past cannot be wished away. Poor documentation, and a lack of commitment to its quality, can prevent its use in continually improving systems. But a lack of documentation can only ensure that a cure for ailing architecture remains out of reach.

Small steps toward documentation, however–with a process to ensure that documentation will not fall out of date–can provide short-term relief to the team and confidence to everyone that technical debt is observable and fixable.