Linking parts of the codebase such that changing one forces reviewing the other?
Suppose we have a large to-do task manager app with many features. Say we have an entity, which is the task, and it has certain fields like: title, description, deadline, sub-tasks, dependencies, etc. This entity is used in many parts of our codebase.
Suppose we decided to modify this entity, either by modifying, removing, or adding a field. We may have to change most if not all of the code that deals with this entity. How can we do this in a way that protects us from errors and makes maintenance easy?
Bear in mind, this is just an example. The entity may be something more low-key, such as a logged user event in analytics, or a backend API endpoint being used in the frontend, etc.
Potential Solutions
Searching
One way people do this already is by just searching the entity across the codebase. This is not scalable, and not always accurate. You may get a lot of false positives, and some parts of the code may use the entity without using it by name directly.
Importing
Defining the entity in one central place, and importing it everywhere it is used. This will raise an error if a deleted field remains in use, but it will not help us when, say, we add a new field and need to make sure it is handled properly everywhere the entity is used.
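A minimal sketch of that limitation in Python. The names (`Task`, `summarize`, the `priority` field) are hypothetical and only there for illustration. Removing `deadline` from `Task` would make `summarize` fail (an `AttributeError` at runtime, or an error from a static type checker), but the added `priority` field is silently ignored:

```python
from dataclasses import dataclass, replace
from datetime import date
from typing import Optional


# Hypothetical central definition; other modules would import Task from here.
@dataclass(frozen=True)
class Task:
    title: str
    description: str = ""
    deadline: Optional[date] = None
    priority: int = 0  # newly added field


def summarize(task: Task) -> str:
    # Written before `priority` existed. It still runs and still type-checks,
    # so nothing signals that the new field is being ignored here.
    due = task.deadline.isoformat() if task.deadline else "no deadline"
    return f"{task.title} ({due})"


task = Task(title="Write report", deadline=date(2024, 5, 1))
urgent = replace(task, priority=5)  # same task, different priority

# The two tasks differ, yet both produce the same summary.
print(summarize(task))
print(summarize(urgent))
```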
So what can be done to solve this? Bonus points if the approach is compatible with functional programming.
Automated Tests and CICD
Tests can discover these types of issues with high accuracy and precision. The downside is... well, tests have to be written. This requires developers to be proactive, and writing and maintaining tests is non-trivial and needs expensive developer time. It is also quite easy and common to write bad tests that give false positives.
Automated tests definitely work, but the downside is that they require the developer to be proactive, and the effort put into writing tests is non-trivial (and it's easy and common for developers to write bad tests that give false positives).
But no matter what you do, you're asking for something that will need to be manually done. Your tests should be done, and they should be reviewed. It will solve the problem you have and many more.
Having unit and automated integration tests backed by both requirements and high code coverage. As a lead I can verify through these unit tests not only that you made the change to support the requirements, but also, very quickly, that other functionality has not changed as a result of your large-scale change. It helps a lot for significant refactoring too.
I addressed this in some of my other replies, but I do not believe unit tests are a good solution here. It's way too common for developers to write tests that give false positives, and it's very common for organizations to have low or insufficient coverage due to the higher cost associated with testing.
An adequate test coverage should help you with these kinds of errors. Your tests should at least somehow fail if you make something incompatible. Also using the tools of your IDE will help you with refactoring.
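One way such a failing test could look, as a rough sketch. This is an assumption about what the reply has in mind, not something it spells out, and the names (`REVIEWED_FIELDS`, `unreviewed_changes`) are made up for the example. The test pins the set of fields the calling code was last reviewed against, so adding or removing a field breaks the build until someone updates the list:

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Task:
    title: str
    description: str
    deadline: str


# Hypothetical list of fields the rest of the code was last reviewed against.
REVIEWED_FIELDS = {"title", "description", "deadline"}


def unreviewed_changes(entity, reviewed):
    """Return the fields added to or removed from `entity` since the last review."""
    current = {f.name for f in fields(entity)}
    return {"added": current - reviewed, "removed": reviewed - current}


def test_task_fields_are_reviewed():
    changes = unreviewed_changes(Task, REVIEWED_FIELDS)
    assert changes == {"added": set(), "removed": set()}, changes


test_task_fields_are_reviewed()
```

This only forces a review when the field set changes; it says nothing about whether each consumer handles the new field correctly.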
Testing definitely works, but the downside is it requires the developer to be proactive, and the effort put in writing tests is non-trivial (and it's easy and common for developers to write bad tests that give false positives).
That's why test coverage exists and needs to be a mandated item.
I have absolutely no patience for developers unwilling to make good code. I don't give a shit if it takes a while, bad code means vulnerabilities means another fucking data breach. If you as a developer don't want to do what it takes to make good code, then quit and find a new fucking career.
There is a whole field, that looks a bit like religion to me, about how to test right.
I can tell you from experience that testing is a tool that can give confidence. There are a few new tools that can help. Mutation testing is one I know that can find bad tests.
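To show what mutation testing checks, here is a hand-rolled sketch. Real tools generate the mutants automatically; the function and test names below are hypothetical. A mutant that "survives" a test suite points to a test that is too weak:

```python
def is_overdue(days_left: int) -> bool:
    """Original implementation: overdue only after the deadline has passed."""
    return days_left < 0


def is_overdue_mutant(days_left: int) -> bool:
    # Hand-made mutant: `<` changed to `<=`.
    return days_left <= 0


def weak_test(fn) -> bool:
    # Never exercises the boundary, so it cannot tell `<` from `<=`.
    return fn(-5) is True and fn(5) is False


def strong_test(fn) -> bool:
    # Adds the boundary case days_left == 0.
    return weak_test(fn) and fn(0) is False


def mutant_survives(test, original, mutant) -> bool:
    """A mutant survives when the test passes on both the original and the mutant."""
    return test(original) and test(mutant)


print("weak test lets mutant survive:", mutant_survives(weak_test, is_overdue, is_overdue_mutant))
print("strong test lets mutant survive:", mutant_survives(strong_test, is_overdue, is_overdue_mutant))
```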
Integration tests can help find the most egregious errors that make your application crash.
Not every getter needs a test but using unit tests while developing a feature can even save time because you don't have to start the app and get to the point where the change happens and test by hand.
A review can find some errors, but human brains are not compilers. It is easy to miss errors, and the more you add to a review, the easier it is for something to get lost. Reviews mostly help make sure that the code is in line with the team's style and that more than one person knows about the changes.
You can't find all mistakes all the time. That's why it is very important to have a strategy to avert the worst and revert errors. If you develop a web app: backups, rolling deployments, revert procedures. Make sure everyone knows how to use them and has tried at least once. These procedures can fail. Refine them through failure.
That is my experience from working in the field for a while. No tests is bad. Too many tests is a hassle. There will always be errors. Be prepared.
A factory pattern helps. Making a dedicated class that handles the creation and distribution of Task entities centralises at least one point of failure.
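A small sketch of the idea. The reply talks about a dedicated class; a plain function is used here instead, since the question asked for something that fits functional programming. The name `create_task` and the validation rule are assumptions for illustration:

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass(frozen=True)
class Task:
    title: str
    description: str
    deadline: Optional[str]
    sub_tasks: Tuple["Task", ...]


def create_task(
    title: str,
    description: str = "",
    deadline: Optional[str] = None,
    sub_tasks: Iterable[Task] = (),
) -> Task:
    """Single place where Task values are built and validated."""
    cleaned = title.strip()
    if not cleaned:
        raise ValueError("title must not be empty")
    return Task(
        title=cleaned,
        description=description,
        deadline=deadline,
        sub_tasks=tuple(sub_tasks),
    )


child = create_task("Collect figures")
parent = create_task("  Write report ", deadline="2024-05-01", sub_tasks=[child])
print(parent)
```

If the rest of the code only builds tasks through `create_task`, adding a required field to `Task` breaks exactly one call site, which is where the default or validation for the new field has to be decided.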
It depends on the language, since you mentioned you don't want to do manual testing:
Start with a mono-repo, as in, one repo where you add every other repo as a git submodule.
Then, every time something changes, you run that repo through the build server and validate that it at least compiles.
If it compiles, you can go a step further and build something that detects changes, for example by parsing the syntax tree of everything that changed and then checking the syntax tree of the entire project for other methods / objects that might be affected. In .NET you'd do this with a Roslyn Analyzer.
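As a rough analogue of the syntax-tree idea, here is a sketch using Python's standard `ast` module rather than Roslyn. The helper name `find_field_uses` and the sample source are hypothetical. It lists the lines where an attribute with a given name is accessed:

```python
import ast
from typing import List


def find_field_uses(source: str, field_name: str) -> List[int]:
    """Return the line numbers where `<something>.<field_name>` appears in `source`."""
    tree = ast.parse(source)
    return sorted(
        node.lineno
        for node in ast.walk(tree)
        if isinstance(node, ast.Attribute) and node.attr == field_name
    )


# Hypothetical code base, inlined as a string to keep the example self-contained.
SAMPLE = '''\
def render(task):
    return task.title + " due " + str(task.deadline)

def is_late(task, today):
    return task.deadline < today

def count(tasks):
    return len(tasks)
'''

print(find_field_uses(SAMPLE, "deadline"))
```

This matches on the attribute name alone, so it also flags unrelated objects that happen to have a `deadline` attribute, and it does not see field names that only appear inside strings.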
Most languages have an IDE which will manage the import of that object and when you rename incorrectly, it'll flag it up. If you're calling an incorrect function or variable, it'll flag it etc. Many will have refactoring tools so when you rename something through this, it'll rename all instances of that.
This is related to what I discussed in the "Searching" section. Entity fields may not be necessarily imported, so they would not be caught in this. Say you're using that field's name in a SQL query, HTTP or GraphQL request / query. This may also not be caught by IDE.
This also would not cover the case where a field is modified without necessarily changing its name, or a new field is added and now the code using that entity is not using the field.
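To make the gap concrete, a sketch with hypothetical names. In the hand-written query the field names are plain text, so a rename in `Task` leaves the string untouched and neither the IDE nor the compiler complains. A query derived from the definition follows the entity instead (this assumes column names match field names and that the table name is trusted, not user input):

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Task:
    title: str
    description: str
    deadline: str


# Hand-written: renaming Task.deadline does not touch this string.
HAND_WRITTEN = "SELECT title, description, deadline FROM tasks"


def select_all(entity, table: str) -> str:
    """Build the column list from the entity definition instead of typing it out."""
    columns = ", ".join(f.name for f in fields(entity))
    return f"SELECT {columns} FROM {table}"


print(select_all(Task, "tasks"))
```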
Usually when you change your database structure, you would change the object that this is mapped into. If you were to change one without the other, that would be a monumental developer oversight. Adding a field without using it in many frameworks wouldn't necessarily break it, so it wouldn't be a bad change per se.
Any change you make to persistence should be reflected, at a bare minimum, in the object the data gets mapped into. This would likely be part of the same branch, and you probably shouldn't merge it until it's complete.
You're looking for tooling to protect you from human errors, and nothing is going to do that. It's like asking, how can I stop myself from choking when eating. You just know to chew. If this isn't obvious, it's a good lesson in development. Make one change at a time and make it right. Don't rush off to presentation changes or logic changes until your persistence changes are complete. When you get into habits like this, it becomes steady, methodical and structured. Rushing is the best way to make mistakes and spend more time fixing them. Less haste, more speed.
For example, if I add a new field, I'd write the SQL, run it, populate a value, get that value and test it. Then I'd move on to the object mapping. I'd load it into the code and get a bare-minimum debug output to see that it was loaded, and so on. Small tweak, test and confirm, small change, test and confirm. Every step is validated, so when something doesn't work, you know why; you don't guess.
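The steps above can be sketched with an in-memory SQLite database. The table, column and class names are made up for the example; each step is confirmed before moving on to the next:

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tasks (id INTEGER PRIMARY KEY, title TEXT NOT NULL)")
conn.execute("INSERT INTO tasks (title) VALUES ('Write report')")

# Step 1: make the persistence change, then confirm the column exists.
conn.execute("ALTER TABLE tasks ADD COLUMN deadline TEXT")
columns = [row[1] for row in conn.execute("PRAGMA table_info(tasks)")]
assert "deadline" in columns

# Step 2: populate a value, read it back, and confirm it.
conn.execute("UPDATE tasks SET deadline = '2024-05-01' WHERE id = 1")
row = conn.execute("SELECT id, title, deadline FROM tasks WHERE id = 1").fetchone()
assert row == (1, "Write report", "2024-05-01")


# Step 3: only now extend the object mapping and check that it loads.
@dataclass(frozen=True)
class Task:
    id: int
    title: str
    deadline: Optional[str]


task = Task(*row)
print(task)  # bare-minimum debug output
```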
If it's a microservice architecture, using something like OpenAPI and code generators could be a solution. Then the proper classes / types are created during the build step.
This does not prevent fields from going unused, or service B from using an older version until it is rebuilt.
The approach is similar to a library, but it works across different languages while the definition only changes in one place.
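A toy stand-in for that build step, to show the shape of the idea. A real OpenAPI generator does far more; `TASK_SCHEMA`, `PY_TYPES` and `generate_type` are hypothetical names, and the schema fragment is only shaped like an OpenAPI schema object:

```python
from dataclasses import field, fields, make_dataclass
from typing import Optional

# Hypothetical schema fragment; the single place where the definition changes.
TASK_SCHEMA = {
    "type": "object",
    "required": ["title"],
    "properties": {
        "title": {"type": "string"},
        "description": {"type": "string"},
        "estimate_hours": {"type": "integer"},
    },
}

PY_TYPES = {"string": str, "integer": int, "number": float, "boolean": bool}


def generate_type(name: str, schema: dict):
    """Build a frozen dataclass from a schema; optional properties default to None."""
    required = set(schema.get("required", []))
    props = schema["properties"]
    # Required properties go first: dataclasses need non-default fields before defaults.
    ordered = sorted(props, key=lambda prop: prop not in required)
    spec = []
    for prop in ordered:
        py_type = PY_TYPES[props[prop]["type"]]
        if prop in required:
            spec.append((prop, py_type))
        else:
            spec.append((prop, Optional[py_type], field(default=None)))
    return make_dataclass(name, spec, frozen=True)


Task = generate_type("Task", TASK_SCHEMA)
print([f.name for f in fields(Task)])
print(Task(title="Write report", estimate_hours=3))
```

Each service would run the same generation against the shared schema during its build, which is why a service that has not been rebuilt can still be on the old definition.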