The Six Stages of Code Grief
Oh, it's only the files that have over 2k lines of code? Hell, I'll take that over what I'm dealing with now. I've got multiple FUNCTIONS that are over 2k lines. >:(
Yeah, I don't see a big problem with files over 2000 lines in some cases, as long as things remain well written, organized, abstracted.
One piece of garbage that I'll never touch again has most functions this size. One was 50,000 lines! Hundreds of lines of if/else, half of the functions passed the same 60 arguments because he didn't understand classes or even dictionaries, etc etc. And it was used heavily.
That's what agentic AI is for! Your OS will figure out by itself what you are doing and weave together a shambolic rococo digital house of cards that will be not just undocumented but utterly incomprehensible.
It's fine, just get a 5GHz CPU with 48 cores, 1TB of DDR5 HBM super RAM, and maybe a few petabytes of storage (in the cloud in a flatpack Docker that runs on a VM), so that you can finally make that button blue.
Shut up and take my venture capital money! And maybe 2/3 of the whole market cap in stock options! /s
Fucken right, get your agentic AI to get in touch with my agentic AI (with wire transfer deets)
ARGH this triggered a bit of PTSD for me....
"We're going to convert these COBOL applications to C#, and you need to test that the new application works exactly the same, including the same bugs as the old application."
"Ok, where's the specifications and test reports of the old COBOL applications?"
"They were lost to time, we don't know where they are."
"Ok, so how are the developers going to write the C# code?"
"They're going to read the COBOL scripts and recreate them into C#, we advise you do the same."
Cue me spending a month trying to decipher the COBOL gobbledygook into inputs and outputs, and write testcases based on that. And after that month was up, and I had delivered my testcases, they told me that my services were no longer needed.
Gee, I wonder how all those specifications and test reports became "lost to time"....
There are no comments in the code
At my last job, I was assigned to a project being run by a straight-out-of-college developer who felt that not only were comments unnecessary, they were actually a "code smell", a sign of professional incompetence on the part of whoever added them. It's an insane philosophy that could only appeal to people who have never had to take over an old codebase.
I kind of get the idea that code should be self-documenting, but at the same time, there's so many crazy business rules that comments are basically a necessity if nothing else other than to explain why in the hell the crazed mess that provides the required functionality for the business rules exists.
Yeah some comments are not useful
python
# returns the value as a string
return str(user.id)
Some comments are
python
# returns the user id as a string because ZenDesk's API throws errors if it gets a number.
# See ticket RA-1037
# See ZenDesk docs: https://etc/
return str(user.id)
That's typically what people who advocate for less/no comments really mean. The code should self explain "what" it does, but if the "why" isn't obvious (i.e. confusing business logic) nobody argues that you shouldn't comment it. That's how I've worked in every company I've been at (and all developers around me) from 50 person start ups to >2k people. It's a really common mentality with Ruby developers.
Or, it appeals to people that have had to take over an old codebase where the comments were all lies.
“Code never lies. Comments sometimes do.”
It's funny, the exact same logic applies to method and variable names. There's no compiler that ensures that a method's name accurately describes what the method does or ensures that a variable's name accurately describes what the variable represents. Yet nobody ever says "you shouldn't use descriptive method and variable names because they might be misleading". And this is hardly academic: I can't count the number of times I've run into methods that no longer do what the method name implies they do.
And yet method and variable names are exactly what people mean when they talk about "self-documenting" code.
I don't know that I could have stopped myself from asking whose nephew they are and I'm just a hobbyist
I'll get shit on for suggesting it but this is a great use case for AI: comment the code and generate some basic docs. Even if it's wrong it'll give you a sense of where to start looking for flows.
Problem is, you won't know what the AI screwed up until someone breaks everything.
Something that I'm disproportionately proud of is that my contributions to open source software are a few minor documentation improvements. One of those times, the docs were wrong and it took me ages to figure out how to do the thing I was trying to do. After I solved it, I was annoyed at the documentation being wrong, and fixed it before submitting a pull request.
I've not yet made any code contributions to open source, but there have been a few people on Lemmy who helped me to realise I shouldn't diminish my contribution because good documentation is essential, but often neglected.
The fact that documentation and comments can't "fail" if the underlying code changes is a real problem. I've even worked at places which dictated that comments had to go directly above or even beside (inline) with the code they were explaining, so they would show up in any patches changing the code.
What do you think happened? Yup, people would change code and leave the outdated (and wrong) comment untouched, directly to the right of the code they just changed.
Hell, I was one of those people, so I get how it can happen.
Code was written before git was invented.
https://en.wikipedia.org/wiki/Git
Initial release 7 April 2005; 20 years ago
Oh oh
Bonus frame:
The 2000 line file is one function
That implements a ******* VM that all of the byte code runs in, and the rest of the source is just byte code listings that the linker magically gathers into a working program.
What’s a hunter2 VM?
Oh, so you worked with my ex-coworker.
It implemented a database. Giant branching if/for loop.
I didn't even know we were hiring ...
I've been doing this for years at my current job. It has become a masterpiece of refactoring and comments. They weren't even asking the right questions. I'm very proud of myself.
So naturally, I'm about to get fired and have the whole thing redone by AI.
Then re-hired for 3x salary to make it work again, I hope. Or just watch the company/project fail spectacularly
every programmer I've seen who says their code is self documenting writes dogshit code
I think we're all just dogshit but think we're better than the next person, it's like driving. I'm a "comment if there's no way to make it readable" kinda guy, I work with some "comment and don't bother to make it readable because there's comments" people. We all suck. I probably forget to comment on unreadable places sometimes, or overestimate readability; they either don't update comments so they're out of date, or the code is so gibberish that a comment doesn't help.
Ideally I guess you comment AND make it readable AND make sure the comments are up to date, but who do you think we are? Superman? And what's the right level of commenting anyway? Probably depends on who is reading them.
Those are rookie numbers. We got functions with 5000+ lines and 20 levels of indentation directly in the user-interaction event handlers :)
Well, that's how you do it!
And if two widgets need to create the same effect, you just copy the 5000 lines around. That's why copy-and-paste was invented.
(It really shouldn't be necessary... but in case somebody still needs it, here's the /s)
This is the right strategy. Storage space costs nothing these days. Why not just clone and go? That's what I always say.
Why, you can just 'inherit' some code by copying a block, pasting it, then making a few small changes. No thinking, no problem.
Ok, I'm off to make a copy of my code folder for the next release.
That time I started a new job and my first task was "fix bash"...and then I discovered a multi megabyte monstrosity called "bash.sh"
omfg that's over 1 MILLION characters 💀 💀 💀
vomit
I’m dealing with this currently. Thank you all for confirming that I’m not crazy
Then you fix 90% of a problem and get blamed when the rest of 10% doesn't work
"Code IS documentation"
The link is a proxied image link for some reason.
"Documenting the code base will be your first task for the next month to help show us how well you understand the codebase."
Translation: please help us understand our codebase. We're paralyzed by fear.
Cursor please document this codebase
#include "globals.h"
// please help
With the short variable you probably also get shadowing. That's super fun in a new code base.
Or another favourite of mine: The first time I had to edit a Perl script at work someone had used a scalar and a hash with the same name. Took me a while to realize that scalars, arrays, and hashes have separate namespaces, and the two things with seemingly the same name were unrelated.
I can live without documentation and comments, but then you've got to write really well-structured, self-documenting code. Which means long variable names (or better: local constants) that describe exactly what's in them, and function names that describe clearly what the function is for, and readable code that shows what it does.
But perhaps expecting that kind of discipline from people who lack the discipline to write documentation, was not entirely realistic.
Allow me to introduce a shit ton of jQuery into all the jsp files you got.
A few years ago I had to port a tool from HTBasic (a proprietary BASIC dialect) to Python. The original source only runs in their proprietary IDE. Of course, no comments whatsoever and a lot of GOTO magic and matrix calculations, some of which have no other purpose than to confuse the reader. The variables had only cryptic and meaningless three-letter names. My theory is that they intentionally wrote it in a way that it would be a nightmare to reverse engineer. And they succeeded.
Fork the repo.
Ask an LLM to rename all the variables and add comments and docstrings. Give it your style guide (assuming you have one).
Ask another LLM to check their work.
Done.
Disclaimer: I'm not a programmer, I'm a network engineer who dabbles in automation and scripting. But it seems to me that grunt work like this is what LLMs are really good for.
Also I only use short variable names inside of loops (for i in iterable...). Is that not how it should be done?
i and j are acceptable in small loops. But it depends a lot on the language used. If you're in C or bash maybe it's fine. But if you're in a higher level language like C# you usually have built-in functions for iterating over something.
For example you have a list of movies you want to get the rating from, instead of doing
for (i = 0; i < movies.length; i++)
var movie = movies[i]
....
It's often more readable to do
movies.forEach { movie ->
var rating = movie.rating
....
}
Also if you work with tables it can be very helpful to name your iteration variables as row and column.
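Since the thread mentions Python scripting, here's roughly the same point as a Python sketch. The movie data and the table are made up for illustration, not taken from anyone's codebase.

```python
# Made-up sample data, purely for illustration.
movies = [
    {"title": "Alien", "rating": 8.5},
    {"title": "Gigli", "rating": 2.6},
]

# Index-based loop: works, but `i` says nothing about what it points at.
ratings_by_index = []
for i in range(len(movies)):
    ratings_by_index.append(movies[i]["rating"])

# Element-based loop: the variable name says what each item is.
ratings = [movie["rating"] for movie in movies]

# For tabular data, row/column names read better than i/j.
table = [[1, 2, 3], [4, 5, 6]]
cells = []
for row_index, row in enumerate(table):
    for column_index, value in enumerate(row):
        cells.append((row_index, column_index, value))
```

Both loops produce the same ratings; the only difference is how much the reader has to keep in their head.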
It's all about making it readable, understandable, and correct. There's no point having comments if you forget to update them when you change the code. And you better make sure the AI comments on the 2000 lines of three letter variables are correct!
Yeah I script more than anything...python, bash, powershell, etc.
Only terrible code I inherit is the stuff I wrote >=3 months ago. I'll keep saying that three months from now, too.
In Go, the recommended convention is for variable name length to be proportional to scope. It is common to use variables one or a few letters long if they are local to a loop of a few lines or a short function.
Ngl that's like baby levels of nasty code. The real nasty shit is the stuff with pointless abstractions and call chains that make you question your sanity. Stuff that looks like its only purpose was to burn the clock and show off a niche language feature. Or worse than that even is when the project you inherit has decade old dependencies that have all been forked and patched by the old team
If all I had to worry about was organization and naming I'd be over the moon
And hard casting onto the wrong class because a neat function lives in there (who will detect you did that and treat you a little different because you don't have all the required data in that class instance) as a "quick fix"
Even if the abstractions aren't pointless, there's a limit to how many levels of abstraction you can make sense of.
I've seen some projects that are very well engineered, with nice code, good comments, well named variables and functions. But, the levels of abstraction and nesting get so deep that you forget why you were digging by the time you get somewhere relevant.
What's frustrating there is that you can't blame someone else. It's just a limit for how much your brain can contain.
Git commits with message saying “pushing changes” and there are over 50 files with unrelated code in it.
Former coworkers: “oh, these two lines are the same in function x and function y. TIME TO ABSTRACT”
Such DRY
My favorite was an abstract class that called 3 levels in to other classes that then called another implementation of said abstract class.
And people wonder why no one on our team ever got shit done.
Even worse than there being no comments: the code is extensively commented, but its function has drifted from what the comments describe to the point where they are actively misleading.
The good old "signal left when switching to right lane."
I mean sometimes you gotta trick the compiler to get a leg up in runtime.
I was part of a project that scoffed at the idea of documenting code. Comments were also few and far between. In retrospect, it really seemed like they wanted to give that elitist feel because everything reeked of wanting to keep things under wraps despite everything being done out in the freakin' open.
I literally told my boss that I was just going to rebuild the entire pipeline from the ground up when I took over the codebase. The legacy code is a massive pile of patchwork spaghetti that takes days just to track down where things are happening because someone, in their infinite wisdom, decided to just pass a dictionary around and add/remove shit from it so there is no actual way to find where or when anything is done.
Side-rant:
I rarely write Python code. One reason for that is the lack of type safety.
Whenever I'm automating something and try to use some 3rd party Python library, it feels like there's a good 50/50 chance that front and center in its API is some method that takes a dict of strings. What the fuck. I feel like there's perhaps also something of a cultural difference between users of scripting languages and those of backend languages.
What you described sounds so much worse though holy shit.
Yeah, the new pipeline is based HEAVILY on object inheritance and method/property calls so there is a paper trail for ALL of it. Also using Abstract Base Classes so future developers are forced to adhere to the architecture. It has to be in Python, but I am also trying to use the type hinting as much as humanly possible to force things into something resembling a typed codebase.
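For anyone wondering what that looks like in practice, here's a minimal sketch of the general idea, not the actual pipeline. `Record`, `Stage`, `DropNegatives`, `Double`, and `run_pipeline` are all hypothetical names invented for the example.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Record:
    """Typed stand-in for the anonymous dict that used to get passed around."""
    source: str
    rows: tuple[int, ...] = ()


class Stage(ABC):
    """Every pipeline step must implement run(); a subclass that
    forgets to cannot be instantiated."""

    @abstractmethod
    def run(self, record: Record) -> Record: ...


class DropNegatives(Stage):
    def run(self, record: Record) -> Record:
        return replace(record, rows=tuple(r for r in record.rows if r >= 0))


class Double(Stage):
    def run(self, record: Record) -> Record:
        return replace(record, rows=tuple(r * 2 for r in record.rows))


def run_pipeline(stages: list[Stage], record: Record) -> Record:
    # Each stage returns a new Record, so nothing is mutated behind your back.
    for stage in stages:
        record = stage.run(record)
    return record


result = run_pipeline([DropNegatives(), Double()], Record("demo", (1, -2, 3)))
```

Because `Record` is frozen, a stage can't quietly add or remove keys the way a shared dict allows, and a type checker can follow every field through the call chain.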
FUCK. Triggers me. Just got let go from a place that had this problem and wouldn’t let me make any changes whatsoever. I didn’t even push hard.
I did this once
I was generating a large fake dataset that had to make sense in certain ways. I created a neat thing in C# where you could index a hashmap by the type of model it stored, and it would give you the collection storing that data.
This made obtaining resources for generation trivial
However, it made figuring out the order i needed to generate things an effing nightmare
Of note, a lot of these resource "Pools" depended on other resource Pools, and often times, adding a new Pool dependency to a generator meant more time fiddling with the Pool standup code
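One way the ordering problem could be tamed, sketched in Python rather than C# and under the assumption that each pool can declare what it depends on: let a topological sort work out the generation order. The model classes and the `DEPENDS_ON` map are hypothetical.

```python
from graphlib import TopologicalSorter


# Hypothetical model types, stand-ins for whatever the dataset contains.
class User: ...
class Order: ...
class Invoice: ...


# Assumed dependency map: each model type lists the types it needs first.
DEPENDS_ON = {
    User: set(),
    Order: {User},
    Invoice: {Order, User},
}

# static_order() yields every type after all of its dependencies.
generation_order = list(TopologicalSorter(DEPENDS_ON).static_order())

# Pools indexed by the model type they hold, filled in dependency order.
pools: dict[type, list] = {}
for model in generation_order:
    pools[model] = [model() for _ in range(2)]
```

Adding a new pool dependency then means editing one entry in the map instead of reshuffling the standup code, and a cycle raises `graphlib.CycleError` instead of failing somewhere random.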
The language is COBOL.
I just inherited my first codebase a few months ago. It's like this everywhere and the original developer was fired, so what should sometimes be a simple fix turns into a full day of finding what needs to change. Any recommendations on fixing/maintaining code like this or should I just make it the next person's problem?
You're going to want to follow the "campsite rule" everywhere you go, and also sneak in positive refactors into your feature changes (if business is not willing to commit time to improving the maintainability of the codebase).
Read up on good software design principles. I don't know your experience level, but for instance, everyone agrees that appropriate abstraction and encapsulation make code easier and more enjoyable to work with, and will let you run tests on isolated sections of the code without having to do a full end-to-end testsuite run.
Having tests that you trust, especially if they execute quickly, will increase your "developer velocity" and let you code fearlessly, knowing that your changes are reasonably safe to deploy. (Bugs and escaped defects will happen, but you just fix them and continue on.)
Good luck!
Figure out what something does, and rename it (with a stupidly verbose name, if you have to). Use the IDE refactor tools to rename all instances of that identifier
For example, a function or property, or class might be invoked using Reflection, via a string literal (or even worse, a constructed string). And renaming it can cause a reflective invocation somewhere else random to fail
Or function or operator overloading/overriding doing something bizarre
Or two tightly coupled objects that mutate each other, and expect certain unstated invariants to be held (like, foo() can only be called once, or thingyA.len() must equal thingyB.len())
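The reflection trap is easy to reproduce in Python with `getattr`. This is a toy sketch; `Report`, `RenamedReport`, and `export` are invented names.

```python
class Report:
    def export_csv(self) -> str:
        return "a,b,c"


def export(report, fmt: str) -> str:
    # The method name is assembled from a string at runtime, so an IDE
    # rename of export_csv will never find or update this call site.
    return getattr(report, "export_" + fmt)()


class RenamedReport:
    # The same class after someone "safely" renamed the method.
    def to_csv(self) -> str:
        return "a,b,c"


failure = None
try:
    export(RenamedReport(), "csv")
except AttributeError as error:
    # Only surfaces at runtime, and only on the code path that hits it.
    failure = str(error)
```

Before renaming anything, it's worth searching the codebase for the identifier (and obvious prefixes of it) inside string literals, since the refactor tool won't.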
You can use these to more thoroughly compare behavior between the original and a refactor
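Comparing behavior between the original and a refactor can be as blunt as running both over a pile of inputs and listing the disagreements. A rough sketch, where `legacy_price` is a made-up stand-in for whatever function you inherited:

```python
def legacy_price(quantity, unit_cents, member):
    # Stand-in for the inherited function, quirks included.
    total = quantity * unit_cents
    if member:
        if quantity > 10:
            total = total - total // 10
        else:
            total = total - total // 20
    return total


def refactored_price(quantity, unit_cents, member):
    # Same rules with early return and a named divisor.
    total = quantity * unit_cents
    if not member:
        return total
    divisor = 10 if quantity > 10 else 20
    return total - total // divisor


def find_mismatches(old, new, cases):
    """Return every input tuple where the two implementations disagree."""
    return [case for case in cases if old(*case) != new(*case)]


cases = [
    (quantity, unit_cents, member)
    for quantity in range(0, 25)
    for unit_cents in (0, 1, 99, 250)
    for member in (False, True)
]
mismatches = find_mismatches(legacy_price, refactored_price, cases)
```

The old code is treated as the spec, bugs and all. An empty `mismatches` list only means the two agree on the inputs you tried, so pick cases around every branch you can spot.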
I use sentences as variable names sometimes, because I necessarily end up with lots of similar-sounding variables or functions.
List_of_foo_dicts = Get_foo_from_bar_api()
Separate out those "concerns", into their own object/interface, and pass them into the class / function at invocation (Dependency Injection)
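A tiny Python sketch of what passing a concern in at invocation can look like. `Clock`, `FixedClock`, and `SessionChecker` are hypothetical names chosen for the example.

```python
from typing import Protocol


class Clock(Protocol):
    """The concern being separated out: where the current time comes from."""
    def now(self) -> float: ...


class FixedClock:
    """Test double that always reports the same time."""

    def __init__(self, value: float) -> None:
        self.value = value

    def now(self) -> float:
        return self.value


class SessionChecker:
    # The clock is handed in rather than created inside the class,
    # so tests can swap in a fake without patching anything.
    def __init__(self, clock: Clock, ttl_seconds: float) -> None:
        self.clock = clock
        self.ttl_seconds = ttl_seconds

    def is_expired(self, started_at: float) -> bool:
        return self.clock.now() - started_at > self.ttl_seconds


checker = SessionChecker(FixedClock(1000.0), ttl_seconds=60)
```

In production you'd pass an object whose `now()` wraps `time.time()`; the checker itself never needs to know the difference.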
cs
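// Guard clauses: bail out early, keep the main path unindented.
public Value? Func(string arg) {
    if (string.IsNullOrEmpty(arg)) {
        return null;
    }
    if (this.Bar == null) {
        return null;
    }
    // ...
    return new Value();
}

// instead of nesting the main path inside every condition:
public Value? FuncNested(string arg) {
    if (!string.IsNullOrEmpty(arg)) {
        if (this.Bar != null) {
            // ...
            return new Value();
        }
    }
    return null;
}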
Add comments as you go
You got 3 letters?! Luck!
I worked at a Japanese company whose engineers were former NTT developers. Copypasta (i.e. not using functions), inefficient algos, single-letter var names, remote code execution from code as root, etc. good times!
Hey! This was my first real job. It was Matlab code written by physicists who had just recently learned programming.
My first thought immediately was of academia also.
Honest question: would an LLM be able to write useful comments in code like this?
It would probably struggle to see the larger picture. I can see it being used to add comments in self-contained functions though without too much difficulty.
Honest question: would an LLM be able to write useful comments in code like this?
It can be better than nothing, but not really. The LLM faces the same challenge that any competent coder does: neither were present to learn the human, business and organization context when the code was first written.
use the LLM to generate regression tests for the large file, then start refactoring it
Well, I’m the only maintainer for my project, so ha! (I only have myself to blame.)
That just means my boss will have to do all the work. Ha, what an idiot. Wait… aw. 🙁
The next row would be "boss fires you thinking Claude can maintain the codebase."
At least there's a kind of happy ending when we walk past the old boss and don't toss a dollar into his pan-handling hat.
Those are amateur problems, real problems start when you are unable to run it or you don't have source code. Bonus, it's written in an in-house language made by a developer who left the job or died - true story.
Oh God. Story time.
I had an important CICD pipeline that published a dinky little web-thing that was important for customer experience. The first line of the final docker file was from company-node:base. I had all the source code. I had all the docker files. At no point was there ever a container named company-node let alone a tag of base.
The one and only version of this container was on the CICD server.
This was me when I started working with my current full time job.
What a nightmare.
The team lead has spent the last two months writing a permissions library that nobody understands how to use or debug. He wrote it with Cthulhu at his side. Soon not even Cthulhu will understand it.
JavaScript developer in a strongly typed language decoding json into dictionaries with single letter keys.
Yeah, that was a fun job... at least the database tended to have some descriptive column names. They never lined up with the entity they mapped to, but it was better than nothing.
Jesus i worked at exactly this kind of project once. The only other dev was also very hostile and protective of this position. He did not want me there in the slightest. Took about 6 months before we cancelled the contract since this dude was just actively harassing me in Teams DMs on the daily and he just ignored all my concerns regarding maintainability since "he could understand the code" and i was probably just "not experienced enough".
Don't downplay what this does to your mental health. 5 years of workplaces like this and I'm now starting to see a therapist due to exhaustion disorder symptoms in my goddamn 20s. Take care out there!
So infuriating when you have some dickhead making themselves unfireable by intentionally convoluting the codebase and chasing out any other hire. And even worse when management bought into it and think the guy's an actual irreplaceable genius.
Probably even believes it himself. I hate narcissists.