Programming @programming.dev shape_warrior_t @programming.dev 6d ago

This Overly Long Variable Name Could Have Been a Comment | Jonathan's Blog

jonathan-frere.com

Here’s a belief I’ve held for a while but only recently been able to put into words: explanatory comments are often easier to understand than explanatory variable or function names. Consider a complicated expression with multiple sub-expressions. This expression is going to be difficult for the next...

Thoughts? It does feel like there are a lot of things you can do in comments that would be impossible or impractical to do in names alone, even outside of using comments as documentation. There's certainly much more information that you can comfortably fit into a comment compared to a name.

One of the comments in the Lobste.rs post that I got this from stuck out to me in particular:

Funny story: the other day I found an old zip among my backups that contained the source code of a game that I wrote 23 years ago. I was just learning to code at the time. For some reason that I forgot, I decided to comment almost every single line of that game. There are comments everywhere, even for the most obvious things. Later on, I learned that an excess of comments is actually not considered a good practice. I learned that comments might be a code smell indicating that the code is not very clear. Good code should be so clear that it doesn’t need comments. So I started to do my best to write clear code and I mostly stopped writing comments, doing so only for the very few parts that were cryptic or hacky or had a very weird reason for being there.

But then I found this old code full of comments. And I thought it was wonderful. It was so easy to read, so easy to understand. Then I contrasted this with my current hobby project, which I write on and off. I had abandoned it for quite some months and I was struggling to understand my own code. I’ve done my best to write clear code, but I wish I had written more comments.

And this is even worse at work, where I have to spend a ton of time reading code that others wrote. I’m sure the authors did their best to write clear code, but I often find myself scratching my head. I cherish the moment when I find some piece of code with comments explaining things. Why they did certain things, how their high level algorithm works, what does this variable do, why I’m not supposed to make that change that looks like it will simplify things but it will break a corner case.

So, I’m starting to think that this idea that comments are not such a good practice is actually quite bad. I don’t think I can remember ever reading some code and thinking “argh so many comments! so noisy” But, on the other hand, I do find myself often in the situation where I don’t understand things and I wish there were some more comments. Now I’m trying to write comments more liberally, and I think you should do the same.

I guess that’s a generalization of the OP’s idea.

28 comments

Clear concise code that reads like documentation is the ideal. Good function and variable names, formatting, and encapsulation play into this. Tests should document and describe the system.

If it still isn't clear what the code is doing, and I'm all out of ideas (or time) for refactoring, a well placed, accurate comment is fine. It needs to be kept up to date like any other artifact in the project.

It's harder to keep comments accurate than code, since code can be executed and tested. I use them sparingly; when I've otherwise failed to write clean code, or the code is just so complex that it needs to be described.

Comments are just another tool in the toolbox. If they add clarity to the situation, by all means, use them.

If you can think of an expressive variable name that lets you skip a comment eg "employeeCount", instead of "e" // number of employees, do that.
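For illustration, a minimal Python sketch of that trade-off. The sample list and both function names are made up here, not taken from the thread:

```python
employees = ["Ana", "Bo", "Cy"]  # hypothetical sample data


def report_with_comment(employees):
    e = len(employees)  # number of employees
    return f"{e} on payroll"


def report_with_name(employees):
    # The name now carries what the comment above had to explain.
    employee_count = len(employees)
    return f"{employee_count} on payroll"
```

Both functions behave identically; the only difference is whether the meaning lives in the name or in a comment that has to be kept in sync separately.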
The biggest problem with comments is that they can become outdated. If you change the code but forget to change the comment, you introduce a very dangerous situation where the comment becomes not only useless but also misleading.

If you rely on variable names, you've got a single source of truth, one thing to change at a time. Information updates itself.
- You say that like it can't also happen to symbol names.
  
  With symbol names it's trivial to notice and refactor. Comments, not so much.
- People always say this, and I have seen it happen occasionally. But in practice when it happens it's usually fairly obvious and not that confusing (especially with git blame).
  
  The frustration I've experienced from missing comments is several orders of magnitude more than the frustration I've experienced from outdated comments. I think mostly this is an excuse to be lazy and not write comments at all.
- The same thing can be true about variable names, and it is often more time consuming and error prone to change a variable name to match its new meaning than to simply update a comment. Especially if the variable name is public facing through an API or library then there is a huge cost to change the name everywhere that might reference it.
- Symbol names can be outdated as well, but what's worse is they can be flat-out wrong.
  
  Real-life example that I had at my last job:
  
      var isNotX = isX()
      // somewhere else in the code:
      var isX = isX()
      fun isX() {
          // Code returns isNotX
      }
  
  That part of the code had a bug and it wasn't clear whether the function should return X or not X (the function was much more complex but returned a boolean).
  
  A comment could have given context and/or be used as parity check for which implementation would have been correct.
  
  This way I had to step through the whole flow just to figure out what it's doing and what it's supposed to do.
  
  Everything comes down to proper function naming. If it wasn't clear what the function should return, then it was not named properly.
- Tbf, old comments can also give important context through earlier refactors and help avoid treading the same ground again.
  
  That being said, this is with the assumption that the next dev making a change will add their own comments describing it.
I find vague variable names exhausting. It adds not inconsiderable mental overhead when reading code, at least for me.
I try to write comments whenever what the code does isn't obvious on its own. A "never write comments" proponent might argue that you should never write code that isn't obvious on its own, but that doesn't always work in practice.

Sometimes you have to write cryptic code for performance reasons

Sometimes you have to deal with unintuitive edge cases

Sometimes you have to work around bugs in 3rd party code

Sometimes you are dealing with a problem that is inherently complex or unintuitive, no matter how you put it into code
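A small Python sketch of the first two cases in that list (terse code chosen for speed, plus an unintuitive edge case). The function is a hypothetical example, not something from the thread:

```python
def is_power_of_two(n: int) -> bool:
    # Bit trick: a power of two has exactly one bit set, so clearing the
    # lowest set bit with n & (n - 1) leaves zero. It is terse, which is
    # why the reasoning is written down here.
    # Edge case: 0 (and negatives) would otherwise slip through or make
    # no sense, so they are rejected explicitly.
    return n > 0 and (n & (n - 1)) == 0
```

No renaming would make the expression explain itself; the comment is what tells the next reader why the `n > 0` guard must stay.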
- Sometimes you just need to document the business reason behind what you're doing, regardless of how clear the code might be 😆
  
  //look, I know this makes no fucking sense, but Debbie at MoronCo insisted it worked like this or she wouldn't pay us
- Sometimes you are using language features your team is unfamiliar with.
  
  Had this happen before with pattern matching.
Later on, I learned that an excess of comments is actually not considered a good practice.

Pointless or uninformative comments are not good, regardless of the quantity.

Useful and informative comments are always good, regardless of the quantity.

I learned that comments might be a code smell indicating that the code is not very clear.

When I'm looking at someone else's code, I want to see extensive, descriptive comments.

Good code should be so clear, that it doesn’t need comments.

That hits me like something a teacher tells you in a coding class that turns out to be nonsense when you get to the real world.

I'm not sure how others do it.

As I'm coding, the comments form part of my plan. I write the comments before the code. As I discover I've made incorrect assumptions or poor decisions, I correct the comments with the new plan, then correct the code to match the updated comments.

As a final step in coding, when I feel it is complete, I'll review comments to determine what should remain to help future me if I ever have to dig into it again.
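One way the comments-first workflow described above might look in Python. The `median` function and its plan are assumptions made for illustration, not the commenter's actual code:

```python
def median(values):
    # Plan, written before any code:
    # 1. Reject empty input, since a median is undefined there.
    # 2. Sort a copy so the caller's list is left untouched.
    # 3. Odd length: return the middle element.
    # 4. Even length: return the average of the two middle elements.
    if not values:
        raise ValueError("median of empty sequence")
    ordered = sorted(values)
    mid = len(ordered) // 2
    if len(ordered) % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2
```

In the final review pass, steps 3 and 4 might be trimmed as obvious, while steps 1 and 2 record decisions that are worth keeping.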

Variable names should be reasonably memorable and make contextual sense, but that's it. That's what they exist for. Don't overload the purpose of anything in the code.
- That hits me like something a teacher tells you in a coding class that turns out to be nonsense when you get to the real world.
  
  At the company I work for, we have had a "no comments" policy for at least ~10 years now, and we are not planning to change that. It's not just theory: we work like this in practice, and the purpose of each part of the code is perfectly understandable just from variable names, file names, namespaces, and function names.
- // Increment i
  i++;
  
  Very info. Much useful.
  
  Congratulations, you figured out how to do comments the wrong way.
  
  You also figured out how to use a bad, unclear variable name, so should we also stop naming variables with sensible words since it can be done wrong?
  
  External documentation can also be done badly, so let's stop doing that too.
  
  Or what's your point?
  
  Some things you leave out, but let’s say instead you are iterating, as in Python, over a list of some objects. Sure, you can use “i”, but if it’s a list of apples, why not name the loop variable “apple”?
  
  for apple in apples
  
  Simple example but the concept can go a long way
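Spelled out as a runnable Python sketch, with a made-up `apples` list standing in for the "list of some objects":

```python
# Hypothetical sample data for illustration.
apples = [
    {"variety": "Gala", "grams": 150},
    {"variety": "Fuji", "grams": 180},
]

# Index-based: the reader has to decode what apples[i] means each time.
total_by_index = 0
for i in range(len(apples)):
    total_by_index += apples[i]["grams"]

# Named loop variable: the element's role is visible where it is used.
total_by_name = 0
for apple in apples:
    total_by_name += apple["grams"]
```

Both loops compute the same total; the second one needs neither an index nor a comment to say what is being summed.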
I don't like code that isn't well documented. In fact, this has been my main source of frustration in the past and required the most time to deal with. Thousands of variables, hundreds of thousands of lines of code: how am I supposed to go through it somewhat fast if there aren't any comments or pieces of documentation guiding my understanding? I can't spend half a year just to get a grasp of how the code works.

Comments (as well as docstrings, READMEs, etc.) provide higher-level overviews that can guide you through the code rather quickly. Even if a comment is longer, in words or characters, than the lines of code it describes, it may accelerate understanding tremendously. It's just a lot more effort to trace each variable and see what it does and how it interacts with others. This can quickly become exponentially hard to track.

I don't think it's necessary to comment each line of code, except in rare cases or maybe when setting up a class and describing the members and roughly how they're used, but a few words here and there, at some higher or intermediate level, roughly describing what you want to do, can go a long way for others (and even yourself, when working on a project for several years). It's also already sufficient to just highlight the most important variables in a piece of code when explaining it. Given that info, this steers your focus when reading the actual code.

"Speaking" variable/function/... names are also very useful. I don't care if it's a long name, as long as it's sufficiently expressive. E.g. "space_info" instead of "si". This helps to understand the code more quickly and reduces backtracking lookups when you have already forgotten what a specific variable does because you haven't seen it for a while. My rule of thumb for variable naming: As concise, short and "essence grasping" as possible, but as long as necessary.
Ah yes, one of the major questions of software development: to comment, or not to comment? This is almost as big of a question as tabs vs spaces at this point.

Personally? I don't really care. Make the code readable to whoever needs to be able to read it. If you're working on a team, set the standard with your team. No answer is the universally correct one, nor is any answer going to be eternally the correct one.

Regardless of whether code comments should or shouldn't exist, I'm of the opinion that doc comments should exist for functions at the very minimum. Describe preconditions, postconditions, the expected parameters (and types if needed), etc. I hate seeing undocumented **kwargs in a function, and I'll almost always block a PR on my team if I see one where the valid arguments there are not blatantly obvious from context.
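A sketch of what such a doc comment could look like in Python. The function, its keyword arguments, and the Google-style section headings are all assumptions chosen for illustration, not a standard the commenter named:

```python
def send_report(recipient: str, **kwargs) -> str:
    """Build a one-line report summary for ``recipient``.

    Preconditions:
        ``recipient`` is a non-empty string.

    Keyword Args:
        subject (str): Subject line. Defaults to "Report".
        urgent (bool): Prefix the subject with "[URGENT]". Defaults to False.

    Returns:
        The formatted summary line.

    Raises:
        ValueError: If ``recipient`` is empty.
        TypeError: If an undocumented keyword argument is passed.
    """
    if not recipient:
        raise ValueError("recipient must be non-empty")
    subject = kwargs.pop("subject", "Report")
    urgent = kwargs.pop("urgent", False)
    if kwargs:
        # Reject anything the docstring does not list.
        raise TypeError(f"unexpected keyword arguments: {sorted(kwargs)}")
    prefix = "[URGENT] " if urgent else ""
    return f"{prefix}{subject} -> {recipient}"
```

Rejecting unknown keys is one possible design choice: it keeps the docstring and the accepted `**kwargs` from drifting apart silently.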
You can write comments, but you can't make your colleagues read them. They don't necessarily have to visit the originating file to read the docs.

Short variable names tend to lead different people to make different kinds of assumptions about the purpose and use of the variable. Those differences in understanding are where a lot of subtle bugs come from, or they cause people to hit a dead end.

Just be clear and explicit. It's not gaming; you don't have to care about losing a couple extra frames to type out a few extra characters. Most IDEs have sufficient autocompletes so it's literally not even a problem in many cases.
- You can write comments, but you can’t make your colleagues read them. They don’t necessarily have to visit the originating file to read the docs.
  
  When do you need documentation? When you are down in the code or when you are sitting on the toilet browsing Confluence? If your goal is to make people read the documentation, then the documentation needs to be immediately there where you need it, not in some external thing like Confluence.
  
  The same goes if your goal is to make people update the documentation. That's much more likely to happen if the documentation is in a comment in the code than when you first have to go hunting to find the correct page in that steaming pile of mess that is Confluence.
  
  Just be clear and explicit. It’s not gaming; you don’t have to care about losing a couple extra frames to type out a few extra characters. Most IDEs have sufficient autocompletes so it’s literally not even a problem in many cases.
  
  You still only have so much screen real estate, and having huge names means that your lines get very long at times, which makes everything really hard to read.
More people should try out literate programming. It is very powerful, especially for complex code and software you want to maintain in the long run.
- Literate programming as an ideal works at a very, very high level and a very, very low level. Plumbing code often doesn't benefit from comments at all, and is usually the most subject to refactoring. Code by amateurs/neophytes is often not gonna be written in such a way that a clear description of the intention or mechanics is achievable by the coder. Unobtainable standard, smh. I like comments with a 'why' at the top and a 'what' at the bottom (of the stack; I'm talking about abstraction layers). At the top: why am I doing this piece of logic, in code you can clearly understand? At the bottom: what the fuck am I doing with these weird shenanigans with a fucking red-black tree of all things in this low-level generic function?
My rule of thumb is: use short names if the context makes it clear, but do not make names too long and complicated (especially with Python :D). For me, having unique names is also important, so I don't get confused. Names that are too similar are bad, especially if they all start the same way, like "path_aaa" and "path_bbb"; then the eye can't distinguish them quickly and clearly. And searching (and maybe replacing) without an IDE is easier with unique and descriptive names.

Sometimes it's better to come up with a new name instead of adding a modifier and making the name longer. This could be in a for loop, where inner loops edit variables and create a variation of them. Instead of adding something like "_modified", try to find what the modification is: change "date" to "now" instead of "date_current".
No short variable names, also no non-lewd variable names, life is nothing without suffering.
Nothing wrong with a, i, s and x.
