Is it bad practice to not delete redundant files right away from VCS but instead mark them as "To be deleted" with comments first?

Question

I wanted to know if the way I deal with source files that need to be deleted from version control could be regarded as bad practice.

I want to explain it to you based on that example:

I recently got very angry because I had to tediously sort out Java classes in a programme that were basically dead code however it was nowhere documented and also not commented in those Java classes. Of course they needed to be deleted but before I delete such redundant stuff I have a - some may say strange - habit:

I do not delete such redundant files immediately via SVN->Delete (replace with delete command of your version control system of choice) but instead put comments in those files (I refer both at the head and at the footer) that they are going to be deleted + my name + the date and also - more importantly - WHY THEY ARE DELETED (in my case, because they were dead, confusing code). Then I save and commit them to version control. Next time when I have to commit/check in something in the project to version control, I press SVN->Delete and then they are eventually deleted in Version Control - still of course restorable through revisions though and this is why I adopted that habit.

Why doing this instead of deleting them right away?

My reason is, that I want to have explicit markers at least in the last revision in which those redundant files existed, why they deserved to be deleted. If I delete them right away, they are deleted but is nowhere documented why they were deleted. I want to avoid a typical scenario like this:

"Hmm... why were those files deleted? I did work fine before." (Presses 'revert' -> guy who reverted then is gone forever or not available in the next weeks and the next assignee has to find out tediously like me what those files are about)

But don't you note why those files were deleted in the commit messages?

Of course I do but a commit message is sometimes not read by colleagues. It is not a typical situation that when you try to understand the (in my case dead) code that you first check the Version control log with all the associated commit messages. Instead of crawling through the log, a colleague can see right away that this file is useless. It saves her/his time and she/he knows that this file was probably was restored for bad (or at least it raises a question.

You're essentially trying to replicate the job of your version control system directly in the file. Doing such a thing would definitely raise a flag in my head. If your colleagues don't read the commit messages *and* they resurrect a file that was rightfully deleted *and* it passes code reviewing, there's definitely something wrong in your team, and it's a great opportunity to teach them better. — Vincent Savard, Jun 14 '17 at 16:07
@GregBurghardt This seems like a good question about a bad idea. Perhaps people are downvoting based on the second part of that (when, IMO, they should be upvoting for the first)? — Ben Aaronson, Jun 14 '17 at 17:16
@BenAaronson: Which is funny because the "Hot META Posts" in the right column of this site (currently) links to [Downvoting because we don't agree with asker's approach or logic](https://softwareengineering.meta.stackexchange.com/questions/8547/downvoting-because-we-dont-agree-with-askers-approach-or-logic?cb=1), which appears to be exactly what's going on here. — Greg Burghardt, Jun 14 '17 at 17:41
"Next time when I have to commit/check in something in the project to version control, I press SVN->Delete" Are you saying you delete them in a (potentially) entirely unrelated commit? — Kevin, Jun 14 '17 at 19:03
*Of course I do but a commit message is sometimes not read by colleagues.* - if they're not checking the commit message, it's safe to assume they aren't interested in why the file was removed... otherwise they should check the commit message. — Ant P, Jun 14 '17 at 20:55
"Next time when I have to commit/check in something in the project to version control," are you suggesting you do that actual delete in an unrelated commit of other things? The 2 stage delete is unusual but combining unrelated commits is indeed bad practice — Richard Tingle, Jun 15 '17 at 06:42
@Kevin Please specify what an unrelated commit is, it is not clear to me. — Bruder Lustig, Jun 15 '17 at 09:18
@BruderLustig A commit should be "atomic". In other words, it should be as small as possible and only about *one thing*. As soon as you can say I did *x* and *y* in this commit, it should have been two (or more) commits (in most cases). Since every commit is about a single change you made, you seem to suggest deleting these files in a commit of which are about a change not related to the deleting. — Jasper, Jun 15 '17 at 09:24
The closing argument in the OP is nonsensical. The process is -> Commit -> Delete file -> Commit. Any colleagues curious about the deleted file(s) are _far_ more likely to peruse the SVN log messages than they are to check out an outdated revision of the code so that they can resurrect a deleted file and trawl through it for comments explaining why it was deleted. The commit message is _exactly_ the correct solution here. — aroth, Jun 15 '17 at 11:16
@Jasper Thanks for the hints. I agree with you that commits should be small as possible and depending on the number (many or few files) of the files that are going to be deleted I just do a commit for the deletion of the files only if tgey were many (I should have been more precise in my post). However in this case there were just four classes and I saw no problem in adding them to another commit, of course adding " + deleted files that contained only dead code." to the rest of tge commit message. — Bruder Lustig, Jun 15 '17 at 11:34
@BruderLustig what I mean (and it looks like Richard Tingle is saying the same thing) is, do you check in the change with the "delete notice" and then immediately check in the change deleting the files? The way it's written it looks like you check in the text change, but don't check in the deletion in until you have other changes. — Kevin, Jun 15 '17 at 16:47
If I want to know why a file disappeared from my VCS checkout, I will look in the change history _first_, and if that doesn't explain things, I will read the discussion that happened when the deletion was reviewed. (If you don't have a code review process that every change goes through, you have bigger problems.) And if _that_ is still unenlightening I will talk to whoever deleted the file. I might look at the former contents of the file to try to discover what it used to do, but not for an explanation of the deletion. — zwol, Jun 15 '17 at 17:43
@aroth and to add to that. Changes are, people might actually use the (not so) "deleted" file, without actually having to open it. — Jeroen, Jun 16 '17 at 07:41
@VincentSavard "it's a great opportunity to teach them better" if that works, you have a great team I do no more dare to dream of. — Bernhard Hiller, Jun 16 '17 at 10:16
Each time I see a "TO DELETE LATER", whatever it is and wherever it appears, I think: "If the guy who make this note did not dare to delete this, why should I -knowing usually less about the issue than that guy- take that risk?" — SJuan76, Jun 16 '17 at 10:35
Couldn't you commit twice at one time, once with comments in the to-be-deleted files, then again immediately, this time with the files deleted? — trysis, Jun 16 '17 at 15:38
This is similar to people commenting out code, instead of removing it and saying "what if this piece of code will be needed again". I just don't get it, why some people work with a VCS and somehow forget about its purpose. :| — BartoszKP, Jun 17 '17 at 10:30

score 110 · Accepted Answer · answered Jun 14 '17 at 16:54

110

The problem with adding a comment to a file that it should be deleted, instead of deleting it in source control and putting the explanation there, is the assumption that if developers do not read commit messages that they will surely read comments in source code.

From an outsider's perspective, this methodology seems to be rooted in a very conservative view of source control.

"What if I delete this unused file and then somebody needs it?" someone might ask.

You are using source control. Revert the change, or better yet talk to the person who deleted the file (communicate).

"What if I delete the dead file, then somebody starts using it again and they make changes?" someone else might ask.

Again, you are using source control. You'll get a merge conflict that a person must resolve. The answer here, as with the last question, is to communicate with your teammates.

If you really doubt a file should be removed, communicate before deleting it from source control. Maybe it only recently stopped being used, but an upcoming feature might require it. You don't know that, but one of the other developers might.

If it should be removed, remove it. Cut the fat out of the code base.

If you made an "oopsie" and you actually need the file, remember that you are using source control so you can recover the file.

Vincent Savard, in a comment on the question, said:

... If your colleagues don't read the commit messages and they resurrect a file that was rightfully deleted and it passes code reviewing, there's definitely something wrong in your team, and it's a great opportunity to teach them better.

This is sound advice. Code reviews should be catching this kind of thing. Developers need to be consulting commit messages when an unexpected change is made to a file, or a file is removed or renamed.

If the commit messages don't tell the story, then developers also need to be writing better commit messages.

Being afraid to delete code or delete files is indicative of a deeper, systemic problem with the process:

Lack of accurate code reviews
Lack of understanding about how source control works
Lack of team communication
Poor commit messages on the part of developers

These are the problems to address, so you don't feel like you are throwing rocks in a glass house when you delete code or files.

answered Jun 14 '17 at 16:54

Greg Burghardt

34,276
8
63
114

3

"assumption that if developers do not read commit messages that they will surely read comments in source code." I think this is a fairly good assumption. A developer changing a file needs to actually look at the file to accomplish the task, while there is nothing that forces one to look at the source control history. – AShelly Jun 14 '17 at 21:13
7

@AShelly: a developer task is rarely to change a comment. Tasks involve changing code, which is what the developer focuses on - not the comments. – Greg Burghardt Jun 14 '17 at 22:23
21

@AShelly Just because they have to open a file doesn't mean they'll read a particular comment at the top. The target audience for this change is the most oblivious kind of developer, the kind who thinks their needs are the only needs and everything should revolve around them. These are unlikely to read your polite comment. Worse, they're likely to read it, and then simply revert to the next oldest version. – Cort Ammon Jun 14 '17 at 22:48
5

@AShelly - as an example I commonly open files to a location. In VS I can open directly to a method using the dropdowns at the top or "go to definition" context menu. I would never see the top of the file. Also in Sublime Text I frequently open to a line Their open dialog users.rb:123 would open users.rb directly to line 123. Many code editors have very similar options. – coteyr Jun 15 '17 at 18:19
2

In addition to the above, the more complex it is to do cleanup tasks like this, the less likely people are to do them regularly. If you can get away with doing it simpler, it lowers the "activation energy" for you and everyone else. This is especially important if you are trying to encourage a change of practice for other team members. The more complex the procedure is, the harder it is to get buy-in. – xdhmoore Jun 16 '17 at 06:15
2

@AShelly Also, if (as the OP assumes) the "resurrecter" of the deleted file can't be trusted to read the commit-message explaining why the file was deleted, there's no guarantee they'll resurrect the version with the comments in -- they may resurrect an older version without the comments. – TripeHound Jun 16 '17 at 07:16
I am not disagreeing with the advice to simply delete the file. However, I still maintain that the assumption that commit messages are a better communication tool than the code itself is a bad assumption. Most devs I've worked with look at commit history only _after_ reading the source fails. – AShelly Jun 16 '17 at 16:54
2

In OP's case, a viable compromise would be to replace the whole body of the code with "DELETED Because: Reasons", check that in, then immediately follow with the delete commit. – AShelly Jun 16 '17 at 16:56
@AShelly wow, thanks for the good hint! If the whole body of the file is not there anymore in the rstored file, than no one is tempted to use it because it can't be used (at least not in the revision that was restored). – Bruder Lustig Jun 17 '17 at 12:12
You must know that a deleted file did exist once and where, in order to recover it. That is much easier to do if the file still exist in the file system. – Thorbjørn Ravn Andersen Jun 18 '17 at 12:42
Small addition: Not only are you creating additional overhead for yourself, i.e. you loose time with this procedure, you also risk creating overhead for others. You increase the risk that someone is attempting to modify the file while it is being deleted (you marked it as such, but it's still there) and then at some point you delete that file. Either before or after they did commit their changes. If before they've wasted time. If after, you either have to check again for calls and remove them or you may break your build by simply removing the file. – Frank Hopkins Aug 06 '17 at 12:09

JacquesB · Answer 2 · 2017-06-15T05:06:11.543

105

Yes it is bad practice.

You should put the explanation for the deletion in the commit message when you commit the deletion of the files.

Comments in source files should explain the code as it currently looks. Commit messages should explain why the changes in the commit were made, so the commit history on the file explains its history.

Writing comments explaining changes directly in the source itself will just make the code a lot harder to follow. Think about it: If you comment in the file every time you change or delete code, soon the files will be swamped with comments about the change history.

edited Jun 15 '17 at 05:06

answered Jun 14 '17 at 16:06

JacquesB

57,310
21
127
176

6

Perhaps add the answer to the question: Is it bad practice? Yes, because.... – Juan Carlos Coto Jun 14 '17 at 20:54
1

I sometimes do the same (as OP) because it's easier to uncomment than to revert. In rare cases even if code compiles and passes automatic testing, there is a reason for existence of that code I overlooked that manual tester would spot. In this rare scenario, instead of going through major pain or reverting things several commits later, I just uncomment the code back (and revisit how it can be improved) – Andrew Savinykh Jun 14 '17 at 22:53
2

@AndrewSavinykh That's different, you are commenting code you are still not sure you want to delete. – Stop harming Monica Jun 15 '17 at 06:39
> _"If you comment in the file every time you change or delete code, soon the files will be swamped with comments about the change history."_ - this is exactly what CVS `$Log$` did upon export. – ivan_pozdeev Jun 15 '17 at 17:28
6

@AndrewSavinykh “major pain or reverting things several commits later” – I wonder what version control you use; this shouldn't be a major pain. It certainly isn't in Git, at least not after you've done it a couple of times and learned what to look out for... _and_ provided your history isn't clobbered with needless back-and-forth commits that give you conflicts every other merge. And commits that merely _comment something out_ are rather prone to do this... – leftaroundabout Jun 15 '17 at 19:45
@leftaroundabout yep, the theory is that it should be easy. And if you are the only one working on pet project of yours, sure, you will have an easier time. So if all your "and","and","and" checkboxes check, that's fantastic. More often than not though, life gets in the way ;) – Andrew Savinykh Jun 15 '17 at 20:51
4

@AndrewSavinykh yes, and it's not like I've not experienced these problems – projects whose maintainers never really worked _with_ the version control and put lots of info in comments that should really have been commit messages, rather added new manual-undo-commits instead of properly reverting, etc.. The resulting state of the repository made for lots of conflict fun in every larger rebase... But, and that's my point, these problems are often largely _caused_ by workarounds like the one you defend here. – leftaroundabout Jun 15 '17 at 21:09

score 15 · Answer 3 · edited Jun 16 '17 at 11:55

15

In my opinion both of your options are not best practice, as opposed to bad, practice:

Adding comments only adds any value if someone happens to read those comments.
Simply deleting from the VCS, (with a reason in the change description), may impact a critical deliverable and many people don't read the change description, especially when under pressure.

My favoured practice – largely due to having been bitten several times over the years, keeping in mind that you don't always know where or by whom your code is being used – is to:

Add a comment stating that it is a candidate for deletion and a deprecation #warning or similar, (most languages have some sort of such mechanism, e.g. in Java, in the worst case a print statement or similar will suffice), ideally with some sort of timescale and with contact details. This will alert anybody that is still actually using the code. These warnings are normally inside each function or class if such things exist.
After some time upgrade the warning to a standard file scope #warning (some people ignore deprecation warnings and some tool chains don't display such warnings by default).
Later replace the file scope #warning with a #error or equivalent - this will break the code at build time but can, if absolutely necessary be removed but the person removing it cannot ignore it. (Some developers don't read or address any warnings or have so many that they cannot separate out the important ones).
Finally mark the file(s), (on or after the due date), as deleted in the VCS.
If deleting a whole package/library/etc at the warning stage I would add a README.txt or README.html or similar with the information on when & why it is planned to get rid of the package and leave just that file, with a change to say when it was deleted, as the only content of the package for some time after removing the rest of the content.

One advantage of this approach is that some versioning systems (CVS, PVCS, etc.) will not remove an existing file on checkout if it has been deleted in the VCS. It also gives conscientious developers, those who fix all of their warnings, lots of time to address the issue or appeal the deletion. It also forces the remaining developers to at least look at the deletion notice and complain a lot.

Note that #warning/#error is a C/C++ mechanism that causes a warning/error followed by the provided text when the compiler encounters that code, java/java-doc has @Deprecated annotation to issue a warning on use & @deprecated to provide some information, etc. there are recipes to do similar in python & I have seen assert used to provide similar information to #error in the real world.

edited Jun 16 '17 at 11:55

Søren D. Ptæus

399
1
7

answered Jun 14 '17 at 18:46

Steve Barnes

5,270
1
16
18

7

In particular, since this question mentions Java, use the [`@deprecated` JavaDoc comment and the `@Deprecated` annotation](https://stackoverflow.com/q/5039723/1157100). – 200_success Jun 14 '17 at 23:17
8

This seems more appropriate when you want to get rid of something still in use. The warning can stick around until all usages are gone, but once the code is dead it should be deleted immediatly. – Andy Jun 15 '17 at 01:09
6

@Andy One of the problems with library type code, _especially in Open Source or in large organisations,_ is that you may _think_ that the code is dead but find that it is in active use in one, or more, places that you personally never imagined that it was in use in. Sometimes the code needs to be killed off, because it is really __bad__ or includes security risks, but you just don't know if & where it is being used. – Steve Barnes Jun 15 '17 at 05:08
1

@200_success thanks - my background in C/C++ showing - added to the answer. – Steve Barnes Jun 15 '17 at 05:43
1

@SteveBarnes Maybe but that's not what the OP is asking about. He's verified that the code is dead. – Andy Jun 15 '17 at 19:07
1

I think this is also highly dependent on the type of code base. Libraries, public api's, code bases with large numbers of developers, etc. might benefit from a more methodical approach while at the other end of the spectrum, for small teams with no external-facing code, just a commit with a message is enough IMHO. – xdhmoore Jun 16 '17 at 06:20
1

@Andy - You would be amazed at the number of times that I have checked with all of the people who are thought to be using the code, and they all thought that it was dead code and couldn't think of anybody who might be using it then changed or deleted it only to find out that one or more of: it was being used but had been forgotten about, it wasn't actually used but builds couldn't complete without it, another team (possibly overseas) was using it or depended on it, something tiny but essential was being used. The warning, error, remove cycle doesn't always help with decades long maintenance. – Steve Barnes Jun 17 '17 at 12:07
@SteveBarnes The OP didn't say he talked to anyone; it sounds to me like he has access to the entire codebase, and he's verified its not being used. It doesn't sound like there are any other teams, let alone overseas developers. If I got the impression that any of that was true, your advice is sound, but it doesn't sound like the case here. – Andy Jun 17 '17 at 16:49
1

@Andy The question was about __best__ practice and then carried on to say "for example..." I have come across many cases where developers that were not a part of a specific team were utilising code from that teams repository in code that the original team knew nothing about. Very few of the VCSs that I have used over the years gave any way of checking __who__ was checking code _out_ of the repository, even ones like PVCS only let you see who has checked out for edit. Personally, I find the above well worth the effort and it has highlighted where code that I thought was dead was still kicking. – Steve Barnes Jun 17 '17 at 17:11
@SteveBarnes He did not ask for a best practice, he asked if what he is doing is a BAD practice. He's looking for advice specific to his situation; the premise of his question is that his coworkers are complaining about his approach. You may find your advice worth the effort, but in the OPs situation, I'd say its not, even if it is considered best practice. "Best practice" certainly does not mean "always apply," and may best practices aren't appropriate for very small teams because the benefits are outweighed by the costs, which is the case with your answer, IMO. – Andy Jun 17 '17 at 18:04

TheHans255 · Answer 4 · 2017-06-15T16:39:07.070

Yes, I would say it's bad practice, but not because it's in the files versus in the commit message. The problem is that you are trying to make a change without communicating with your team.

Whenever you make any change to the trunk - adding, deleting, or modifying files in any way - you should, before you commit back to the trunk, talk about those changes directly with a member (or members) of the team who would be directly knowledgeable about them (essentially, discuss what you would put in those headers directly with your teammates). This will ensure that (1) the code you're deleting really does need to be deleted, and that (2) your team will be much more likely to know that the code is deleted and thus not try revert your deletions. Not to mention that you'll also get bug detection and the like for the code you add.

Of course, when you do change big stuff, you should also put it in the commit message, but because you've already discussed it, it doesn't need to be the source of important announcements. Steve Barnes's answer also addresses some good strategies to use if the potential pool of users for your code is too large (e.g. using your language's Deprecated markers instead of deleting the files at first).

If you don't want to go that long without committing (i.e. your change makes the most sense as several revisions), it's a good idea to create a branch from trunk and commit to the branch, then merge the branch back into trunk when the change is ready. That way, the VCS is still backing up the file for you, but your changes are not affecting the trunk in any way.

score 1 · Answer 5 · answered Jun 16 '17 at 18:59

A related issue from projects that build on multiple platforms and configurations. Just because I determine that it’s dead doesn’t mean it's a correct determination. Note that dead in use might still mean vestigial dependence that still needs to be weeded out, and some may have been missed.

So (in a C++ project) I added

#error this is not as dead as I thought it was

And checked that in. Wait for it to get through the rebuild of all normal platforms and that nobody complains about it. If someone did find it, it would be easy to remove one line as opposed to being bewildered with a missing file.

I agree in principle that the comments you mention are duplicating the feature that should be done in the versioning management system. But, you may have specific reasons to suppliment that. Not knowing the VCS tool is not one of them.

score -1 · Answer 6 · answered Jun 18 '17 at 12:36

On the continuum of Bad Practices, this is quite minor. I will do this on occasion, when I want to check out a branch and be able to see the old code. This can ease comparison to the replacement code. I usually get a code review comment "cruft". With a little explanation I pass code review.

But this code shouldn't live for long. I generally delete at the beginning of the next sprint.

If you have co-workers who do not read commit messages or won't ask you why you left the code in the project, then your issue isn't one of bad coding style. You have a different problem.

this doesn't seem to offer anything substantial over points made and explained in prior 6 answers — gnat, Jun 18 '17 at 12:50

score -3 · Answer 7 · answered Jun 15 '17 at 03:34

-3

you can also refactor the class and rename it with a prefix to UNUSED_Classname before you pull your stunt. Then you should directly commit the delete operation too

Step 1: rename class, add your comment, commit
Step 2: delete file and commit

If it's dead code, delete it. Anybody needs it, they can go back, but the rename step will prevent them from using it directly without thinking.

Also teach people how to look at all commit messages for a file, some people don't even know that is possible.

answered Jun 15 '17 at 03:34

user275398

11

6

"Refactoring" doesn't really mean what you think it does – nicholaswmin Jun 15 '17 at 04:36
6

There's no need to rename the class/method to ensure it's unused, deleting it works just as well. Instead the procedure is the same as any other change: 1) *delete the dead code*, 2) ***run the tests***, 3) *if they pass, commit with commentary*. – Schwern Jun 15 '17 at 06:34
1

@user275398 I also think renaming the files is really too much. And also from technological point of view, SVN treats renaming like the file would is deleted and created with a new name, thus you have a break in the history then. If you would restore the renamed file and look on the SVN log of it, the history prior to the renaming is not available. For this you have to revert then the file with the old name. Refactoring is by the way something else, refactoring is organizing the existing code into a new structure. – Bruder Lustig Jun 15 '17 at 11:45
@BruderLustig: Good software certainly won't do that. Git has `git mv`, which tracks history. Mercurial likewise has `hg mv`. [Looks like `svn move` should work for SVN, too?](https://stackoverflow.com/questions/108567/moving-directories-with-history) – wchargin Jun 16 '17 at 07:27
@wchargin No, `git mv` just moves the file and stages a file deletion and a file creation. All the move tracking happens when you run `git log --follow` and similar. `git` has no way of tracking renames, it actually does not track *changes* at all; it instead tracks the states at the time of commit and finds out what has changed at the time when you inspect the history. – maaartinus Jun 18 '17 at 07:58

Is it bad practice to not delete redundant files right away from VCS but instead mark them as "To be deleted" with comments first?

Why doing this instead of deleting them right away?

7 Answers7