3

Sometimes code needs to be moved. A common example: logic that lives in a controller has to move to a helper class so it can be called from outside that controller. Someone who didn't write the original logic cuts the code from the controller and pastes it into the new class. When that happens, the git attribution shows the new person as the author of that code, and the original author and date are no longer attributed.

Currently, we make sure to leave a comment in the new class when this happens. Is there a better way to handle this situation?

Just to clarify: the issue isn't about copyright or anything like that; it's all internal code in my company's private repo. The concern is just to have an accurate history in the repository.

chiliNUT

2 Answers

2

Though Thomas Owens's answer is technically correct, I am not sure if it really addresses your question.

Yes, Git (and most other SCC systems) maintains information about the person who committed each line of code for the entire history of the project, but not who wrote each line of the code originally, or if code was cut-and-pasted from one file to another. This is actually not what Git is designed for.

The root cause here is that the only thing Git sees of the source code is the textual content of the files at certain points in time, not the editor commands (like cut, copy and paste, or just typing characters) which created that content. An SCCS which could automatically track the "real" author of a block of source code over its full lifetime would have to record those commands, which means it would have to be very deeply integrated into your text editor or IDE. I am not aware of any SCCS or IDE which does this (but maybe something like this exists somewhere, I honestly don't know).

Hence, if you think you really need to keep the history of original authors, your approach of leaving a comment saying where a piece of code was taken from is the most effective thing you can do. For most practical purposes, this is totally sufficient. Knowing the author of a piece of code may sometimes help, but in reality this information is seldom required. And for those rare occasions, it is usually sufficient to work through the commit history and "reverse-engineer" from the commits who might have been the "real" author of a piece of code. In larger, older code bases, this often turns out to be a person who isn't reachable nowadays, in which case the author's name would probably be useless anyway.

Doc Brown
  • I work for a small/midsize company, and turnover is low, so most of the time it's more than likely that the author still works here – chiliNUT May 26 '22 at 02:10
1

Each commit is maintained in git's history. If you inspect an earlier commit (for example with a diff between two commits, or with `git blame` at that commit), you can see who the last editor of each line was and in which commit it was modified. In other words, unless you start rewriting git's history, git maintains information about the person who committed each line of code for the entire history of the project. You would just need to navigate back through the commits to see it.
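
Git can sometimes do part of that navigation for you: passing `-C` three times to `git blame` asks it to look for lines that were moved or copied from other files in any commit. The detection is heuristic, so small or heavily edited blocks may still end up attributed to whoever moved them. Here is a minimal Python sketch, assuming `git` is on your PATH; the names `BlamedLine`, `blame_command`, `parse_line_porcelain` and `blame_with_copy_detection` are made up for illustration.

```python
import subprocess
from typing import List, NamedTuple


class BlamedLine(NamedTuple):
    author: str       # author of the commit git attributes the line to
    origin_file: str  # file the line lived in at that commit
    text: str         # content of the line


def blame_command(path: str) -> List[str]:
    # -C three times: also look for copies from other files in any commit.
    # --line-porcelain: repeat the full commit header for every line.
    return ["git", "blame", "-C", "-C", "-C", "--line-porcelain", "--", path]


def parse_line_porcelain(output: str) -> List[BlamedLine]:
    """Turn `git blame --line-porcelain` output into one record per line."""
    records: List[BlamedLine] = []
    author = origin = ""
    for raw in output.split("\n"):
        if raw.startswith("\t"):
            # A TAB-prefixed line is the source line itself and closes the group.
            records.append(BlamedLine(author, origin, raw[1:]))
        elif raw.startswith("author "):
            author = raw[len("author "):]
        elif raw.startswith("filename "):
            origin = raw[len("filename "):]
    return records


def blame_with_copy_detection(path: str) -> List[BlamedLine]:
    # Hypothetical convenience wrapper; run it from inside the repository.
    result = subprocess.run(
        blame_command(path), capture_output=True, text=True, check=True
    )
    return parse_line_porcelain(result.stdout)
```

If the copy detection succeeds for a moved block, `origin_file` would point at the controller rather than the helper, and `author` would be whoever committed the lines there.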

Thomas Owens
  • Plus, if the controller goes from `big() ball() of() mud()` to `helper.foo(...)` you will see in that commit that the big ball of mud was replaced by a single-line method call. Any reasonably experienced developer will be able to track down where the mud currently lives. – Greg Burghardt May 24 '22 at 17:37
  • Link the previous commit or repo URL in the commit log message if the code is a copy of old code. My company used to love shared components, which were easy to track to a module/artifact, but now we have separate components in repos we do not have access to, so I always add a link to where I copied the code from in the first commit. I have a script that can create history logs based on commit logs and the URLs enclosed. – MortenB Jun 01 '22 at 08:15
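
A script along the lines of the last comment could start from something like the sketch below. This is only a guess at the idea, not the commenter's actual script: the names `LOG_FORMAT`, `links_by_commit` and `read_log` are hypothetical, and the URL pattern is deliberately simple (it will, for instance, keep a trailing full stop).

```python
import re
import subprocess
from typing import Dict, List

# %H = commit hash, %B = raw message body; %x1f and %x1e emit separator bytes
# so that multi-line messages can be split apart unambiguously.
LOG_FORMAT = "%H%x1f%B%x1e"
URL_PATTERN = re.compile(r"https?://[^\s<>\"')]+")


def links_by_commit(log_output: str) -> Dict[str, List[str]]:
    """Map each commit hash to the URLs found in its message."""
    found: Dict[str, List[str]] = {}
    for record in log_output.split("\x1e"):
        record = record.strip("\n")
        if not record:
            continue
        sha, _, message = record.partition("\x1f")
        urls = URL_PATTERN.findall(message)
        if urls:
            found[sha] = urls
    return found


def read_log(repo_dir: str = ".") -> str:
    # Assumes `git` is on PATH and repo_dir is inside a repository.
    result = subprocess.run(
        ["git", "log", f"--format={LOG_FORMAT}"],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    )
    return result.stdout
```

Calling `links_by_commit(read_log())` would then list every commit whose message points back at the place the code was copied from.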