-1

The context of this question is the early stage of introducing a VCS into an academic setting consisting of non-SW-engineers, largely unaware of modern best practices related to coding as a team. At the time of introducing the VCS, many projects already had non-negligible amounts of code (serving as the initial commits in the respective repositories).

Team member X, who spent a significant amount of time implementing a complex mathematical algorithm, and is the only one who really knows what's going on inside it, is reluctant to adopt VCS and instead prefers to distribute periodic snapshots of everything they worked on in a certain period of time, and have somebody else push it to the VCS. Whoever ends up pushing it, due to their inevitable lack of understanding of the modifications, has no way to split them logically into smaller (atomic) commits, and no way of documenting the changes other than "Changes by X from period Y" - resulting in one giant commit that spans many files. This of course defeats much of the purpose of VCS, turning it into little more than a file storage service.

I believe that person X doesn't bother with the VCS not because of malice or an inability to learn, but because they don't see the added value in this process. I would therefore like to explain to this person the significance of introducing modifications in small and well-documented commits, in the hopes of getting them on board.

We don't have a large team, we don't develop concurrently, we don't use automation, and we never use anything but the latest version of the code. Therefore, many of the usual arguments in favor of VCS aren't really applicable in our case.

The best reasons I could come up with are:

  • One day, somebody else will need to maintain/modify this code. The external documentation (i.e. article/thesis) and internal documentation (i.e. comments in the code) may not explain why certain implementation details are the way they are (e.g. default values). If some line was changed, and the change was properly documented, this can help avoid repeating old mistakes.

  • Unless you accompany your code with the exact commit messages that should go with it, information might get "lost in translation".

  • You're needlessly creating work for another person.

  • One or more of the reasons given as answers here.

What other arguments, specific to our scenario, can I use?

Dev-iL
  • Are they dealing with merge conflicts that happen during "period Y" (+ time it takes to send to the victim committer)? – Caleth Apr 23 '20 at 11:40
  • 2
    see [How do I explain ${something} to ${someone}?](https://softwareengineering.meta.stackexchange.com/a/6630/31260) – gnat Apr 23 '20 at 11:44
  • @Caleth never happened so far. gnat - There's indeed resemblance to the format you linked. I know my question is very borderline in terms of "off-topic" and "opinion based", but I think at least the question in the subject is sufficiently unique. – Dev-iL Apr 23 '20 at 11:57
  • 2
    If they are "not software engineers", but academic mathematicians working alone, and whose code is frequently negligible, and whose code nobody else can understand anyway, then why should they be aware of "modern best practices related to coding as a team"? I wonder whether you aren't attempting to invent a case where none exists, because "good practices" (and the administrative burdens that accompany them) in one context are not necessarily good practices in all. – Steve Apr 23 '20 at 11:58
  • 3
    I wouldn't give arguments. I would just present them with a fait accompli: "We are not going to accept any more snapshots, you must commit any work yourself". If you want to be extra diplomatic, wait for there to be a merge conflict, and phrase it as "we have no idea how to resolve this" – Caleth Apr 23 '20 at 12:02
  • @Steve Thanks, that's a valid point about good practices being context-dependent! There's no reason why they should be aware of these practices, which is exactly why I'm trying to expose and educate them gradually. In a way, the whole point is turning the codes from being negligible and incomprehensible into something that could, as a first step, be used in ongoing projects in the lab (without requiring a PhD), and in the long term, advance scientific reproducibility (possibly being publicly released). – Dev-iL Apr 23 '20 at 12:07
  • @Caleth Wish I could do that :) Unfortunately, by the time a merge conflict occurs in this code, this person will be long gone. This is exactly one of the generally good arguments/approaches that is not really applicable in our setting :( – Dev-iL Apr 23 '20 at 12:11
  • @Caleth, unfortunately that depends on who has the power to impose. In an academic setting above all, there is a real risk of finding intelligent characters, and merely exclaiming "I am King!" is unlikely to wash with anybody. – Steve Apr 23 '20 at 12:32
  • 1
    Dev-iL, the problem you're describing suggests to me that the real problem is that you're giving software work to people who are not actually qualified by skill or experience to do it. Would you have your mathematicians, without a shred of exposure to engineering, build physical gearboxes out of metal to implement their algorithms? If you want reusable code, your department may have to retain an experienced programmer to work with the mathematicians to structure and document a proper algorithm, and impose the requirement of communicability. – Steve Apr 23 '20 at 12:42
  • @Steve (It's funny, because our lab actually deals, among other things, with the design of gearboxes). The VCS equivalent of building gearboxes out of metal was never on the table. Most of the staff is expected to use the very basics of _commit, push and pull_ - so that's closer to playing with Lego blocks; anything more complex will involve a lot of hand holding. I'm the closest available thing to an experienced programmer, and already doing pretty much what you just said. All I need now is to come up with the right motivation, which is proving difficult. – Dev-iL Apr 23 '20 at 12:58
  • Reluctance to use a VCS happens mostly when people work alone on a code base. Have two people work on the same code base, then make the VCS the dedicated tool of sharing the code changes between each other, then the problem will solve itself. (Of course, there is also value for a single dev in using a VCS, but that's much harder to communicate). – Doc Brown Apr 23 '20 at 13:07
  • 1
    Dev-iL, it's not the complexity of using the VCS which I'm describing as like gearbox manufacture, it's the complexity of writing software (a) when done as part of a team, and (b) which achieves reusability. The justifications for VCS arise naturally when writing reusable software as part of a team. The problem is your students aren't writing such software - they're writing their own code, alone, and you possibly regard the result as either trivially simple or else unfathomable. You can't justify VCS in those circumstances, because it adds no real value. – Steve Apr 23 '20 at 13:42
  • 1
    Or to put it another way, you've already said you don't have a development team, that the code is unmaintainable, no automation exists, and previous versions are never in use. Why is it that, in those circumstances, you remain convinced of the benefits of using VCS in the highly specific way you wish it to be used? – Steve Apr 23 '20 at 13:48
  • 1
    Which VCS? Git? If so, I have a feeling that they would *love it* once they realise what they can do with local branches, and how that experience is *incomparably better* than "periodic snapshots" - how it lets them come back to any point in time, do all kinds of experiments safely, and undo them easily, etc. – Filip Milovanović Apr 23 '20 at 13:56
  • @Steve The situation that you describe arose when the team was still small, and everybody was working on their own isolated projects for years at a time. Now the situation is different, with multiple people of different qualifications needing to make use of one of the codes. This seems like an appropriate opportunity to introduce VCS, in hope of gaining both the intrinsic benefits of VCS and as a byproduct, improve reusability and maintainability (I believe several people were now assigned to work closely with the original developer to achieve this goal), so we'll get there sooner or later... – Dev-iL Apr 23 '20 at 13:57
  • 1
    Also, I would give them reasons that would benefit *them personally*. Maybe have them do a couple of 1h-2h pair programming sessions (on some small task, or a toy project) with you or someone else who knows how to use the VCS, and have them see that it's not too complicated, but that it can provide value, e.g. it could help them understand their own code when they come back to it two weeks from now. Show them how to examine changes in a diff tool in the first session, and then in the later ones, how they can work with branches, etc. – Filip Milovanović Apr 23 '20 at 14:05
  • @FilipMilovanović Yes, it's Git. Do you feel like there are arguments that are specifically applicable to Git and not to other VCS? I didn't think it mattered, so I didn't mention it. I suppose "it can save you a lot of time when you're experimenting with different things" is a statement that is correct regardless of VCS. – Dev-iL Apr 23 '20 at 14:06
  • "is a statement that is correct regardless of VCS" - no, Git is especially well suited for this because it makes it very easy to create a branch, and branches are implemented in an extremely lightweight manner. – Filip Milovanović Apr 23 '20 at 14:07
  • @FilipMilovanović Alright, I will look into this some more, thanks for the tip! (I personally only ever worked with hg and git, and in both cases only with graphical clients - this is why the difference wasn't very noticeable for me. But I might be able to leverage it as a "VCS system specifically chosen for its usefulness for experimentation of the sort we're doing on a daily basis!".) – Dev-iL Apr 23 '20 at 14:11
  • Dev-iL, I see. I don't think there's going to be a silver bullet for you - neither a decisive argument for VCS, nor any other especially useful advice. I would suggest being prepared to accept that there could be legitimate differences of opinion on the balance between administrative burden and the benefits of using a VCS in the style you wish. And bear in mind that true reusability and maintainability are properties of code produced industrially at great expense. Don't set standards excessively high - instead, accept that knowledge will be lost with the person. – Steve Apr 23 '20 at 14:44
  • 1
    In Git parlance, what you're describing here is a **Pull Request.** – Robert Harvey Apr 23 '20 at 14:59

2 Answers

6

Personally I wouldn't even try to break a large chunk of code into lots of little commits. You're bound to end up with intermediate versions that don't work anyway. Just check the whole lot in in one go. The commit message isn't that important to most people; it's just documentation. What's more important is what VCS gives you:

  • A known latest version. If you just keep sending copies of the software to anybody you think is interested, pretty soon nobody will know what is the latest version, and they will end up using obsolete ones without knowing it.
  • An incremental backup system. If you do something stupid and delete your software, that's no problem. It's in the VCS. If you make a change that completely breaks things and you don't know why, and don't know how to repair it, just revert back to the previous working version. It's in the VCS.

It's being able to revert bad edits that is the big win for frequent incremental check-ins. There have been several times when I have made changes to a bit of code and broken it. But by the time I realize I have broken it, I don't know what to do to put it right again. And there have been times when I have mentally kicked myself for not checking in the last working version. You can waste hours removing all your edits, then re-implementing the changes you wanted to make.
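To make the "just revert to the previous working version" point concrete, here is a minimal sketch in Python that drives Git through `subprocess`. It assumes Git is installed and on the `PATH`; the file name `solver.py`, the `TOLERANCE` value, and the demo identity are made up for illustration and are not taken from the question. It commits a working file in a throwaway repository, breaks it with an uncommitted edit, and then discards that edit with `git checkout HEAD -- <file>`.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path


def git(repo: Path, *args: str) -> str:
    """Run a git command inside `repo` and return its standard output."""
    result = subprocess.run(
        ["git", *args], cwd=repo, check=True, capture_output=True, text=True
    )
    return result.stdout


def demo_restore_last_working_version() -> str:
    """Commit a working file, break it, then restore the committed version."""
    if shutil.which("git") is None:
        raise RuntimeError("git is not installed or not on PATH")

    with tempfile.TemporaryDirectory() as tmp:
        repo = Path(tmp)
        script = repo / "solver.py"  # hypothetical file name

        git(repo, "init", "--quiet")
        # Local settings so the demo does not depend on any global Git config.
        git(repo, "config", "user.name", "Demo User")
        git(repo, "config", "user.email", "demo@example.com")
        git(repo, "config", "commit.gpgsign", "false")

        # The last known working version goes into the repository.
        script.write_text("TOLERANCE = 1e-6\n")
        git(repo, "add", "solver.py")
        git(repo, "commit", "--quiet", "-m", "Add solver with working tolerance")

        # An experiment that breaks things (never committed).
        script.write_text("TOLERANCE = 1e6\n")

        # Throw the broken edit away and go back to the last commit.
        git(repo, "checkout", "HEAD", "--", "solver.py")
        return script.read_text()


if __name__ == "__main__":
    print(demo_restore_last_working_version())
```

The smaller and more frequent the check-ins, the less work is thrown away by that last command. With one snapshot per "period Y", the only version you can fall back to may be weeks old.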

Simon B
  • I think the part about various people using potentially obsolete versions is highly relevant. +1 – Dev-iL Apr 23 '20 at 13:04
  • 1
    Some of my commit messages span paragraphs... that documentation is important... – D. Ben Knoble Apr 23 '20 at 13:32
  • @D.BenKnoble I must admit that I tend to go the other way. My messages can be as terse as "Fix for ticket #123". If you want to know what the problem was, look at ticket #123. If you want to know what I changed, ask the VCS. – Simon B Apr 23 '20 at 14:56
  • It’s about the why, not the what. Diffs show the what, sure, but not why solution A was chosen over candidates B and C that I tried or thought about. I learn from commit messages in projects when I’m in unfamiliar code; having blame right in my editor makes this really easy and useful – D. Ben Knoble Apr 23 '20 at 16:14
0

Your problem is that you hired a mathematician who is not a software developer, who is either unwilling or unable to follow decent software development processes and, since nobody can understand his code, unable to develop decent software. That’s the problem that you have and that you need to solve. And your management needs to realise that if he decides to leave, you are in an awful mess.

What you could do to increase your company’s bus factor is hire someone who is both a decent mathematician and a good software developer, and let them unravel the code bit by bit, making it clear to the original developer that if he tries to work against this he will be history. Let the new guy do code reviews, and if the code is not accepted, it needs changing. Pay the new guy more and tell the old guy. Explain to the old guy why he is paid less: because he produces mathematics, but not value to the company.

PS. Whether he commits any snapshots himself doesn’t make any difference to your problem.

gnasher729
  • 1
    I stand to be corrected, but I gathered from the OP that the mathematicians concerned are *students* of the department, not paid employees. – Steve Apr 23 '20 at 13:58
  • 1
    @Steve Yes, that is true. This is why incentives through differential pay are not really an option for us. Otherwise this is reasonable advice that could be worthwhile in another setting. I mildly disagree with the claim that it doesn't matter who commits the code, because I believe that once the person goes through the process, they'll start seeing the benefits better. – Dev-iL Apr 23 '20 at 14:06
  • @Dev-iL No, the commit is just extra work that the guy doesn’t want to do. He won’t see benefits. He doesn’t care about yesterday’s code. He also doesn’t see that he’s a pain for everyone else. – gnasher729 Apr 24 '20 at 07:42