Would it be a bad idea to periodically run code formatters on a repository?

Question

I'm thinking of creating a cron job that checks out code, runs code formatters on it, and if anything changed, commits the changes and pushes them back.

Most projects that use autoformatters put them in a git hook, but doing it automatically every few hours would remove the burden for each dev to install the git hook.

I would still encourage everyone to write clean, well-formatted code, and maybe I can have the system automatically ping devs when code they wrote gets reformatted, so they know what to do in the future.

There's a logical error here. This method wouldn't encourage people to write well-formattted code; rather it would encourage people *not* to format and rely on the system instead! If your concern is that the code base is coherent, that's okay. It your goal is to train coders to write cleanly, it's a huge bug. — Kilian Foth, Apr 12 '17 at 06:24
Oh, I think my intention wasn't clear. I meant that I would personally encourage people to write well-formatted code, and suggested that I might even be able to have the "bot" notify people when their code isn't well formatted — bigblind, Apr 12 '17 at 06:27
It's also with considering the effect this will have on features such as blame - espsecially if the formatter makes a lot of changes, if most lines in the file are marked as being changed by the formatter, you loose some of the value. — Neil P, Apr 12 '17 at 08:38
I've had issues with an autoformatter not being updated properly and breaking my code. Something to keep in mind. — isaacg, Apr 12 '17 at 09:01
If it is that important to you, you could fail the build when code in not formatted correctly. That is a bit harsh but a proper peer review would more or less do the same thing. — Emond, Apr 12 '17 at 12:59
You can have automated tests on pull requests that include code format validation (and common mistake detection) so that people need to fix those before having their code reviewed — njzk2, Apr 12 '17 at 13:34
@KilianFoth - or you could have the reformatter make annoying code changes. Cut off the team's nose to spite their faces. — JeffO, Apr 12 '17 at 13:36
This is an excellent idea if your goal is to get people to hate you. It achieves very little, but will certainly cause unforeseen problems. — TonyK, Apr 12 '17 at 14:08
Auto-refactoring for style has gotten more powerful, but its not consistent enough yet to skip a human review of the output. Sometimes, some refactoring make things worse, visually. — Mark Rogers, Apr 12 '17 at 14:17
If an automatic code formatter existed that created the correct result every time, then we should be using it the other way around: store unformatted code in the repository and let each developer have their own code format that they prefer and format the code their preferred way on checkout. But, no: code formatting isn't just rules but also art. A code formatter can be useful, but the results always need checking by a human. — Colin 't Hart, Apr 12 '17 at 16:24
We had this at our company for and it worked well when we were a small team, up to ~15 devs devided among 3 projects that merged only occasionally between each other. As we grew, we had to drop it because of the merge conflicts this caused. Something to keep in mind. — Teimpz, Apr 12 '17 at 16:27
Isn't this what `pre-commit` / `post-receive` (format / validate) hooks are for? Or the equivalent in your VCS of choice — Mark K Cowan, Apr 12 '17 at 17:13
One suggestion I like (but haven’t been on a project that tried) is to reformat the code when you start on a new version. — Davislor, Apr 12 '17 at 21:15
There's also the option to deny merges with diffs that break style guidelines. It's much easier to detect bad style than correct it, as it's not easy to get 100% accuracy. — Tyzoid, Apr 13 '17 at 02:18
@KilianFoth I've seen a lot of developers in our project don't format their code properly because they said that it'll automatically done when doing a git push. And I also see many cases when the formatter makes a properly formatted code with macros or comments into a mess and there's no way to fix it — phuclv, Apr 14 '17 at 03:26
Yes. Sometimes code is formatted 'against' the rules in order to be more clear. For example it may align subexpressions to match one another or if API is accepting n×m matrix as single list code may actually format it as n×m arguments even if the API does not see it that way for whatever reason. When I was at university I had to undo my colleague autoformatter change as it made code unreadable - even if it was technically formatted correctly and AST did not changed. — Maciej Piechotka, Apr 14 '17 at 04:58
At least support formatter:off formatter:on annotations to help with some code that may break or become less readable. These tags are supported in both eclipse and Intellij Idea. See http://stackoverflow.com/questions/1820908/how-to-turn-off-the-eclipse-code-formatter-for-certain-sections-of-java-code — Borjab, Apr 17 '17 at 14:04
It might not be a bad idea to do something like this as a step in merging changes into master. You could see if running the formatter changed anything and automatically put up a warning/comment in the code review, if you want code review to consider formatting. This would only really work if you were diligent in using formatter:on/off annotations when you need to handle formatting yourself, though, as otherwise it would get cluttered with past manual formatting — Justin, Nov 23 '17 at 00:06

score 132 · Accepted Answer · edited Apr 12 '17 at 12:34

132

Sounds nice, but I would prefer to have people responsible for committing code changes, not bots.

Besides, you want to make absolutely sure that those changes do not break anything. For example, we have a rule that orders properties and methods alphabetically. This can have an impact on functionality, for example with the order of data and methods in WSDL files of WCF contracts.

edited Apr 12 '17 at 12:34

Glorfindel

3,137
6
25
33

answered Apr 12 '17 at 06:43

Marcel

3,092
3
18
19

51

Also it kinda breaks the VCS. You won't be able to know who wrote a line easily with blame, and if there is a conflict, your VCS won't be able to know the tool's changes are just for style and should be discarded everytime. – Pierre.Sassoulas Apr 12 '17 at 10:13
2

A tangentially related topic is enforcing certain "save actions" in an IDE like making fields final if possible. It's an issue because (In Java at least) a lot of frameworks set them through reflection (e.g., Spring and Hibernate). – Captain Man Apr 12 '17 at 13:25
@Pierre.Sassoulas - I agree with you if the goal is to get all members of the team to take pride in their code instead of finding blame (CYA). Everyone wants to build things they're proud of, so instead of saving the world working with bleeding-edge technology, we should concern ourselves with quality code. – JeffO Apr 12 '17 at 13:41
20

@JeffO `git blame` isn't just to find whose fault a bug is, it's to find the commit when something was added/removed, which can be useful for a number of reasons. – Pokechu22 Apr 12 '17 at 16:12
@Pokechu22 - I couldn't agree with you more. – JeffO Apr 12 '17 at 20:42
I've worked on projects whose management discouraged *humans* from reformatting large chunks of code, for exactly these reasons. – Camille Goudeseune Apr 12 '17 at 20:49
2

"we have a rule that orders properties and methods alphabetically" Boy, that sounds awful! You would have to constantly be hoping around all over the file – Alexander Apr 14 '17 at 21:47
@Alexander: It really depends on the individual code base, just like most rules of style. – Dietrich Epp Apr 15 '17 at 02:24
@Alexander A good IDE can show you the defintion of the symbol under the cursor. This helps quite a lot. – Thorbjørn Ravn Andersen Apr 15 '17 at 11:58
1

Also for Java the combination of a style checker which looks for derivations from the agreed upon style (perhaps even making it a compilation warning), and an IDE configured for formatting the agreed upon style makes it very easy to detect and fix. It is important that code settles (for lack of a better word) when it actually changes functionality - not when a bot reformats it. – Thorbjørn Ravn Andersen Apr 15 '17 at 12:01
@Alexander-ReinstateMonica I took the rule as a lesson, that order should not matter to my mental model, better useing code navigation, discard the "region" concept and use partial classes where appropriate. In the end, it's a good thing, helping use the "compare" feature when changing files. – Marcel Nov 20 '20 at 07:06

Doc Brown · Answer 2 · 2017-04-13T08:50:47.870

75

I would instead try to make it really easy for everyone in the team to apply automatic code formatting according to your team's standard directly inside your editor or IDE, to the current source code file (or selected parts of it). This gives your team members more control about how and when the formatting takes place, let them inspect the code before it is committed in the final form, and test it after the formatting took place, not before.

If all or most of your team members use the same editor, this should not be too hard. If everyone uses a different one, then your approach might be the second best solution, as long as the team supports this. However, I recommend you have extra safety measures installed, like nightly builds and automated tests which are run after each automatic code modification.

edited Apr 13 '17 at 08:50

answered Apr 12 '17 at 06:46

Doc Brown

199,015
33
367
565

9

"A formatter at hand where you are 100% sure it does not break anything" - When have you ever run into a piece of *code* that you were, honestly, 100% sure would never break? – Zibbobz Apr 12 '17 at 13:29
2

@Zibbobz: any of the tools we are using to edit code, to compile and link it, and also the VCS, can have bugs, still that does not stop us trying to develop working programs ;-) But see my edit. – Doc Brown Apr 12 '17 at 14:07
I do approve of the edit - if only because an extra ounce of caution is worth a pound of debugging. ;) – Zibbobz Apr 12 '17 at 14:15
@Zibbobz Quite often! Note that being 100% sure of something doesn't mean it has a 100% chance of being true. – user253751 Apr 15 '17 at 11:45
@immibis: you may have missed that the comment referred to a sentence I removed from my answer some days ago. – Doc Brown Apr 15 '17 at 13:07

score 37 · Answer 3 · edited Apr 12 '17 at 16:41

37

It's a bad idea, not just because it discourages people from writing decent code, but also because the reformatting will show up as code changes in your VCS (you do use one I hope), masking the historical flow of the code's development. Even worse, every code formatting action (indeed every change to the code at all) has the possibility to introduce bugs, whether it's manual or automated. So your formatter can now introduce bugs in your code that aren't going to go through code reviews, unit testing, integration testing, etc. until possibly months later.

edited Apr 12 '17 at 16:41

donjuedo

123
3

answered Apr 12 '17 at 07:13

jwenting

9,783
3
28
45

10

I think if he's talking about having a bot check things in and out of a repository, VCS is a given? – Weaver Apr 12 '17 at 07:35
11

@StarWeaver you'd think. But I've worked in places where the "code repository" was a shielded directory only accessible by a piece of software running under its own user account, no version control at all except a weekly backup of the directory under a timestamped directory name. – jwenting Apr 12 '17 at 07:40
14

That's … I'm going to go have nightmares now >.> – Weaver Apr 12 '17 at 08:00
7

@StarWeaver why do you think I still remember that environment 14 years later ;) – jwenting Apr 12 '17 at 13:47

score 29 · Answer 4 · edited Apr 28 '17 at 15:21

29

I would tend to believe it is a good idea (to automatically run code formatters), but that is just my opinion.

I won't run them periodically, but if possible before version control commits.

With git, a pre-commit hook doing that would be useful. In many C or C++ projects built with some Makefile, I am adding some indent target (which run suitably code formatters like indent or astyle) and expect contributors to run make indent regularly. BTW, you can even add some make rules to ensure that the git hooks have been installed (or to install them).

But really, it is more a social issue than a technical one. You want your team to commit clean and nicely formatted code, and that is a social rule of your project. (There is not always a technical answer to every social problem).

Version control is mostly some tool to help communication between human developers (including your own self a few months from now). Your software don't need VC or formatting, but your team does.

BTW, different communities and different programming languages have different views on code formatting. For example, Go has only one code formatting style, but C or C++ have a lot of them.

edited Apr 28 '17 at 15:21

jskroch

109
3

answered Apr 12 '17 at 06:30

Basile Starynkevitch

32,434
6
84
125

Right, but that means every developer needs to install the commit hook. – bigblind Apr 12 '17 at 06:31
17

Yes, but that is a social rule, and you need several of them. You cannot find a technical solution to every social problem. You need to convince people. Also, every developer need to install some compiler, and the version control system itself. – Basile Starynkevitch Apr 12 '17 at 06:32
Right, so you're saying the social rule of having to install a git hook, is better than an automated system that does it? (serious question, not meant as a rhetorical one, tone is hard to convey on the internet :)) – bigblind Apr 12 '17 at 06:35
15

Yes, I am saying that. – Basile Starynkevitch Apr 12 '17 at 06:37

Karl Nicoll · Answer 5 · 2017-04-12T14:23:29.603

In general, I think it is a bad idea. In principle it's a valid idea, but it can be problematic in reality. Having the code formatter break your code is a real possibility, and it only takes one formatting run to have your developers respond with (probably justified) hostility (e.g. "Your lousy code formatter broke the build, turn it off now!").

In the same vein as @BasileStarynkevitch's recommendation, we use git server-side post-receive hooks to send "advisory emails" about code style.

If I push a commit that contains style violations, the git origin server will send me an email that informs me that I broke the style guidelines and recommends that I fix my code. It does not, however, enforce this because there may be valid reasons for breaking the house style (long strings exceeding line length limit, for example).

If it's a systemic problem that is harming the codebase, it might be time to start bringing up code style issues in code reviews. Poor code style may mask bugs and make the code harder to read, so it can be a valid code review issue.

To add to the "social problem" aspect of things, it can be worthwhile encouraging people to fix cosmetic and stylistic defects as they find them. We have a standard commit message "Cosmetic." for code style fixes that other developers know do not contain breaking changes.

As @DocBrown says, another option is to enforce code style within your IDE. We use CodeMaid with Visual Studio to correct many common style errors. It will run upon saving the code files, meaning that bad-style code should never even make it into the repo... in theory :-).

I upvoted this one, because it directly addresses both sides (desire for better code + safety from breaking the build). _After_ the code base is in really good shape, though, I'd consider "bad idea" to be somewhat strong wording for automating future changes. — donjuedo, Apr 12 '17 at 14:19

score 17 · Answer 6 · answered Apr 12 '17 at 13:23

17

I think it's a bad idea. Many of the answers already covered that it dirties the history by making it difficult to pin down who actually added a line and that it encourages people to just commit whatever and the format-bot will handle it.

A better approach would be to incorporate a format checker into your build tool. (In Java there is Checkstyle) Then, only allow people to merge their branches to the main branch if the build passes (including the formatting).

If you allow people to commit straight to the main branch (like in Subversion for example) then you will still need to make sure everyone has the discipline to commit only formatted code (or have the server only accept the commits once some checks have been run).

answered Apr 12 '17 at 13:23

Captain Man

586
5
16

2

but do make sure the style checker is sane and doesn't enforce weird things that go counter to any accepted (and acceptable) practice (and yes, I've seen such). You can then also have the build server have the automatic build fail when the style checker throws (new) errors. – jwenting Apr 14 '17 at 06:13
2

I have seen many cases where automatic formatting produces unreadable code. Especially for lots of closing parens (it is sensible to leave some of them on separate lines, but the robot does not care). Another example is automatic splitting of long Java strings (they are long because they contain SQL queries, but the robot has no idea.). – 18446744073709551615 Apr 14 '17 at 17:28
@18446744073709551615 I'm not suggesting automatic formatting, I'm suggesting automatic *checking*. Your rules could allow those things. – Captain Man Apr 14 '17 at 21:27

score 10 · Answer 7 · answered Apr 12 '17 at 12:23

Yes, I think it's a bad idea. Don't get me wrong, the reason to do it sounds great, but the result could still be horrible.

You will have merge conflicts when pulling a tracked branch, atleast I fear that would be the case, I might be wrong though.

I don't want to test it right now at work, but you should try it out yourself.

In fact you can just check out a recent commit. Make a new branch, commit something petty, cherry pick or merge with no autocommit.

Then run your script, pull and if your result is a horrible merge mess, then you should definitly not do this, at daylight.

Instead you could potentially put it into a nightly build or a weekly build.

But even a nightly might be a bad idea.

You could either run it weekly, when you are sure no merge conflicts will arise because everything is finished on monday.

Otherwise run it 1-2 times a year on holiday season, when merge conflicts will not occur.

But the solution might depend on your priority for code style.

I think that making a setup script that automatically creates the git repository and sets the hooks for the project would be better.

Or you might include the hook setup script in a folder for your devs within the project and simply check it into git itself.

gnasher729 · Answer 8 · 2017-04-13T18:30:55.817

It’s an awful idea.

If one of my developer colleagues made gratuitous changes to source files, it wouldn’t pass a code review. It just makes life harder for everyone. Change the code that needs changing, nothing else. Pointless changes lead to merge conflicts, which can lead to errors, and just create pointless work.

If you want to do this on a regular basis, that’s just awful.

And then there is the question of what kind of changes the code formatter makes. I use automatic formatting in my editor, it works reasonably well, and I can improve things when the automatic formatting is less than perfect. If you use a code formatter that goes beyond that, you are not going to improve the code, you are going to make it worse.

And then there is the social problem. There are people who want to force everyone to use their code style, and there are more flexible people. Something like this would likely be suggested by the "grammer-nazi" (spelling intentional) type of developer who want to force their style on everyone else. Expect a backlash, and expect the flexible, normally easy-going developers to put their foot down.

score 7 · Answer 9 · answered Apr 13 '17 at 18:21

7

Something I haven't seen mentioned is the fact that sometimes there are legitimate reasons to not format something according to a set of rules. Sometimes the clarity of the code is improved by going against a given guideline that 99% of the time makes sense. Humans need to make that call. In those cases, automatic code formatting would wind up making things less readable.

answered Apr 13 '17 at 18:21

Andy Lester

4,810
2
16
23

Totally agree. For example, aligning variable identifiers on the left, all the equal signs in a single column, and all the values to the right of equal signs. A formatter would depreciate readability in this case by reducing the tabs/spaces each to a single space. Of course formatter rules could be written to prevent this but then you need a developer assigned just to writing the code formatter. Seems like a bad idea all around. Adopt better in-house conventions and enforce during code review. – ThisClark Apr 15 '17 at 03:27

score 4 · Answer 10 · answered Apr 12 '17 at 08:00

You don't mention the VCS you use but depending on that another option is to have a server-side hook. A VCS like git supports that. You could install a sever-side hook that runs the formatter on the version being pushed and then compares the formatted file against the version that is being pushed. If they differ the developer did not use the correct formatting and the server could reject the push. This would force your devs to only push code with the desired formatting thus encouraging them to write clean code from the beginning, it would make the devs responsible for having tested the correctly formatted code and relieve your devs from having to manually install a client-side hook.

This is what Go does, for example, except using Mercurial. Pushing something that is not invariant under `go fmt` is automatically rejected. — Jörg W Mittag, Apr 12 '17 at 08:39

score 1 · Answer 11 · answered Apr 12 '17 at 12:51

It's a trade-off between a cleaner code format and more exact and easier to understand git history. Depends on nature of your project and how often do you dive in git history or blame to understand what's going on. If you're working on something new and don't have to maintain backwards compatibility, history usually doesn't play as important role.

score 1 · Answer 12 · answered Apr 12 '17 at 22:28

This idea is similar to some other answers, but I cannot comment my suggestions.

One option is to set an alias (or hook, or whatever) to the commit function, that runs the code formatter on the to-be-committed code before it is committed.

It could have 2 (or more) results:

1) Show the user the proposed changes and ask for their approval to apply and commit the changes.

2) Ignore the proposed changes and commit the original code.

You could also add more flexibility to these options, like the ability to edit the changes proposed in option 1. Another idea (Depending on how hard you want to push these coding standards) is to have the system send you a report of some sort when option 2 is selected.

This may be a good balance of auto-checking all code like you want, while still allowing flexibility to the developers to suit their needs. It also allows the option to not 'auto reject' code with formatting differences as mentioned in other answers. With the 'I have reviewed and approve automatic formatting corrections; commit' option, it still keeps personal accountability for each developers work and doesn't mess with VCS.

score 0 · Answer 13 · answered Apr 28 '17 at 17:11

I won't do it on the repository but I will do it on save if the tool supports it. Eclipse is one and in addition I would do the code cleanup including sorting.

The nice part is that it is part of the project so every developer will get it for their project.

As an added bonus merges would be significantly simplified since things will not be jumping around.

Code reviews will prevent any errant ones.

Another place I would do it is have it part of the build. In my case I have it such that maven builds will reformat the XML and clean up pom files and reformat the code.

That way when developers are ready to push it will all be cleaned up for their pull request.

Would it be a bad idea to periodically run code formatters on a repository?

13 Answers13