
I'm developing a .NET application that uses Google Protocol Buffers. Historically the application used the approach, advocated by the protobuf-net team, of decorating the classes with attributes instead of using .proto files.

I am now in the process of migrating part of the application's client to another technology, and there is a strong desire to start using the .proto files as the authority so that the two technologies can interoperate.

My plan is to generate the C# automatically from the .proto files. My question, however, is: should I check the resulting files back into source control? Note that I expect the code generation to be fast.

How do I choose an appropriate approach for this case? What is considered best practice?
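For illustration, a minimal .proto file of the kind that would become the shared authority for both technologies (the message and field names here are hypothetical, not from the actual application):

```proto
// person.proto -- illustrative schema shared by both client technologies
syntax = "proto3";

message Person {
  string name = 1;
  int32 id = 2;
}
```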

Dave Hillier

5 Answers


As a general rule, generated files do not belong in the source code repository.

The biggest risk you run when you do put those files in the repository is that they become out of sync with their source and the build runs with different protocol buffer files than you would think based on the .proto files.
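If the files do end up in the repository anyway, one lightweight way to mitigate this risk is a CI step that regenerates the code and fails the build when the result differs from the committed copy. A minimal self-contained sketch (the `printf` lines stand in for a real protoc/protogen run; all paths and file names are illustrative):

```shell
# CI drift guard (sketch): regenerate into a scratch directory and fail
# when the output no longer matches what is checked in.
set -e
work=$(mktemp -d)
mkdir "$work/checked_in" "$work/fresh"
printf 'class Person {}\n' > "$work/checked_in/Person.cs"  # copy in the VCS
printf 'class Person {}\n' > "$work/fresh/Person.cs"       # freshly generated
if diff -r "$work/checked_in" "$work/fresh" > /dev/null; then
  status=in_sync
else
  status=stale        # in CI you would exit non-zero here
fi
echo "$status"
```

In a real pipeline the comparison would typically be `git diff --exit-code` on the generated directory after regeneration.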

A few reasons for deviating from the general rule are:

  • Your build environment can't handle the additional build step automatically. If you have to generate the files by hand for every build anyway, you might as well put them in the VCS. Then it actually reduces the risk of a version mismatch due to fewer people having to do the generation step.
  • The generation step significantly slows down your build.
  • After generating them, the files are further modified by hand. In that case, they are not really generated files any more and thus belong in the VCS.
  • The source files change very rarely (e.g. they come from a third party that only provides updates every few months or so).
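When build speed is the only concern, the generation step can also be guarded with a make-style timestamp check so it runs only when the .proto actually changed. A self-contained sketch (file names are illustrative; `touch` stands in for real sources):

```shell
# Timestamp guard (sketch): skip the possibly slow generation step when
# the generated file is already newer than its .proto source.
set -e
work=$(mktemp -d)
touch "$work/person.proto"
sleep 1                          # ensure distinct modification times
touch "$work/Person.cs"          # "generated" after the .proto changed
if [ "$work/person.proto" -nt "$work/Person.cs" ]; then
  action=regenerate              # source is newer: run the generator
else
  action=skip                    # output is up to date
fi
echo "$action"
```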
Robert Harvey
Bart van Ingen Schenau

  • For point 3, you can check whether patch files are sufficient to keep things up to date. – ratchet freak Mar 27 '13 at 13:37
  • I would like to add that having proper configuration management of your build environment is critical. Just like binaries, generated code depends on both the input files *and* the exact tools used to create it. – sourcenouveau Mar 27 '13 at 13:48
  • The one-step-build criterion is the critical factor here. Without that, there is too much room for error to be able to tag a set of files as a particular build. If you can't do this, then going back in time to figure out when an error crept in is that much harder. – mpdonadio Mar 27 '13 at 18:18

First and foremost, version control software is there to help you do your job. So, the first question you ask shouldn't be "should I put X in version control?" but rather "will putting X in version control help me?". Often those two have the same answer, but not always.

In other words, instead of asking us whether we think these files should be in version control or not, ask yourself whether doing so is useful to you. If generating the files is fast, what value is there in saving the generated data? Are your build scripts designed such that they always generate the files, or do they first look to see if something is on disk?

You also have to ask yourself whether it's important to have a concrete record of what went into a build. If you need to precisely reproduce a build, there may be value in having those files in source control. It's conceivable that a future build may use a newer version of the tool which generates slightly different files -- though another option is to also version-control the tools themselves.
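One way to version the tools, as suggested above, without committing binaries is to record the generator's exact version alongside the sources. A sketch (the lockfile name is an assumption; `protoc --version` is protoc's real flag, with a fallback so the sketch runs even where protoc is not installed):

```shell
# Reproducibility note (sketch): capture the generator version used for
# this build so an old build can later be recreated with the same tool.
set -e
work=$(mktemp -d)
version=$(protoc --version 2>/dev/null || echo "protoc not installed")
printf '%s\n' "$version" > "$work/tools.lock"
echo recorded
```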

The bottom line is, don't worry so much about what you should or shouldn't do based on somebody else's ideas of a standard, do what's best for your project. Only you can answer that since we don't know all of the constraints and requirements you are dealing with.

Bryan Oakley
  • I try to use the best practices directly. If you take a best practice, and change it, then it is no longer a best practice. I also see value in standardization - although I try to avoid red tape. – Dave Hillier Mar 27 '13 at 14:10
  • +1: newer version of the tool generates slightly different files. – Mr.Mindor Mar 27 '13 at 14:51
  • @DaveHillier Best practices are, by definition, general rules. They always need to be considered in context. In other words, they are generally, but not always, the right way to do things. Follow the best practices unless you can clearly articulate and defend doing something different. – KeithB Mar 27 '13 at 21:09

A lot of files in .net are automatically generated by Visual Studio; nevertheless, we usually check these in so that anyone checking out the source code has a complete version that builds and runs. (See, for example, classes made from xsd files for deserializing xml.)

I would view these files the same way; they're C# source code that was generated via a tool but which is necessary for the program to build. Therefore, check them in.

Yamikuronue
  • While probably only relevant to IDEs, in that context I agree completely. There's not, AFAIK, an option in Visual Studio to run the generation tools when first opening the solution. As a result, on a clean checkout the solution will be broken because all the generated files are missing. – Dan Is Fiddling By Firelight Mar 27 '13 at 17:48

It sounds like you're referring to what The Pragmatic Programmer calls passive code generation.

Passive code generators save typing. [...] Once the result is produced, it becomes a full-fledged source file in the project; it will be edited, compiled, and placed under source control just like any other file.

Uses:

  • Creating new source files
  • Performing one-off conversions among programming languages
  • Producing lookup tables and other resources that are expensive to compute at runtime

I would put the generated source files into source control in any of the following situations:

  • If you're planning to modify the generated C# code
  • If you want to avoid having to generate the code before running in dev environments (e.g. if a tool must be installed to do the generation)
  • If the generation process is lengthy
Mike Partridge

If there will be no modifications to the generated files, there is no point in putting them in version control. You don't put compiled binaries in version control either, right? A basic rule of version control is that you don't need to store anything that can be generated from other files. Of course, if the generation is complicated or takes a long time, it might not be practical.
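A common way to enforce that rule is to ignore the generated output entirely, so it can never be committed by accident (the paths and patterns below are illustrative, not from the question's project):

```
# .gitignore (sketch) -- keep generated protobuf code out of the repository
src/Generated/
*.generated.cs
```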

simoraman
  • "You don't put compiled binaries in version control either, right?" We put compiled binaries in version control. For many years we had the rule of thumb not to do it. In the meantime we have seen that the build environment (including how well documented it was) differs slightly and you cannot reproduce old binaries. One stupid example is Firefox - its build has not been reproducible for years, and I think a lot of people have tried. So in addition to the source we now check in "final" versions. – Offler Mar 05 '19 at 15:33