C# Source Generator: which way to go for a simple GitHub community project?

Question

CONTEXT

My project implements the "conventional method for house-energy-efficiency assessment" in France (120 pages). I want to provide a strict yet easy-to-use API for this method. At first, the audience will be engineers who are not programmers but know enough C# to consume such an API inside the Rhino & Grasshopper software.

I feel I should resort to some kind of code generation for this project:

The method relies on lots and lots of predefined data tables : weather, materials, equipment efficiency factors, ...
The method is likely to be updated from time to time (say on a yearly basis)

Architecture & Code Generation

Provide these data as .json files : one source of truth, easy to read and check for contributors, easy to serialize/deserialize.
Use these .json files to generate part of my API (less error prone, more flexibility at design time, yet very strict to consume by the client).
Maybe use these .json files later on to produce markdown tables for, let's say, a médocs wiki on the method.

In the end, my goal is mainly to generate enums and double[] lookup tables (as static properties) in partial classes.

Already Tested | not satisfied

T4 Templates: too tightly tied to Visual Studio on Windows (?)
C# Source Generator: the problem here is that the workflow gets a lot more complex for a non-programmer community audience working through a GitHub portal. Because the code is injected at compile time, the generated files are not themselves part of source control; only the generator project is. The testing workflow will also be a lot harder to understand.

Is this the right workflow ?

I feel that in my situation it is better to go for a two-step build workflow:

Have the main API project, OpenDPE.Core, define my core architecture with all the required types (House.cs, Wall.cs, Floor.cs, Weather.cs ...) as partial classes.
Have an additional OpenDPE.Core.Generators project build all my enums and "inject" all my static lookup-table properties into the partial classes (House.g.cs, Wall.g.cs, Floor.g.cs, Weather.g.cs ...). Maybe use the Scriban templating engine here?

This way, the main project is more a "classic" standalone project easy to git/test/share.

QUESTION

What do you think of this architecture ? Do you see some drawbacks ?
Is there a way to automate the workflow at debug time, that is, can building the main project first trigger the generation of my *.g.cs files from the generator project? Does that make sense?

EDIT

I'll try to be more precise in response to your remarks.

Main Ideas on Data(s)

@Greg : yes, this is probably more like a data access problem. The method holds lots of tabular data (about 100 tables).

I want to be very transparent about these data, so that consumers can easily check their validity and review them (on GitHub, for instance).
However, consumers of the API must not be able to change these values at runtime. Otherwise it could lead to "non-regulatory" results. (Maybe I should think about signing valid releases of the API .dll that conform to the text of the law, but that is another topic.)
I want these data to be exposed clearly in a human-readable and effortless format. A bunch of .json files in a data folder sounds good to me. Anybody can go to the GitHub repo and, if needed, request a modification.
Might be useful to track revisions of these data (text of law) to compare their impact on the energy assessment.
Some of these data are static for sure. Some are "almost" static : they may be updated when the method itself is revised.

Example 1

I have this list of "departments". This is very static data, very unlikely to change. There are about 100 departments.

[
{
  "Code": "19",
  "Nom": "Corrèze",
  "ZoneClimatique": "H1c"
},
{
  "Code": "2A",
  "Nom": "Corse-du-Sud",
  "ZoneClimatique": "H3"
}
]

I was thinking of this class :

public enum NomDepartement
{
    // AUTO GENERATED CODE FROM JSON
    // ...
    Correze,
    Corse_du_Sud,
    // ...
    // AUTO GENERATED CODE FROM JSON
}

public partial class Departement : IEquatable<Departement>
{
    // Entries must stay in the same order as NomDepartement: ParNom indexes by enum value.
    private readonly static Departement[] _table = new Departement[]
    {
        // AUTO GENERATED CODE FROM JSON
        // ...
        new Departement("Corrèze", "19", ZoneClimatique.H1c),
        new Departement("Corse-du-Sud", "2A", ZoneClimatique.H3),
        // ...
        // AUTO GENERATED CODE FROM JSON
    };

    public string Nom { get; private set; }
    public string Code { get; private set; }
    public ZoneClimatique ZoneClimatique { get; private set; }

    public Departement(string nom, string code, ZoneClimatique zoneClimatique)
    {
        Nom = nom;
        Code = code;
        ZoneClimatique = zoneClimatique;
    }

    // Get dep by enum 
    public static Departement ParNom(NomDepartement nom)
    {
        var index = (int)nom;
        return _table[index];
    }

    // Get dep by code
    public static Departement ParCode(string code)
    {
        string search;
        int numero;

        if (Int32.TryParse(code, out numero))
            search = numero.ToString("00");
        else
            search = code;
        
        for (int i = 0; i < _table.Length; i++)
            if (_table[i].Code == search) return _table[i];
            
        throw new ArgumentException(string.Format("{0} n'est pas un code de département valide", code));
    }
}

And it seems to me that generating this class through a console project in VS is efficient. I can use a MyClass.g.cs partial class to clearly separate the generated code ... or maybe not.
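To make the console-project idea concrete, here is a minimal sketch of the enum-generation step. It assumes the generator project targets a recent .NET (records and System.Text.Json), and the names DepartementRow, EnumGenerator, ToIdentifier and GenerateEnum are hypothetical, invented only for this illustration.

```csharp
using System.Collections.Generic;
using System.Globalization;
using System.Text;
using System.Text.Json;

// Hypothetical row type mirroring one entry of the departments JSON file.
public sealed record DepartementRow(string Code, string Nom, string ZoneClimatique);

public static class EnumGenerator
{
    // Turns a display name such as "Corse-du-Sud" into a valid C# identifier.
    public static string ToIdentifier(string name)
    {
        var sb = new StringBuilder();
        // Decompose accented letters (è becomes e + combining accent), then drop the accents.
        foreach (char c in name.Normalize(NormalizationForm.FormD))
        {
            if (CharUnicodeInfo.GetUnicodeCategory(c) == UnicodeCategory.NonSpacingMark)
                continue;
            // Anything that is not a letter or digit (hyphen, space, apostrophe) becomes '_'.
            sb.Append(char.IsLetterOrDigit(c) ? c : '_');
        }
        // Identifiers cannot be empty or start with a digit.
        if (sb.Length == 0 || char.IsDigit(sb[0]))
            sb.Insert(0, '_');
        return sb.ToString();
    }

    // Builds the source text of the NomDepartement enum from the JSON content.
    public static string GenerateEnum(string json)
    {
        var rows = JsonSerializer.Deserialize<List<DepartementRow>>(json)
                   ?? new List<DepartementRow>();
        var sb = new StringBuilder();
        sb.AppendLine("// <auto-generated/>");
        sb.AppendLine("public enum NomDepartement");
        sb.AppendLine("{");
        foreach (var row in rows)
            sb.AppendLine($"    {ToIdentifier(row.Nom)},");
        sb.AppendLine("}");
        return sb.ToString();
    }
}
```

The console project would then read the .json file, call GenerateEnum, and write the result to something like NomDepartement.g.cs with File.WriteAllText. Emitting the enum members in file order is what keeps the enum values aligned with the _table indices.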

Do you think this is a bad way of doing it?
What other type of data storage would be better (one that meets the requirements in "main ideas")? Something like LiteDB? JSON files as embedded resources?
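For comparison, the "JSON as embedded resources" option could look roughly like the sketch below: the data is deserialized once into immutable rows and exposed only through read-only lookups, so consumers cannot change values at runtime. DepartementData, DepartementTable and FromJson are hypothetical names, and the zone is kept as a plain string here to keep the sketch self-contained.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.Json;

// Hypothetical immutable row type; property names mirror the JSON keys.
public sealed record DepartementData(string Code, string Nom, string ZoneClimatique);

public sealed class DepartementTable
{
    private readonly IReadOnlyDictionary<string, DepartementData> _byCode;

    private DepartementTable(IReadOnlyDictionary<string, DepartementData> byCode)
    {
        _byCode = byCode;
    }

    // Parses the JSON once and indexes the rows by department code.
    public static DepartementTable FromJson(string json)
    {
        var rows = JsonSerializer.Deserialize<List<DepartementData>>(json)
                   ?? new List<DepartementData>();
        return new DepartementTable(
            rows.ToDictionary(r => r.Code, StringComparer.OrdinalIgnoreCase));
    }

    // Same normalization as ParCode above: numeric codes are padded to two digits.
    public DepartementData ParCode(string code)
    {
        string search = int.TryParse(code, out int numero) ? numero.ToString("00") : code;
        if (_byCode.TryGetValue(search, out var found))
            return found;
        throw new ArgumentException($"{code} n'est pas un code de département valide");
    }
}
```

The JSON string itself could come from a file in the data folder or from a resource read with Assembly.GetManifestResourceStream. The trade-off against generated code is that there is no NomDepartement enum, so a misspelled code is caught at runtime and not by the compiler.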

I think it is pretty hard to give you any recommendations from this 100,000-foot perspective. I am actually not sure which problem you are trying to solve, what your data flow will look like, which problems you see with your C# Source Generator, or why you need such a generator at all. There seem to be so many details missing here, which probably all exist in your head, that I am unsure what to ask you for. — Doc Brown, Dec 15 '22 at 17:05
Why not load the data from JSON and process using some form of abstraction, rather than generating code? — Greg Burghardt, Dec 15 '22 at 22:46
To be honest, this sounds like a data access problem, not a code generation problem. But we need more information about the problem you are trying to solve. — Greg Burghardt, Dec 15 '22 at 22:48
And using static properties to hold data has the potential to become a glorified global variable. *That* might be the *real* problem to solve. — Greg Burghardt, Dec 15 '22 at 22:50
Have you considered simply publishing the raw data as a downloadable archive file? (For example, as a plain database dump/export, or an Excel spreadsheet). It seems strange to build an API for data that only changes once per year. — Ben Cottrell, Dec 17 '22 at 09:35
Ben is right: from what you wrote, it is not clear why you do not simply publish the data itself in some tabular form as a file for download. — Doc Brown, Dec 19 '22 at 11:04

score 0 · Answer 1 · answered Dec 21 '22 at 15:19

It sounds like an Object-Relational Mapping (ORM) package may take care of a lot of the work you're trying to do.

Code generation is a valid and well-travelled path, but it is difficult. You may find yourself getting stuck in the implementation details rather than moving towards a working solution to this problem. It's an interesting field, but it's a distraction from completing a working plugin. Plus, why rewrite something that's already written by someone whose life work it is to do that particular thing better than anything else?

Consider this different workflow:

Load your data into a database.
Run the ORM on the database.
Import the generated code files into your project.
Write your plugin, unit tests, etc.

This removes from your workload the problem of generating code and allows you to focus on things like integrating the generated code, testing and managing versions.

There are many good ORM packages. Entity Framework is probably the closest to an "official" solution on the .NET platform. NHibernate is a very mature solution. I also found this list of the "best ORM for C#" by doing a quick search, although there are many other lists like it.