I am currently writing an application with the following structure:
input: the equivalent of an Excel workbook, i.e. a few tables with different headers and a few scalar values. They represent properties of hardware, workers, production processes, etc. in a factory, plus a few "global" parameters.
processing: compute some plan for the factory
output: the plan, i.e. a kind of schedule, in the form of a few table-like data structures, plus some KPIs
The current application is implemented in Python; at other times I have used C++/C#/Java, R, and so on.
My problem is with the input and output steps. They seem to have to be redone for most applications I encounter, and that gets boring.
What I usually do is create a few data structures in memory to represent the input and the output. These could be a few classes each representing a row of a table, so to speak, or a C# DataTable, pandas DataFrame, or dict-of-dicts.
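To make the "row class" approach concrete, here is a minimal sketch in Python; the table and field names (a machines table with a capacity column) are made up for illustration:

```python
from dataclasses import dataclass


@dataclass
class Machine:
    """One row of a hypothetical 'machines' input table."""
    name: str
    capacity_per_hour: float
    setup_minutes: int


# One in-memory table is then just a list of such rows.
machines = [
    Machine("lathe", 12.5, 30),
    Machine("drill", 40.0, 10),
]
```

The schedule in the output would be represented the same way, with a different set of row classes.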
What I mean by "boring" is that I need to specify both the input and output logic at least three times each: once in the file, once in the code, and once in the documentation.
For example, for a CSV file I have to decide the number, position, and name of the columns and write a template or example of the file. Then I write and document some code to parse or write it, and finally I have to write a "guide" that documents how the program handles the data, in particular missing or invalid values.
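A small sketch of what that parsing code tends to look like (column names and the missing-value rule are hypothetical) shows the duplication: the column names here must match the file template, and the fallback rule must be restated in the user guide:

```python
import csv
import io


def parse_machines(text):
    """Parse a hypothetical 'machines' CSV table.

    The column names below duplicate the file template, and the
    missing-value rule duplicates what the user guide must say.
    """
    rows = []
    for rec in csv.DictReader(io.StringIO(text)):
        rows.append({
            "name": rec["name"],
            # A missing or empty capacity falls back to 0.0 -- this
            # rule also has to be documented in the guide.
            "capacity_per_hour": float(rec.get("capacity_per_hour") or 0.0),
        })
    return rows


sample = "name,capacity_per_hour\nlathe,12.5\ndrill,\n"
parsed = parse_machines(sample)
```

Any change to the file format forces a matching change in all three places.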
Of course, when the data format changes, the code has to change too.
I realize that in a larger team this would not be a problem as different people would handle different tasks. But currently it's mostly me.
Is there some practice to abstract and speed up the process of translating the input and output formats to and from in-memory data structures? Or some pointers to further sources that could help me better understand the opportunities and pitfalls in implementing those procedures?
Finally, I would also appreciate pointers to some useful libraries for that in Python, Java, or C#.