What Design to choose for Parsing different files to populate different classes?

Question

I am currently working on project which requires parsing different types of files. The content of the files will populate the classes.

eg: file1 populate content of class1,  file2 populate content of class2, etc.

My question is where should the parsing logic of the file go?

1st option : If I place the parsing logic inside the class then the class becomes really huge, because currently the class itself is ~400 lines and will grow in the future. The advantage of this way is things are really simple, I could use operator>>, etc.

2nd option : Create a new class that takes std::istream and class reference, do the parsing logic there and populate the class. The advantage of this way of doing is that if the class remains the same but formatting of the file changes,

eg: `file1 populate content of class1 or file1-v2 populate content of class1`

I have to use polymorphism to change the parsing logic and keep the design simple. Disadvantage of this option is, it's a little complex but I can live with it.

Which option do you think is better or is there any other better option?

score 9 · Accepted Answer · answered Oct 17 '17 at 01:28

9

Classes should have a single responsibility. Presumably the responsibility of parsing a file to populate a class and the responsibilities of class itself are distinct and should not be combined. Therefore the 2nd option is the better design.

It's pretty common for file formats to change as a system develops, and by separating this concern now, you'll be in a good place if the file format or the class you're populating changes. It's also a design that allows greater flexibility--e.g. if you decide you want to populate your class from the network, a different file format, etc.

answered Oct 17 '17 at 01:28

Samuel

9,137
1
25
42

make sense, thank you for pointing out single responsibility. I will go by option2. (just a side thought) My other thought was saying that parsing is part of operator>> so it should be part of the same class. How do you thing I should argue with that thought? – vanta mula Oct 17 '17 at 01:48
1

It's often a bit vague in my opinion what a single responsibility represents, but it may be helpful to think of it as "There should never be more than one reason for a class to change". See https://softwareengineering.stackexchange.com/questions/17100/clarify-the-single-responsibility-principle. I'm not familiar enough with the `>>` operator or your implementation to give very useful advice, but if `>>` isn't part of the classes responsibility either then I think it should be extracted (perhaps the implementation of `>>` would go in the parser class you plan to create in the 2nd option) – Samuel Oct 17 '17 at 02:46

Doc Brown · Answer 2 · 2017-10-17T13:12:03.737

9

If the parsing is more complex than a few lines of code, it is probably better to put it into separate classes, lets call them Class1Parser, Class2Parser and so on. If those classes contain similar parts, you can either

create a common base class BaseParser for both where the reusable parts live
create a separate utility class ParserUtil for reusable functions
make the parser a generic class like Parser<T> where T is either Class1 or Class2
use a combination of the former options (like making only the ParserUtil or BaseParser generic)

Note that an inheritance hierarchy of the parsers now can be independent from the inheritance hierarchy of the parsed classes, this would not quite be possible by putting the parsing logic directly into Class1/Class2.

None of these options should hold you back from implementing an operator>> for Class1 or Class2, if you think you need it for ease of use. This operator can simply delegate its functionality to an object of type Class1Parser or Class2Parser.

If these approaches end up becoming more complex than putting the parser logic into Class1 or Class2, then there is something utterly wrong in your code. In essence, it should actually be the same logic. By putting the parsing logic into separate classes, it stays decoupled from other methods in those classes, which should reduce the overall complexity, not increase it.

edited Oct 17 '17 at 13:12

answered Oct 17 '17 at 03:19

Doc Brown

199,015
33
367
565

1

You might even want to consider adding some form of header to the file which would be the same size for each type. Then you can read in the header, pass it to a factory which will return an implementation of your parser interface. This of course would require all your classes to implement some interface which the parser's parsing function can return – Hangman4358 Oct 17 '17 at 03:38
Firstly thank you for the detailed answer. Secondly when you say `make the parser a generic class like Parser where T is either Class1 or Class2` where do you think the reusable parts of code go if I choose this option? Did you mean it will go in Parser class? I asked this because if Parser is abstract class would this be a problem? – vanta mula Oct 17 '17 at 13:23
isn't that what `Template method pattern` all about? (in regards to what vanta mula asked in comment above) – solti Oct 17 '17 at 14:39
@vantamula: I try not speculating too much about your code - especially not about classes for which I don't know anything more than names "Class1", "Class2" etc. – Doc Brown Oct 17 '17 at 14:51

What Design to choose for Parsing different files to populate different classes?

2 Answers2