When does the "Do One Thing" paradigm become harmful?

Question

For the sake of argument here's a sample function that prints contents of a given file line-by-line.

Version 1:

void printFile(const string & filePath) {
  fstream file(filePath, ios::in);
  string line;
  while (std::getline(file, line)) {
    cout << line << endl;
  }
}

I know it is recommended that functions do one thing at one level of abstraction. To me, though code above does pretty much one thing and is fairly atomic.

Some books (such as Robert C. Martin's Clean Code) seem to suggest breaking the above code into separate functions.

Version 2:

void printFile(const string & filePath) {
  fstream file(filePath, ios::in);
  printLines(file);
}

void printLines(fstream & file) {
  string line;
  while (std::getline(file, line)) {
    printLine(line);
  }
}

void printLine(const string & line) {
  cout << line << endl;
}

I understand what they want to achieve (open file / read lines / print line), but isn't it a bit of overkill?

The original version is simple and in some sense already does one thing - prints a file.

The second version will lead to a large number of really small functions which may be far less legible than the first version.

Wouldn't it be, in this case, better to have the code at one place?

At which point does the "Do One Thing" paradigm become harmful?

This kind of coding practice is always based on case by case. There is never a single approach. — iammilind, Nov 23 '11 at 08:07
The question's title is now very misleading given the accepted answer. Perhaps change it since you have not given space to discuss when "do one thing" is or is not justified? (and that question should be Community Wiki anyway...) — Alex Feinman, Nov 23 '11 at 15:55
@Alex - The accepted answer literally has nothing to do with the question. I find that really odd. — ChaosPandion, Nov 23 '11 at 15:59
@ChaosPandion `@alex - a whole bunch of folks missed the point of the original question. It was not about whether the sample code was the correct way to read a file, but rather when the "Do one thing" principle evangelised in Clean Code becomes harmful. The code example was just a way of getting the point across. — Kev, Nov 23 '11 at 18:12
I note that your refactored version is upside down, which contributes to a lack of readability. Reading down the file, you would expect to see `printFile`, `printLines`, and finally `printLine`. — Anthony Pegram, Nov 23 '11 at 19:01
I think that "functions should be short", when you have a function that is longer than 1 screen, then it probably does "more than just one thing". — Czarek Tomczak, Nov 23 '11 at 19:14
@AnthonyPegram - I don't think that really matters given how simple the example is, also the order of functions is dependant on your schooling. — Kev, Nov 23 '11 at 19:28
@Czarek - that's a metric that's as useful as counting lines of code produced each day by a developer. It all depends on how many lines your editor displays, font size, screen resolution. — Kev, Nov 23 '11 at 19:30
@Kev, I disagree, particularly in the context of the OP's referencing of Clean Code, which advises using the top-down technique mentioned, going from high level detail to low level. — Anthony Pegram, Nov 23 '11 at 21:08
@AnthonyPegram - maybe so, you're being a wee bit pedantic, the code isn't hard to understand at a glance and doesn't change the main jist of the question. — Kev, Nov 23 '11 at 21:14
@Kev, I once again can only disagree, particularly with that categorization. It's not pedantry, it's the point! It's the OP that specifically says the second version might not be as readable. It's the OP that specifically cites Clean Code as inspiration for the second version. My comment is essentially that Clean Code would *not* have him write code that way. The order is actually important for readability, you read down the file like you read a newspaper article, getting more and more detail until you basically become disinterested. — Anthony Pegram, Nov 23 '11 at 22:01
Like you would not expect to read a poem backwards, neither would you expect to see the lowest level of detail as the first thing inside a particular class. To your point, *this* code takes little time to quickly sort through, but I would only assume this code isn't the only code he is going to write. To my point, if he's going to cite Clean Code, the least he could do is *follow it.* If the code is out of order, it certainly will be less readable than otherwise. — Anthony Pegram, Nov 23 '11 at 22:01
possible duplicate of [How would you know if you've written readable and easily maintainable code?](http://programmers.stackexchange.com/questions/141005/how-would-you-know-if-youve-written-readable-and-easily-maintainable-code) — gnat, Jan 09 '15 at 15:11

score 15 · Accepted Answer · answered Nov 23 '11 at 18:09

Of course, this just begs the question "What is one thing?" Is reading a line one thing and writing a line another one? Or is copying a line from one stream to the other to be considered one thing? Or copying a file?

There is no hard, objective answer to that. It's up to you. You can decide. You have to decide. The "do one thing" paradigm's main goal is probably to produce code that's as easy to understand as possible, so you can use that as a guideline. Unfortunately, this isn't objectively measurable either, so have to rely on your gut feeling and the "WTF?" count in code review.

IMO a function consisting of only a single line of code is rarely ever worth the trouble. Your printLine() has no advantage over using std::cout << line << '\n'¹ directly. If I see printLine(), I have to either assume it does what its name says, or look it up and check. If I see std::cout << line << '\n', I know immediately what it does, because this is the canonical way of outputting the content of a string as a line to std::cout.

However, another important goal of the paradigm is to allow code reuse, and that's much more objective a measure. For example, in your 2nd version printLines() could easily be written so that it is an universally useful algorithm that copies lines from one stream to another:

void copyLines(std::istream& is, std::ostream& os)
{
  std::string line;
  while( std::getline(is, line) );
    os << line << '\n';
  }
}

Such an algorithm could reused in other contexts as well.

You can then put everything specific to this one use case into a function which calls this generic algorithm:

void printFile(const std::string& filePath) {
  std::ifstream file(filePath.c_str());
  printLines(file, std::cout);
}

¹ ^{_{Note that I used '\n' rather than std::endl. '\n' should be the default choice for outputting a newline, std::endl is the odd case.}}

+1 - I mostly agree, but I think there's more to it than "gut feeling". The trouble is when people judge "one thing" by counting implementation details. To me, the function should implement (and its name describe) a single clear abstraction. You should never name a function "do_x_and_y". The implementation can and should do several (simpler) things - and each of those simpler things may be decomposed into several even simpler things and so on. It's just functional decomposition with an extra rule - that the functions (and their names) should each describe a single clear concept/task/whatever. — , Nov 24 '11 at 01:44
@Steve314: I have not listed implementation details as possibilities. Copying lines from one stream to another clearly is a _one-thing_ abstraction. Or is it? And it's easy to avoid `do_x_and_y()` by naming the function `do_everything()` instead. Yes, that's a silly example, but it shows that this rule fails to even prevent the most extreme examples of bad design. IMO this _is_ a gut feeling decision as much as one dictated by conventions. Otherwise, if it was objective, you could come up with a metric for it — which you cannot. — sbi, Nov 24 '11 at 09:59
I wasn't intending to contradict - only to suggest an addition. I guess what I forgot to say is that, from the question, decomposing to `printLine` etc is valid - each of those is a single abstraction - but that doesn't mean necessary. `printFile` is already "one thing". Although you can decompose that into three separate lower level abstractions, you don't have to decompose at **every possible** level of abstraction. Each function must do "one thing", but not every possible "one thing" needs to be a function. Moving too much complexity into the call graph can itself be a problem. — , Nov 24 '11 at 10:29

dsimcha · Answer 2 · 2011-11-23T18:48:01.570

Having a function do only "one thing" is a means to two desirable ends, not a commandment from God:

If your function only does "one thing", it will help you avoid code duplication and API bloat because you'll be able to compose functions to get more complex things done instead of having a combinatorial explosion of higher-level, less-composable functions.
Having functions only do "one thing" may make the code more readable. This depends on whether you gain more clarity and ease of reasoning by decoupling things than you lose to the verbosity, indirection and conceptual overhead of the constructs that allow you to decouple things.

Therefore, "one thing" is unavoidably subjective and depends on what level of abstraction is relevant to your program. If printLines is thought of as a single, fundamental operation and the only way of printing lines that you care about or foresee caring about, then for your purposes printLines only does one thing. Unless you find the second version more readable (I don't) the first version is fine.

If you start needing more control over lower levels of abstraction and ending up with subtle duplication and combinatorial explosion (i.e. a printLines for filenames and a completely separate printLines for fstream objects, a printLines for console and a printLines for files) then printLines is doing more than one thing at the level of abstraction you care about.

I would add a third and that is smaller functions are more easily tested. As there are probably fewer inputs required if the function does only one thing, it makes it easier to test it independently. — PersonalNexus, Nov 23 '11 at 19:59
@PersonalNexus: I somewhat agree on the testing issue, but IMHO it's silly to test implementation details. To me a unit test should test "one thing" as defined in my answer. Anything finer grained is making your tests brittle (because changing implementation details will require your tests to change) and your code annoyingly verbose, indirect, etc. (because you'll be adding indirection just to support testing). — dsimcha, Dec 04 '11 at 03:34

kevin cline · Answer 3 · 2011-11-24T16:36:34.840

At this scale, it doesn't matter. The single-function implementation is perfectly obvious and understandable. However, adding just a bit more complexity makes it very attractive to split the iteration from the action. For example, suppose you needed to print lines from a set of files specified by a pattern like "*.txt". Then I would separate the iteration from the action:

printLines(FileSet files) {
   files.each({ 
       file -> file.eachLine({ 
           line -> printLine(line); 
       })
   })
}

Now the file iteration can be separately tested.

I split functions to either simplify testing, or to improve readability. If the action performed on each line of data were complex enough to warrant a comment, then I would certainly split it into a separate function.

I think you nailed it. If we need a comment to explain a line, then it is always time to extract a method. — Roger C S Wernersson, Nov 24 '11 at 09:41

score 5 · Answer 4 · answered Nov 24 '11 at 09:45

5

Extract methods when you feel the need for a comment to explain things.

Write methods that either only do what their name says in an obvious way, or tell a story by calling cleverly named methods.

answered Nov 24 '11 at 09:45

Roger C S Wernersson

756
5
12

score 3 · Answer 5 · answered Nov 24 '11 at 01:33

Even in your simple case, you are missing details that the Single Responsibility Principle would help you manage better. For example, what happens when something goes wrong with opening the file. Adding in exception handling to harden against file access edge cases would add 7-10 lines of code to your function.

After you've opened the file, you're still not safe. It could be yanked from you (especially if it's a file on a network), you could run out of memory, again a number of edge cases can happen that you want to harden against and will bloat your monolithic function.

The one-liner, printline seems innocuous enough. But as new functionality gets added to the file printer (parsing and formatting text, rendering to different kinds of displays, etc.) it will grow and you'll thank yourself later.

The goal of SRP is to allow you to think about a single task at a time. It's like breaking a big block of text into multiple paragraphs so that the reader can comprehend the point you are trying to get across. It takes a little more time to write code that adheres to these principles. But in doing so we make it easier to read that code. Think of how happy your future self will be when he has to track down a bug in the code and finds it neatly partitioned.

I upvoted this answer because I like the logic even though I disagree with it! Provide structure based on some complex thinking about what might happen in the future is counter productive. Factorise code when you need to. Do not abstract things until you need to. Modern code is plagued by people trying to slavishly follow rules instead of just writing code that works and adapting it *reluctantly*. Good programmers are *lazy*. — Yttrill, Nov 24 '11 at 10:26
Thanks for the comment. Note I'm not advocating premature abstraculation, just dividing logical operations so that it's easier to do so later. — Michael Brown, Nov 27 '11 at 12:04

score 2 · Answer 6 · answered Nov 23 '11 at 08:35

I personally prefer the latter approach, because it saves you work in the future and forces "how to do it in a generic way" mindset. Despite that in your case Version 1 is better than Version 2 - just because problems solved by Version 2 are too trivial and fstream-specific. I think it should be done the following way (including bug fix proposed by Nawaz):

Generic utility functions:

void printLine(ostream& output, const string & line) { 
    output << line << endl; 
} 

void printLines(istream& input, ostream& output) { 
    string line; 
    while (getline(input, line)) {
        printLine(output, line); 
    } 
}

Domain-specific function:

void printFile(const string & filePath, ostream& output = std::cout) { 
    fstream file(filePath, ios::in); 
    printLines(file, output); 
}

Now printLines and printLine can work not only with fstream, but with any stream.

I disagree. That `printLine()` function has no value. See [my answer](http://programmers.stackexchange.com/questions/121298/121348#121348). — sbi, Nov 23 '11 at 18:21
Well, if we keep printLine() then we can add a decorator which adds line numbers or syntax coloring. Having said that, I would not extract those methods until I found a reason to. — Roger C S Wernersson, Nov 24 '11 at 09:38

score 2 · Answer 7 · answered Nov 23 '11 at 15:54

Every paradigm, (not just necessarily the one you cited) to be followed require some discipline, and thus -reducing the "freedom of speech"- result in an initial overhead (at least just because you have to learn it!). In this sense every paradigm can become harmful when the cost of that overhead is not overcompensated by the advantage the paradigm is designed to keep with itself.

The true answer to the question, thus, requires a good capability to "foreseen" the future, like:

I am right now required to do A and B
What is the probability, in a near future I will required to do also A- and B+ (i.e. something that looks like A and B, but just a bit different)?
What is the probability in a more far future, that A+ will become A* or A*- ?

If that probability is relatively hight, it will be a good chance if -while thinking about A and B- I also think about their possible variants, thus to isolate the common parts so that I will be able to reuse them.

If that probability is very low (whatever variant around A is essentially nothing more than A itself), study how to decompose A further will most likely result in wasted time.

Just as an example, let me tell you this real story:

During my past life as a teacher, I discovered that -on the most of student's projects- practically all of them provide their own function to calculate the length of a C string.

After some investigation I discovered that, being a frequent problem, all the student come to the idea to use a function for that. After told them that there is a library function for that (strlen), many of them answered that since the problem was so simple and trivial, it was more effective to them write their own function (2 lines of code) than seeking the C library manual (it was 1984, forgot the WEB and google!) in strict alphabetic order to see if there was a ready function for that.

This is an example where also the paradigm "don't reinvent the wheel" can become harmful, without an effective wheel catalog!

score 2 · Answer 8 · answered Nov 23 '11 at 18:02

Your example is fine to be used in a throw-away tool that is needed yesterday to do some specific task. Or as an administration tool that is directly controlled by an administrator. Now make it robust to be suitable for your customers.

Add proper error/exception handling with meaningful messages. Maybe you need parameter verification, including decisions that must be made, e.g. how to handle not existing files. Add logging functionality, maybe with different levels like info and debugging. Add comments so that your team colleagues know what's going on there. Add all the parts that are usually omitted for brevity and left as an exercise for the reader when giving code examples. Don't forget your unit tests.

Your nice and quite linear little function suddenly ends in a complex mess that begs to be splitted into separate functions.

score 2 · Answer 9 · answered Nov 24 '11 at 08:25

IMO it becomes harmful when it goes that far that a function hardly does anything at all but delegate work to another function, because that's a sign that it is no longer an abstraction of anything and the mindset that leads to such functions is always in danger of doing worse things...

From the original post

void printLine(const string & line) {
  cout << line << endl;
}

If you are pedantic enough, you might notice that printLine still does two things: writing the line to cout and adding a "end line" character. Some people might want to handle that by creating new functions:

void printLine(const string & line) {
  reallyPrintLine(line);
  addEndLine();
}

void reallyPrintLine(const string & line) {
  cout << line;
}

void addEndLine() {
  cout << endl;
}

Oh no, now we have made the problem even worse! Now it's even OBVIOUS that printLine does TWO things!!!1! It doesn't make much stupidity to create the most absurd "work-arounds" one can imagine just to get rid of that inevitable problem that printing a line consists of printing the line itself and adding an end-of-line character.

void printLine(const string & line) {
  for (int i=0; i<2; i++)
    reallyPrintLine(line, i);
}

void reallyPrintLine(const string & line, int action) {
  cout << (action==0?line:endl);
}

score 1 · Answer 10 · answered Nov 23 '11 at 08:07

Short answer... it depends.

Think about this: what if, in the future, you won't want to print only to standard output, but to a file.

I know what YAGNI is, but I'm just saying there might be cases where some implementations are known to be needed, but postponed. So maybe the architect or whatever knows that function needs to be able to also print to a file, but doesn't want to do the implementation right now. So he creates this extra function so, in the future, you only need to change the output in one place. Makes sense?

If you however are certain you only need output in the console, it doesn't really make much sense. Writing a "wrapper" over cout << seems useless.

But strictly speaking, isn't the printLine function a different level of abstraction from iterating over lines? — , Nov 23 '11 at 08:17
@Petr I guess so, which is why they suggest you separate the functionality. I think the concept is correct, but you need to apply it on a case-to-case basis. — , Nov 23 '11 at 08:21

score 1 · Answer 11 · answered Nov 24 '11 at 02:16

The whole reason that there are books devoting chapters to the virtues of "do one thing" is that there are still developers out there that write functions that are 4 pages long and nest conditionals 6 levels. If your code is simple and clear you've done it right.

NWS · Answer 12 · 2011-11-24T17:05:03.033

0

As other posters have commented, doing one thing is a matter of scale.

I would also suggest that the One Thing idea is to stop people coding by side effect. This is exemplified by sequential coupling where methods have to be called in a particular order to get the 'right' result.

edited Nov 24 '11 at 17:05

answered Nov 24 '11 at 16:56

NWS

1,319
8
17

When does the "Do One Thing" paradigm become harmful?

12 Answers12

Linked

Related