7

This is something that's been bugging me for a bit now. In some cases you see code that is a series of overloads, but when you look at the actual implementation you realize they do logically different things. However writing them as overloads allows the caller to ignore this and get the same end result. But would it be more sound to name the methods more explicitly then to write them as overloads?

public void LoadWords(string filePath)
{
    var lines = File.ReadAllLines(filePath).ToList();

    LoadWords(lines);
}

public void LoadWords(IEnumerable<string> words)
{
    // loads words into a List<string> based on some filters
}

Would these methods better serve future developers to be named as LoadWordsFromFile() and LoadWordsFromEnumerable()? It seems unnecessary to me, but if that is better what programming principle would apply here?

On the flip side it'd make it so you didn't need to read the signatures to see exactly how you can load the words, which as Uncle Bob says would be a double take. But in general is this type of overloading to be avoided then?

siva.k
  • 181
  • 3
  • possible duplicate of [How would you know if you've written readable and easily maintainable code?](http://programmers.stackexchange.com/questions/141005/how-would-you-know-if-youve-written-readable-and-easily-maintainable-code) – gnat Aug 24 '14 at 20:01
  • Using "FromFile" or "FromEnumeralble" in the name isn't necessary, but the part where you're "loading into a list based on some filters" in one and not the other is the reason to name them differently: LoadFilteredWords() – JeffO Aug 25 '14 at 01:23

3 Answers3

11

I really wonder why the version with actual IO has been added to the class in the first place. What's the point here? How is that the responsibility of the class?

Why not add a method to load the words from a zipped file, or from an .xml or a .json or a .doc or an SQL database or over http or ftp or whatever? Because it's not the responsibility of the class, that's why!

Particularly the file name based method adds a dependency against a global object. Which is anything but ideal if you ask me.

So I would argue that the programming principle that would apply here is composition.

  • loader.LoadWords(someList)
  • loader.LoadWords(File.ReadAllLines(somePath).ToList())
  • loader.LoadWords(Http.get(someUrl).split(NEWLINE))
  • ...

Tying together a particular method of loading the data and a particular method of processing it should not be done in the class itself, but in another module, whose responsibility it is to carry out just that composition. Could be an IoC container or something, but it doesn't have to be as fancy as that.

back2dos
  • 29,980
  • 3
  • 73
  • 114
  • That makes a lot of sense, can you think of any reason you would want multiple overload methods that cascaded down the processing? Or do you think it's just an indication that something is being done in a class that should be considered out of its scope? – siva.k Aug 24 '14 at 21:50
  • 1
    Laziness would be a reason, but I doubt it pays off here. I also doubt overloading is rarely a good idea to start with. If you need some class to consume different kinds of data, abstract the data away behind an interface. One particular issue with overloading is that delegation becomes *very* tedious. If you just tie up the data into one nice bundle that has its operations defined upon, it is a lot easier to pass it around (and will also increase modularity). – back2dos Aug 25 '14 at 01:09
  • 1
    Maybe such convenience overloading would make sense on a facade/factory, that takes all kinds of arguments and returns an adequate implementation of said interface. But that is just sugar added on top of a robust infrastructure. It does not pervade the whole system. It only adds some ease of use at the periphery. – back2dos Aug 25 '14 at 01:12
  • Suppose version 1.0 of the API was created to accept a filename. Later it was necessary to allow words to be retrieved via other means. If one doesn't control all consumers of the API, one may deprecate the method which takes a filename, but can't safely eliminate it. The best path forward would probably be to give the method that takes data from another source a different name, but having a like-named method which accepts `IEnumerable` would be understandable. – supercat Aug 26 '14 at 21:07
  • @supercat: You are making an awful lot of assumptions here. Yes, there are scenarios where you don't have much of a choice. For reasons I already went into, it was a very questionable decision to have both loading and processing be part of the same class in the first place. – back2dos Aug 27 '14 at 06:20
  • @back2dos: I have no qualm with there being a method to build an object from a file. Requiring that the caller include code to open the file, pass the open file to the method, and then close the file requires duplicating exta code every call site, and makes the caller dependent upon the exact kind of file object being used. Further, in scenarios where a a primary document file can include references to other files which contain necessary data, the document-load routine is going to have to either know how to read *those* files, or else support a very complex interface... – supercat Aug 27 '14 at 13:34
  • ...to tell the caller about additional files it needs. The only qualm I have with the load-from-file method is that its name does not make clear that it loads from a file. – supercat Aug 27 '14 at 13:34
  • @supercat: And more assumptions. Quite obviously the method in question reads a list of strings from the file, so we aren't anywhere near the scenario you're speaking of. Also "very complex interface" is a bit of a stretch here, but I'll let that slide. As I have already said, there is nothing wrong with providing a facade that composes the file reader and the list processor, to avoid "duplicating exta code every call site". Such a class could be called `FileWordLoader`. It would be pretty clear what its `load` method does. That's the kind of clarity you get from respecting the SRP ;) – back2dos Aug 27 '14 at 17:29
  • @back2dos: The present implementation might read a list of strings from a text file, but that does not mean that does not mean that all future implementations would necessarily do so. Suppose it was decided that a future implementation should support a compressed binary data format. If the file-reading logic is encapsulated within a `LoadWordsFromFile` method, support for such a format could be added without requiring any change to the calling code [if the supplied file has a header that indicates compressed binary data, it could be parsed as such]. By contrast, if calling code had to... – supercat Aug 27 '14 at 18:15
  • @back2dos: load all words using something like `wordList.LoadWords(System.IO.File.ReadAllLines(fileName));`, adding support for binary file formats would require changing the client code. – supercat Aug 27 '14 at 18:17
  • Having a method available which reads from a file, as well as a method which accepts a list of strings, means that client code which needs to supply a list of strings can do so, but that client code which simply has a file name need not bake in any assumptions about how the method will read data from that file. – supercat Aug 27 '14 at 18:21
  • @supercat: 1. You are discussing hypothetical scenarios. 2. You really don't seem to bother reading what I write. At no point did I say there should not be an object that encapsulates the composition of simple behaviors into more complex ones. In fact I suggested there be one. My point was that the simple and complex behaviors should not be defined by the same class to start with, and should definitely not be distinguishable only by signature. Either you get that or you don't. But I feel we're wasting both your time and mine here. So just take it or leave it. Have a nice day ;) – back2dos Aug 27 '14 at 18:48
  • @back2dos: I was in *agreement* that the methods should not be distinguished by signature alone; they should be given different names. Where we differ is that in situations where the pattern: `x.doSomething(f(y))` would be very common, I would favor having `x` include a method `x.doSomethingWithConvertedObject(y)` which will invoke `x.doSomething(f(y))`. If that pattern is common, the existence of that one-liner method will save code at every call site. – supercat Aug 27 '14 at 19:14
7

This violates at least three principles of object-oriented programming, and potentially a fourth.

First, this violates the Principle of Least Astonishment. Let's assume we are given documentation for only one of the methods, and we have to infer what the other does. First:

// Reads words from a file
public void LoadWords(string)

// ???
public void LoadWords(IEnumerable<string>)

It would be very easy to look at the first, and think that the second would read words from a list of files. That would be consistent behavior. It doesn't, though. Now the reverse:

// ???
public void LoadWords(string)

// Loads the IEnumerable of strings into words
public void LoadWords(IEnumerable<string>)

It would now be very easy to look at the second, and think that the first would load a single word. Instead, I would get IO errors or potentially something totally unexpected (if the word happened to match a real file name) if I tried to use it as such. This method is hard to reason about because the behaviors are so different.


Second, this code violates the Law of Demeter. In a method, I now have to reach out and interact with the file system, which is a totally unrelated object at a completely different level of abstraction. Where before I was dealing with filtering and saving words, now I have to worry about file system code. I have to know how to get to the file system, open a file, read a file, close a file, tokenize the file contents, catch any exception any of those might incur, etc. This should not be this class's responsibility, which also shows it is a violation of the Single Responsibility Principle.


Finally, we can also violate the Open-Closed Principle. Say we need to overload the method to read from a network stream. We have to open up this class to add the additional functionality. This also exacerbates the Demeter and SRP violations.


If the methods are named more explicitly, we start to leak out implementation details about our class. In this case, the reason is that the responsibilities should not lie in our class. We should push those implementation decisions out to the class's clients. This allows them to make the decision, as they were already doing, but offers them far more flexibility and control, while alleviating your class of the burden of handling all those details internally.

back2dos' answer shows how to accomplish this through client code.

cbojar
  • 4,211
  • 1
  • 17
  • 18
0

Elaborating on @back2dos answer, there are two things going on in your code.

  1. loading words into a List based on some filters
  2. parsing words (in this case, from a file, splitting on new lines)

So I'd vote for two methods. Note - I'm a Java guy so I'm guessing about the types...

  void filterWords(IEnumerable<string> rawwords) // note - why return void instead of List?
  List<string> parseWords(InputStream in)

Whether these belong in separate classes or one class depends on how specific splitting on new lines is to this particular class. If that happens other places in your code (or you can easily imagine it happening in the future) that should be split off.

By using a stream instead of a filename in parseWords you gain some flexibility and abstraction (e.g., you could use an Http GET). Perhaps stream is not the right abstraction, if so, think about what is right.

user949300
  • 8,679
  • 2
  • 26
  • 35