
The problem

Let's say I have a class called DataSource which provides a ReadData method (and maybe others, but let's keep things simple) to read data from an .mdb file:

var source = new DataSource("myFile.mdb");
var data = source.ReadData();

A few years later, I decide that I want to be able to support .xml files in addition to .mdb files as data sources. The implementation for "reading data" is quite different for .xml and .mdb files; thus, if I were to design the system from scratch, I'd define it like this:

abstract class DataSource {
    public abstract Data ReadData();
    public static DataSource OpenDataSource(string fileName) {
        // return MdbDataSource or XmlDataSource, as appropriate
    }
}

class MdbDataSource : DataSource {
    public override Data ReadData() { /* implementation 1 */ }
}

class XmlDataSource : DataSource {
    public override Data ReadData() { /* implementation 2 */ }
}

Great, a perfect implementation of the factory method pattern. Unfortunately, DataSource is located in a library, and refactoring the code like this would make DataSource abstract and thus break all existing calls of

var source = new DataSource("myFile.mdb");

in the various clients using the library. Woe is me, why didn't I use a factory method in the first place?


Solutions

These are the solutions I could come up with:

  1. Make the DataSource constructor return a subtype (MdbDataSource or XmlDataSource). That would solve all my problems. Unfortunately, C# does not support that.

  2. Use different names:

    abstract class DataSourceBase { ... }    // corresponds to DataSource in the example above
    
    class DataSource : DataSourceBase {      // corresponds to MdbDataSource in the example above
        [Obsolete("New code should use DataSourceBase.OpenDataSource instead")]
        public DataSource(string fileName) { ... }
        ...
    }
    
    class XmlDataSource : DataSourceBase { ... }
    

    That's what I ended up using since it keeps the code backwards-compatible (i.e. calls to new DataSource("myFile.mdb") still work). Drawback: The names are not as descriptive as they should be.

  3. Make DataSource a "wrapper" for the real implementation:

    class DataSource {
        private DataSourceImpl impl;
    
        public DataSource(string fileName) {
            impl = ... ? new MdbDataSourceImpl(fileName) : new XmlDataSourceImpl(fileName);
        }
    
        public Data ReadData() {
            return impl.ReadData();
        }
    
        private abstract class DataSourceImpl { ... }
        private class MdbDataSourceImpl : DataSourceImpl { ... }
        private class XmlDataSourceImpl : DataSourceImpl { ... }
    }
    

    Drawback: Every data source method (such as ReadData) must be forwarded to the implementation by boilerplate code. I don't like boilerplate code: it's redundant and clutters the code.

Is there any elegant solution that I have missed?

Heinzi
  • Can you explain your issue with #3 in a bit more detail? That seems elegant to me. (Or as elegant as you get while maintaining backwards compatibility.) – pdr Apr 29 '13 at 14:46
  • I'd define an interface that publishes the API, then have a factory return either a wrapper around your old code (to reuse the existing method) or, for new formats, instances of new classes implementing that interface. – Thomas Apr 29 '13 at 14:51
  • @pdr: 1. Every change to method signatures has to be made at one more place. 2. I can either make the Impl classes inner and private, which is inconvenient if a client wants to access specific functionality only available in, e.g., an Xml data source. Or I can make them public, which means that the clients now have two different ways of doing the same thing. – Heinzi Apr 29 '13 at 15:00
  • @Heinzi: I prefer option 3. This is the standard "Facade" pattern. You should check whether you really have to delegate *every* data source method to the implementation, or just some. Perhaps there is still some generic code which stays in "DataSource"? – Doc Brown Apr 29 '13 at 15:11
  • It's a shame that `new` isn't a method of the class object (so that you could subclass the class itself — a technique known as metaclasses — and control what `new` actually does) but that's not how C# (or Java, or C++) works. – Donal Fellows Apr 29 '13 at 18:32
  • Why couldn't you make your DataSource in 3 a wrapper and populate its impl property with one of the implementations that you create and pass in from outside? – Amy Blankenship Apr 30 '13 at 15:25

2 Answers


I would go for a variant of your second option that allows you to phase out the old, overly generic name DataSource:

abstract class AbstractDataSource { ... } // corresponds to the abstract DataSource in the ideal solution

class XmlDataSource : AbstractDataSource { ... }
class MdbDataSource : AbstractDataSource { ... } // contains all the code of the existing DataSource class

[Obsolete("New code should use AbstractDataSource instead")]
class DataSource : MdbDataSource { // an 'empty shell' to keep old code working.
    public DataSource(string fileName) { ... }
}

The only drawback here is that the new base class can't have the most obvious name, because that name is already taken by the original class and has to stay that way for backwards compatibility. All the other classes have descriptive names.
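A minimal, compilable sketch of this layout could look as follows. Several details are assumptions made only for illustration and are not part of the original library: the `Data` stand-in type, the `FileName` property, the placeholder `ReadData` bodies, and the choice to pick the subclass by file extension inside `OpenDataSource`.

```csharp
using System;
using System.IO;

// Hypothetical stand-in for the library's Data type.
public sealed class Data
{
    public string Description { get; }
    public Data(string description) { Description = description; }
}

public abstract class AbstractDataSource
{
    protected string FileName { get; }

    protected AbstractDataSource(string fileName) { FileName = fileName; }

    public abstract Data ReadData();

    // Factory method. Dispatching on the file extension is an assumption.
    public static AbstractDataSource OpenDataSource(string fileName)
    {
        string extension = Path.GetExtension(fileName);
        if (string.Equals(extension, ".xml", StringComparison.OrdinalIgnoreCase))
            return new XmlDataSource(fileName);
        if (string.Equals(extension, ".mdb", StringComparison.OrdinalIgnoreCase))
            return new MdbDataSource(fileName);
        throw new NotSupportedException("Unsupported data source: " + fileName);
    }
}

public class MdbDataSource : AbstractDataSource
{
    public MdbDataSource(string fileName) : base(fileName) { }

    // Placeholder body; the real class would hold the existing .mdb code.
    public override Data ReadData() { return new Data("mdb:" + FileName); }
}

public class XmlDataSource : AbstractDataSource
{
    public XmlDataSource(string fileName) : base(fileName) { }

    // Placeholder body for the new .xml implementation.
    public override Data ReadData() { return new Data("xml:" + FileName); }
}

// Empty shell that keeps `new DataSource("myFile.mdb")` compiling.
[Obsolete("New code should use AbstractDataSource instead")]
public class DataSource : MdbDataSource
{
    public DataSource(string fileName) : base(fileName) { }
}

public static class Program
{
    public static void Main()
    {
#pragma warning disable CS0618 // deliberately exercising the obsolete type
        var legacy = new DataSource("myFile.mdb");
#pragma warning restore CS0618
        Console.WriteLine(legacy.ReadData().Description);

        AbstractDataSource modern = AbstractDataSource.OpenDataSource("myFile.xml");
        Console.WriteLine(modern.ReadData().Description);
    }
}
```

Old client code keeps compiling (with an obsolete warning), while new code goes through the factory or instantiates `XmlDataSource` directly when it needs format-specific members.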

Bart van Ingen Schenau
  • +1, that's exactly what came to my mind when I read the question. Though I like option 3 of the OP more. – Doc Brown Apr 29 '13 at 14:59
  • The base class could have the most obvious name if you put all the new code into a new namespace. But I'm not sure that's a good idea. – svick Apr 29 '13 at 15:05
  • The base class should have the suffix "Base". class DataSourceBase – Stephen Aug 14 '14 at 07:27

The best solution will be something close to your option #3. Keep DataSource mostly as it is now, and extract just the reader part into its own class.

class DataSource {
    private Reader reader;

    public DataSource(string fileName) {
        reader = ... ? new MdbReader(fileName) : new XmlReader(fileName);
    }

    public Data ReadData() {
        return reader.next();
    }

    private abstract class Reader { ... }
    private class MdbReader : Reader { ... }
    private class XmlReader : Reader { ... }
}

This way, you avoid duplicated code and remain open to further extensions.
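As a rough, compilable sketch of the same idea: the names `MdbFileReader`, `XmlFileReader` and `Read` are hypothetical (chosen here to avoid confusion with `System.Xml.XmlReader`), the `Data` type is a stand-in, the reader bodies are placeholders, and selecting the reader by file extension is an assumption.

```csharp
using System;
using System.IO;

// Hypothetical stand-in for the library's Data type.
public sealed class Data
{
    public string Description { get; }
    public Data(string description) { Description = description; }
}

public class DataSource
{
    private readonly Reader reader;

    public DataSource(string fileName)
    {
        // Assumption: the format is chosen by file extension,
        // with .mdb as the default to match the old behaviour.
        bool isXml = string.Equals(
            Path.GetExtension(fileName), ".xml", StringComparison.OrdinalIgnoreCase);
        reader = isXml
            ? (Reader)new XmlFileReader(fileName)
            : new MdbFileReader(fileName);
    }

    // The only format-specific call; everything else stays in DataSource.
    public Data ReadData() { return reader.Read(); }

    private abstract class Reader
    {
        protected string FileName { get; }
        protected Reader(string fileName) { FileName = fileName; }
        public abstract Data Read();
    }

    private sealed class MdbFileReader : Reader
    {
        public MdbFileReader(string fileName) : base(fileName) { }
        public override Data Read() { return new Data("mdb:" + FileName); } // placeholder
    }

    private sealed class XmlFileReader : Reader
    {
        public XmlFileReader(string fileName) : base(fileName) { }
        public override Data Read() { return new Data("xml:" + FileName); } // placeholder
    }
}

public static class Program
{
    public static void Main()
    {
        // Existing client code is unchanged; the format is picked internally.
        Console.WriteLine(new DataSource("myFile.mdb").ReadData().Description);
        Console.WriteLine(new DataSource("myFile.xml").ReadData().Description);
    }
}
```

Because the readers are private nested classes, clients cannot reach format-specific functionality, which is the trade-off against the subclass-based design.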

nibra
  • +1, nice option. I still prefer option 2, though, since it allows me to access XmlDataSource-specific functionality by explicitly instantiating `XmlDataSource`. – Heinzi May 04 '13 at 16:43