
I'm parsing a set of XML files of different kinds (these kinds are known in advance).

These are my requirements:

  • I want an object to represent each XML document (object-xml mapping)
  • I'd rather have a different class for each kind, so that identifying the type of the XML file is equivalent to looking at the class of the instance
  • I don't know the kind in advance
  • I want some specific behaviour for each object

But for the sake of simplicity, let's say I just need to open different numbers, of two known kinds:

  • even number
  • odd number

In terms of design patterns, I use a variation of the factory pattern, where the factory is in the __init__ of the top-level class, and I set the correct implementation by changing __class__ dynamically:

class Number():
    def __init__(self, val):
        # common implementation (in case of XML, I do the recursion of xml imports)
        self.value = val

        # This is also the point where I discover the type of my object
        # Hence, the constructor is also a factory of the appropriate type
        if val % 2 == 0:
            self.__class__ = EvenNumber
        elif val % 2 == 1:
            self.__class__ = OddNumber
        else:
            # keep the default Number implementation
            pass


class EvenNumber(Number):
    def half(self):
        ''' 
        specific behaviour for even numbers
        '''
        return self.value / 2


class OddNumber(Number):
    def next_int(self):
        return self.value

if __name__ == '__main__':
    for x in (Number(2), Number(3)):
        '''
        In the main algorithm, the processing depends on the type of the object
        '''
        if isinstance(x, EvenNumber):
            print(x.half())
        elif isinstance(x, OddNumber):
            print(x.next_int())

What do you think about this approach?

I knew __class__ could be read, and was very surprised it can also be written. What do you think of changing __class__ dynamically?

Is this a well-known design pattern? (It is not the factory pattern, the state pattern, or the strategy pattern, even if it seems related to all of them.)

Edit: My real use case

I want to open an XBRL taxonomy, which is made of XML linkbases. Besides the generic XML linkbase, I want to handle some of them more precisely:

  • XML Label linkbase
  • XML Definition linkbase
  • etc.

In substance, here is what I have done:

class Linkbase(etree.xml.ElementTree):
    '''
    Generic linkbase
    '''
    def __init__(self, filename):
        # parse XML
        if self.type == 'definition':
            self.__class__ = DefinitionLinkbase
        elif self.type == 'label':
            self.__class__ = LabelLinkbase

    @property
    def type(self):
        return self.find('//xpath/expression').tag

class LabelLinkbase(Linkbase):
    @property
    def labels(self):
        # find all appropriate elements and return a list of Labels
        ...

class DefinitionLinkbase(Linkbase):
    @property
    def definitions(self):
        # return all definitions
        ...

Alternatively, I could have used a factory. I can think of something like this, but it doesn't look as elegant as the first approach.

class Linkbase(etree.xml.ElementTree):
    '''
    Generic linkbase
    '''
    def __init__(self, tree):
        self.tree = tree

    @property
    def type(self):
        return get_type(self.tree)

class LabelLinkbase(Linkbase):
    @property
    def labels(self):
        # find all appropriate elements and return a list of Labels
        ...

class DefinitionLinkbase(Linkbase):
    @property
    def definitions(self):
        # return all definitions
        ...

def build_Linkbase(file):
    tree = etree.parse(file)
    if get_type(tree) == "definition":
        return DefinitionLinkbase(tree)
    elif get_type(tree) == "label":
        return LabelLinkbase(tree)
    else:
        return Linkbase(tree)

def get_type(tree):
    return tree.find('//xpath/expression').tag
rds
  • Adding [link](http://programmers.stackexchange.com/a/69383/54500) to another answer, so people can see it from here. – Eva Feb 25 '13 at 16:27

2 Answers


Given your example, I would certainly recommend defining odd/even as properties. Now, for the sake of answering the question, I would do the following:

class Number():
    def __init__(self, val):
        # common implementation (in case of XML, I do the recursion of xml imports)
        self.value = val

class EvenNumber(Number):
    def half(self):
        ''' 
        specific behaviour for even numbers
        '''
        return self.value / 2


class OddNumber(Number):
    def next_int(self):
        return self.value

# this is your factory
def buildNumber(val):
    if val % 2 == 0:
        return EvenNumber(val)
    else:
        return OddNumber(val)

This is closer to the factory design pattern. If you want to make buildNumber a member of a configurable NumberFactory class, you can.
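As a rough illustration of that last remark, here is a minimal sketch of what a configurable factory class might look like. The `NumberFactory` name comes from the sentence above, but its constructor parameters (`even_cls`, `odd_cls`) and the `build` method are assumptions made for this example only; the number classes mirror the ones shown earlier.

```python
class Number:
    def __init__(self, val):
        self.value = val


class EvenNumber(Number):
    def half(self):
        """Specific behaviour for even numbers."""
        return self.value / 2


class OddNumber(Number):
    def next_int(self):
        return self.value


class NumberFactory:
    """Hypothetical configurable factory: the classes to build are injected."""

    def __init__(self, even_cls=EvenNumber, odd_cls=OddNumber):
        self.even_cls = even_cls
        self.odd_cls = odd_cls

    def build(self, val):
        # Decide on the concrete class first, then construct it normally.
        cls = self.even_cls if val % 2 == 0 else self.odd_cls
        return cls(val)


factory = NumberFactory()
numbers = [factory.build(2), factory.build(3)]
```

Making the classes constructor arguments is one way to keep the factory configurable, for instance to substitute a test double for `EvenNumber`.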

Edit

Beauty and elegance are in the eye of the beholder, but I find your second approach better. This is a judgement call, but I think the first one fails to follow the principle of least surprise, because the constructor returns an object of a different type (which is unexpected in most languages, including Python).

The second one, however, shows a sane structure:

  • a class hierarchy exposing functionality
  • a function used to generate the appropriate objects from a source

This leads to a better separation of responsibilities and more flexibility regarding the source of data. This in turn leads to easier maintenance: if you have to add one more type, it won't affect the existing code. This is known as the open/closed principle, and it is a good thing.

The first approach makes the base class aware of its implementers, which is a very (very) bad thing, because abstract objects should not depend on implementation details.
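One possible way to get that structure while still inheriting from `ElementTree` is sketched below. It assumes the standard library's `xml.etree.ElementTree` and dispatches on the root element's tag. The tag names (`labels`, `definitions`), the `register` decorator, the `_REGISTRY` dict and `build_linkbase` are all hypothetical names chosen for the example, not part of XBRL or of any library.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical registry mapping a root tag to the class that handles it.
_REGISTRY = {}


def register(root_tag):
    """Class decorator that records which class handles which root tag."""
    def decorator(cls):
        _REGISTRY[root_tag] = cls
        return cls
    return decorator


class Linkbase(ET.ElementTree):
    """Generic linkbase; it has no knowledge of its subclasses."""


@register("labels")
class LabelLinkbase(Linkbase):
    @property
    def labels(self):
        return [e.text for e in self.getroot().iter("label")]


@register("definitions")
class DefinitionLinkbase(Linkbase):
    @property
    def definitions(self):
        return [e.text for e in self.getroot().iter("definition")]


def build_linkbase(source):
    # Parse first, then pick the class and construct it around the parsed root.
    root = ET.parse(source).getroot()
    cls = _REGISTRY.get(root.tag, Linkbase)
    return cls(element=root)


doc = build_linkbase(io.StringIO("<labels><label>Revenue</label></labels>"))
print(type(doc).__name__, doc.labels)
```

In this sketch, supporting a new kind of linkbase means adding one decorated subclass; neither `Linkbase` nor `build_linkbase` has to change, and the returned object still exposes the usual `ElementTree` methods.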

Simon Bergot
  • Yes, sure. This was for simplification. In my real scenario, I have different XML files, and I think it is better to separate their handling in different classes. – rds Feb 01 '13 at 10:53
  • Now, thanks for this answer. In a word, you're saying that it is better to have a separate factory, and not to use the `__class__` overwriting. However, as I mentioned (too quickly) in the code comment, I don't know the type of the file until I have actually parsed it. In other words, I am not able to do `val % 2` before the object is created. – rds Feb 01 '13 at 10:55
  • I don't really get your timing issue. Correct me if I am wrong, but you could define a factory method which would have your parsed xml as an input (in a dom-like format), and build the final object from this representation. What is preventing you from doing this? Regarding the `__class__` hack, you are giving a strange behavior to the Number constructor and I think it is a bad idea. – Simon Bergot Feb 01 '13 at 11:01
  • I am maybe missing something indeed. Once I have parsed the XML with etree, I will have a `etree.xml.ElementTree`. How can the factory create a `TypeADocument` or a `TypeBDocument` from this? Either `TypeXDocument` has a field that contains the etree, but I lose direct access to the etree methods (and that would be the only field, I don't think this design is elegant). Or `TypeXDocument` inherits from `ElementTree` and then I don't know how the factory can change the type, except with `__class__` – rds Feb 01 '13 at 12:06
  • If you can modify the `__class__` attribute based on some logic, you can instead call the correct constructor with the same logic at the same place (by replacing your base constructor call with a real factory call). You might read some attribute in your ElementTree and make a decision based on this. If you can provide another example illustrating your problem, maybe it will be easier to communicate. – Simon Bergot Feb 01 '13 at 12:52
  • Thanks for the exchange. I have detailed more precisely my real use case. I have also provided what would be the traditional factory pattern (if I'm correct). – rds Feb 01 '13 at 13:32
  • @rds see my update. – Simon Bergot Feb 01 '13 at 13:51

You can do this:

class MetaNumber(type):
    subclasses = {}

    def __init__(cls, name, bases, dict):
        super(MetaNumber, cls).__init__(name, bases, dict)

        # if the subclass has an attribute named type
        # store it
        try:
            type = cls.type
        except AttributeError:
            pass
        else:
            MetaNumber.subclasses[cls.type] = cls

    def __call__(cls, value):
        # lookup the appropriate type
        subclass = MetaNumber.subclasses[value % 2]
        # substitute subclass
        return super(MetaNumber, subclass).__call__(value)

class Number(object):
    __metaclass__ = MetaNumber

    def __init__(self, val):
        self.value = val


class EvenNumber(Number):
    type = 0

    def half(self):
        ''' 
        specific behaviour for even numbers
        '''
        return self.value / 2


class OddNumber(Number):
    type = 1

    def next_int(self):
        return self.value

This way:

  1. You don't have to modify __class__ of already constructed objects
  2. There is no duplication in fetching the type, because only the __call__ method checks it
  3. Simply defining the subclass with an appropriate type class attribute is sufficient to give it support.
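Note that the `__metaclass__` class attribute is Python 2 syntax; Python 3 does not treat it as a metaclass declaration, so there the code above would just build a plain `Number`. A minimal sketch of the same idea for Python 3 follows, assuming the `metaclass=` keyword in the class statement. Two small liberties are taken for this example: the `dict` parameter is renamed `namespace` to avoid shadowing the builtin, and registration uses `getattr` instead of `try`/`except`.

```python
class MetaNumber(type):
    subclasses = {}

    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        # Register any class that exposes a `type` attribute (0 or 1 here).
        key = getattr(cls, "type", None)
        if key is not None:
            MetaNumber.subclasses[key] = cls

    def __call__(cls, value):
        # Look up the appropriate subclass and instantiate it instead of `cls`.
        subclass = MetaNumber.subclasses[value % 2]
        return super(MetaNumber, subclass).__call__(value)


class Number(metaclass=MetaNumber):
    def __init__(self, val):
        self.value = val


class EvenNumber(Number):
    type = 0

    def half(self):
        """Specific behaviour for even numbers."""
        return self.value / 2


class OddNumber(Number):
    type = 1

    def next_int(self):
        return self.value


two = Number(2)
three = Number(3)
```

One consequence worth knowing: because `__call__` ignores `cls`, calling `EvenNumber(3)` in this sketch also yields an `OddNumber`.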
Winston Ewert