Personally, yes, I believe they should be tested. A common complaint about documentation comments is that the compiler does not check them. Well, here you have at least a portion of your documentation comment which can be checked by a testing framework, so do it!
There are some interesting directions in which this idea can be developed: the Python standard library has a module called doctest, which can automatically extract usage examples (which look like copy-and-pasted REPL sessions) from documentation and turn them into unit tests.
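To make that concrete, here is a minimal sketch of doctest catching a docstring that has drifted away from the code. The functions `add` and `greet` and the helper `run_examples` are made up for illustration; only `doctest.DocTestParser` and `doctest.DocTestRunner` come from the standard library.

```python
import doctest


def add(a, b):
    """Return the sum of a and b.

    >>> add(2, 3)
    5
    >>> add("doc", "test")
    'doctest'
    """
    return a + b


def greet(name):
    """Return a greeting.

    The example below is deliberately stale: the code gained a
    trailing exclamation mark, but the documentation did not.

    >>> greet("World")
    'Hello, World'
    """
    return "Hello, " + name + "!"


def run_examples(func):
    """Run the docstring examples of func (hypothetical helper).

    Returns doctest's TestResults, which has .failed and .attempted.
    """
    parser = doctest.DocTestParser()
    # Examples run in a namespace that contains only the function itself.
    test = parser.get_doctest(
        func.__doc__, {func.__name__: func}, func.__name__, None, 0
    )
    runner = doctest.DocTestRunner(verbose=False)
    return runner.run(test)


if __name__ == "__main__":
    print(run_examples(add))    # both examples pass
    print(run_examples(greet))  # doctest reports the stale example
```

In everyday use you would not write a helper like this at all: running `python -m doctest yourmodule.py`, or calling `doctest.testmod()`, collects and runs every example in a module.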
A more extreme example is the book Agile Web Development with Ruby on Rails, an entire book (not just a small documentation comment) in which every code example is fully executable. It is executable in two modes. In the first, you run the book against a version of Rails, and the framework ensures that all output matches what is printed in the book; in this mode, the book acts as a regression suite (and informal spec) for Rails. In the second, the framework automatically updates the results printed in the book to match the ones Rails produces. So you can either use the code examples to ensure that Rails doesn't change in backwards-incompatible ways, or use them to keep the examples in the book up to date with backwards-incompatible changes in Rails. In other words: the framework catches all differences for you, and you can consciously decide for each one whether to change Rails to match the book (in which case you already have a failing test case to fix) or change the book to match Rails (which the framework can do for you).
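The two modes can be sketched in a few lines. This is not the tooling used for the book, just a toy illustration of the idea: `Example`, `check`, `update`, and the `area` function are all hypothetical names, and examples are restricted to single expressions so that `eval` is enough to run them.

```python
from dataclasses import dataclass


@dataclass
class Example:
    source: str    # an expression, as printed in the document
    expected: str  # the output printed next to it


def actual_output(example, env):
    """Evaluate the example against the current code and return its repr."""
    return repr(eval(example.source, dict(env)))


def check(examples, env):
    """Verify mode: return the examples whose printed output is stale."""
    return [ex for ex in examples if actual_output(ex, env) != ex.expected]


def update(examples, env):
    """Update mode: return the examples with their output regenerated."""
    return [Example(ex.source, actual_output(ex, env)) for ex in examples]


# The "code under documentation" (hypothetical).
def area(width, height):
    return width * height


env = {"area": area}
doc = [
    Example("area(2, 3)", "6"),
    Example("area(2, 0.5)", "1"),  # stale: the code returns a float here
]

stale = check(doc, env)      # verify mode flags the second example
refreshed = update(doc, env) # update mode rewrites its printed output
```

Whether a mismatch reported by `check` means "fix the code" or "run `update`" is exactly the conscious decision described above; the tool only surfaces the difference.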
So, if you execute the code examples in your documentation, you will catch cases where the documentation no longer matches the code, and you can decide whether that was an intentional change (and update the documentation accordingly, even automatically given the right tools), a bug in the documentation, or a bug in the code (in which case the example already acts as a failing test case).