Ian Bicking: the old part of his blog

Doctest and documentation

doctest (see also this presentation) is a very clever way to do Python testing. Especially for smaller, highly decoupled functions, doctest is wonderfully easy to develop for. However, it can be a bit of a pain -- it is extremely literal. It expects the output to be exactly what you say it will be, down to the smallest bit of whitespace. I think this sucks.

This literalness is justified by the idea that doctest tests are documentation as well as tests. The documentation should be completely correct, and people shouldn't be surprised when they try the examples out themselves.

This does not give the reader the credit they deserve. Readers are highly capable of understanding that there are important aspects to documentation, and there are incidental aspects. In <SomeObject instance at 0x401edc0c> it's pretty easy to figure out that 0x401edc0c is not an important value. The accepted way to deal with this is to make sure such a value never gets printed. This also applies to all dictionaries, which have an unpredictable key order, and to output that includes blank lines, which doctest treats as the end of the expected output.

Another suggestion in future directions for doctest (from the presentation) is that there be a series of pretty-printers that display values in a canonical form, removing addresses, ordering dictionary keys, etc.

This is absolutely the wrong idea if you are creating documentation. Those explicit pretty printers or funny expressions (like foo() == {"Hermione": "hippogryph", "Harry": "broomstick"}, as opposed to printing the output of foo()) are a distraction. This is fine if you are using doctest as a unittest replacement (which is a valid way to avoid unittest's verbosity), but it's no good in documentation.
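To make the trade-off concrete, here is a small sketch of the two styles side by side. It assumes a current Python 3, where dictionaries keep insertion order, and uses the DocTestParser and DocTestRunner classes from today's doctest module. The foo() body and the count_failures helper are made up for illustration.

```python
import doctest

def foo():
    # Hypothetical function; it happens to insert "Harry" first.
    return {"Harry": "broomstick", "Hermione": "hippogryph"}

# Documentation style: show the value the reader would actually see.
PRINTED = """
>>> foo()
{'Hermione': 'hippogryph', 'Harry': 'broomstick'}
"""

# Test style: move the interesting data into the expression.
COMPARED = """
>>> foo() == {"Hermione": "hippogryph", "Harry": "broomstick"}
True
"""

def count_failures(source):
    # Parse one snippet and run it quietly, returning the number of failed examples.
    test = doctest.DocTestParser().get_doctest(source, {"foo": foo}, "sketch", None, 0)
    runner = doctest.DocTestRunner(verbose=False)
    return runner.run(test, out=lambda text: None).failed

printed_failures = count_failures(PRINTED)
compared_failures = count_failures(COMPARED)
```

The printed form fails only because the keys come out in a different order, while the comparison form passes but leaves the reader looking at a bare True.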

I think instead that doctest should support both pluggable pretty printers -- pretty printers that aren't explicitly invoked in test expressions -- and pluggable string equality functions. (Actually I'm unsure about the pretty printers, but very sure about the string equality functions.)

For instance, I have a bunch of code that tests HTML output. I don't care about the order of the attributes, so I have a function that parses the HTML and ignores attribute order. That this happens magically behind the scenes doesn't bother me for documentation -- I expect my readers to ignore the same details that the equality function ignores. I don't want to use some alternate way to represent HTML -- I don't expect the readers to understand some funny nested structure. And the entire point of the narrative form of doctest is defeated if I use the equality function explicitly, because my output becomes meaningless while all information is stored in the expressions. That's awkward and ugly.
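The function I actually use isn't shown here, but a minimal sketch of the idea, assuming Python 3's standard html.parser module, might look like the following. The names html_events and html_equal are hypothetical, and the sketch makes no attempt to normalize whitespace or handle malformed markup.

```python
from html.parser import HTMLParser

class _EventCollector(HTMLParser):
    """Record tags and text, storing attributes in a dict so their order is irrelevant."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.events = []

    def handle_starttag(self, tag, attrs):
        # dict comparison ignores the order in which attributes were written.
        self.events.append(("start", tag, dict(attrs)))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

    def handle_data(self, data):
        self.events.append(("data", data))

def html_events(text):
    collector = _EventCollector()
    collector.feed(text)
    collector.close()
    return collector.events

def html_equal(want, got):
    # Two strings are "equal" if they parse to the same sequence of events.
    return html_events(want) == html_events(got)

same = html_equal('<a href="/x" class="nav">Home</a>',
                  '<a class="nav" href="/x">Home</a>')
different = html_equal('<a href="/x">Home</a>',
                       '<a href="/y">Home</a>')
```

Here same is True because only the attribute order differs, and different is False because the href values differ.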

In other cases, people may be using floats but aren't concerned with the fact that binary floating point cannot represent most decimal fractions exactly (e.g., 1.9 is displayed as 1.8999999999999999). This detail about float implementations matters a lot in some circumstances, but it should be up to the documentation author whether they want to highlight that difference. For this case an equality test might look like:

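def floatEqual(s1, s2):
    # Compare expected and actual output as numbers rather than as strings,
    # so "1.9" and "1.8999999999999999" count as the same value.
    try:
        return float(s1) == float(s2)
    except ValueError:
        # At least one string is not a float literal, so this rule doesn't apply.
        return False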
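As a rough sketch of how such a function could be plugged in: the doctest module in current Python exposes an OutputChecker class that a DocTestRunner accepts through its checker argument. The FloatChecker subclass and count_failures helper below are hypothetical names, and float_equal is just a copy of the function above so the block stands alone. Modern Python prints 1.9 as 1.9, so the roles are reversed here: the documentation carries the long form and the interpreter produces the short one.

```python
import doctest

def float_equal(want, got):
    # Same idea as floatEqual above: compare as numbers, not as text.
    try:
        return float(want) == float(got)
    except ValueError:
        return False

class FloatChecker(doctest.OutputChecker):
    def check_output(self, want, got, optionflags):
        # Try the normal literal comparison first, then fall back to floats.
        if super().check_output(want, got, optionflags):
            return True
        return float_equal(want, got)

SOURCE = """
>>> 1.9
1.8999999999999999
"""

def count_failures(source, checker=None):
    # Parse the snippet and run it quietly, returning the number of failed examples.
    test = doctest.DocTestParser().get_doctest(source, {}, "sketch", None, 0)
    runner = doctest.DocTestRunner(checker=checker, verbose=False)
    return runner.run(test, out=lambda text: None).failed

strict_failures = count_failures(SOURCE)
lenient_failures = count_failures(SOURCE, FloatChecker())
```

With the default checker the example fails on the differing text; with FloatChecker it passes, and the documentation itself never mentions the equality function.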

Wildcards are another useful way to highlight the interesting part of the output. I've used ... as a wildcard that's documentation-friendly, so you might use the output <SomeObject instance at ...>.
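For comparison, the doctest module in current Python has an ELLIPSIS option flag that treats ... in expected output as a wildcard. The sketch below assumes that module; the SomeObject class with its hand-written repr and the count_failures helper are stand-ins invented for illustration.

```python
import doctest

class SomeObject:
    def __repr__(self):
        # Mimic the repr quoted above; the address changes between runs.
        return "<SomeObject instance at 0x%x>" % id(self)

LITERAL = """
>>> SomeObject()
<SomeObject instance at 0x401edc0c>
"""

WILDCARD = """
>>> SomeObject()
<SomeObject instance at ...>
"""

def count_failures(source, optionflags=0):
    # Parse the snippet and run it quietly, returning the number of failed examples.
    globs = {"SomeObject": SomeObject}
    test = doctest.DocTestParser().get_doctest(source, globs, "sketch", None, 0)
    runner = doctest.DocTestRunner(verbose=False, optionflags=optionflags)
    return runner.run(test, out=lambda text: None).failed

literal_failures = count_failures(LITERAL)
wildcard_failures = count_failures(WILDCARD, doctest.ELLIPSIS)
```

The literal version fails because the address never matches, while the wildcard version passes and still reads naturally as documentation.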

I played with some of these ideas in doctesthacks, which, as the name implies, is a bit of a hack. Unfortunately, doctest wasn't designed to be modified in this way.

(I was reminded of these testing issues after reading about std.utest, which is a unittest alternative.)

Update: maybe some of these changes are in the works?

Created 08 Aug '04
Modified 14 Dec '04

Comments:

There has been a recent spurt of doctest development in Python CVS:

http://cvs.sourceforge.net/viewcvs.py/python/python/dist/src/Lib/doctest.py

(A much appreciated spurt.)
# Michael