Ian Bicking: the old part of his blog

Decomposing Document Generation

The Python code documentation systems available are... scattered. And not very good. And definitely not well maintained. Lately when I see a hard problem that isn't getting solved well, I like to think of smaller problems that maybe could be solved.

Here's a rough outline of what I think a Python documentation generation system should look like:

I prefer a system based on static publishing because it's easy to maintain, and works well with systems like SourceForge. But in theory pieces of this system could be dynamic.

This bears some similarity to how PythonDoc is laid out, but it hasn't been clear to me how that project fits all these pieces together. Also some portions don't seem pluggable; e.g., it prefers its own markup. But that might be a misperception.

Created 30 Jun '06

Comments:

I think the issue is that you can't really generate useful documentation from code, so I think any system designed to do so will be only slightly useful at best. I'd rather spend my time on quality hand-written docs and let those interested in details just read the source.

# Jacob Kaplan-Moss

I think reference documentation taken from docstrings can be useful in some situations. It's also way easier to keep accurate, in my experience. None of which is to say that it is or should be the complete documentation. This is why the factoring I propose has documentation generated from multiple sources, including static text, and potentially text in different markup formats. So one document extractor looks at the source and gets docstrings and maybe comments and whatnot. Another looks for .txt files, and maybe renders them via some configured markup (reST, Markdown, etc.). Another might just pass through already-written HTML. This factoring doesn't propose any one right way to get the content, or even that one project will use just one content generator.
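As a rough illustration of that factoring, here is a minimal sketch in which each content generator is just a function that yields pages. Every name in it (Page, docstring_pages, text_pages, passthrough_pages, plain_paragraphs, collect) is hypothetical and made up for this example, and plain_paragraphs is only a stand-in for whatever markup renderer a project would actually configure.

```python
import ast
import html
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, List, Mapping


@dataclass
class Page:
    """One unit of generated documentation (hypothetical structure)."""
    name: str
    body_html: str


def docstring_pages(module_name: str, source: str) -> Iterator[Page]:
    """Yield one page per docstring found in the source, without importing it."""
    tree = ast.parse(source)
    documented = (ast.Module, ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef)
    for node in ast.walk(tree):
        if isinstance(node, documented):
            doc = ast.get_docstring(node)
            if doc:
                # Module nodes have no name attribute, so fall back to module_name.
                name = getattr(node, "name", module_name)
                yield Page(name, "<pre>%s</pre>" % html.escape(doc))


def text_pages(files: Mapping[str, str], render: Callable[[str], str]) -> Iterator[Page]:
    """Render hand-written text files through a configured markup function."""
    for name, text in files.items():
        yield Page(name, render(text))


def passthrough_pages(files: Mapping[str, str]) -> Iterator[Page]:
    """Pass already-written HTML through untouched."""
    for name, markup in files.items():
        yield Page(name, markup)


def plain_paragraphs(text: str) -> str:
    """Stand-in renderer; a real setup would plug in reST or Markdown here."""
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    return "\n".join("<p>%s</p>" % html.escape(p) for p in paragraphs)


def collect(sources: Iterable[Iterable[Page]]) -> List[Page]:
    """Merge the output of any number of content generators."""
    return [page for source in sources for page in source]


SOURCE = '''"""Example module."""

def greet(name):
    """Return a greeting."""
    return "Hello, " + name
'''

pages = collect([
    docstring_pages("example", SOURCE),
    text_pages({"intro.txt": "Welcome.\n\nRead on."}, plain_paragraphs),
    passthrough_pages({"faq.html": "<h1>FAQ</h1>"}),
])
```

The file contents are passed in as in-memory strings only to keep the sketch self-contained; a real generator would read them from disk. Nothing downstream of collect needs to know which generator produced a page.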

# Ian Bicking

I dunno. Having looked at

http://www-cs-faculty.stanford.edu/~uno/cweb.html

and some of its beautiful output, I'd opine that documentation, like the code itself, requires effort, and is worth it.

# Chris Smith

Note that there is a CWEB-tool written in Python called LEO. http://webpages.charter.net/edreamleo/front.html

It supports code written in Python and several other languages, and it also supports several variations of CWEB-like systems. One really nice thing about using LEO is that you can keep adding documentation as you work. No need for a separate history file. This does mean that you end up with code that has huge amounts of comments, but since 99% of the time you are running from .pyc files, it doesn't matter. And since LEO allows you to structure your documentation in any way you want, you can do things like integrate bugfix documentation into the document set. If you see that a particular file was modified to fix bug X-301, then you can easily look up bug X-301 and see what files were affected and what notes the programmer added.

# Michael Dillon

This is a typical problem with FOSS, and one worth solving well; otherwise there'll just continue to be many fragmented solutions. I think the future lies in integrating code, developer documentation, tests, and requirements in such a way that there's minimal duplication.

# Bjørn Stabell

My pydoctor tool (google for it...) does some of this, but hardly all of it. It's mainly targeted at documenting Twisted so far, and you can see its documentation for that at:

http://starship.python.net/crew/mwh/apidocs-s/

It works by analyzing a bunch of modules or packages into this fairly large ball of information called a "System" (which can be pickled) and then working over that to produce HTML. It's at least somewhat pluggable. It supports epytext (epydoc's markup system) in docstrings, and you can use that to link to stuff reasonably freely.
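The two-stage shape described here (analyze into a picklable structure, then render from it) can be sketched roughly as below. This is not pydoctor's code or API: the analyze and render_html functions and the plain-dict "system" are assumptions invented for the illustration, and the analysis only follows directly nested class and function definitions.

```python
import ast
import html
import pickle
from typing import Dict, Mapping


def analyze(modules: Mapping[str, str]) -> Dict[str, Dict[str, str]]:
    """Stage 1: parse module sources into one plain, picklable dict."""
    system: Dict[str, Dict[str, str]] = {}

    def visit(parent: ast.AST, prefix: str) -> None:
        # Only definitions that are direct children of the parent are recorded.
        for node in ast.iter_child_nodes(parent):
            if isinstance(node, (ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef)):
                full_name = prefix + "." + node.name
                kind = "class" if isinstance(node, ast.ClassDef) else "function"
                system[full_name] = {"kind": kind, "doc": ast.get_docstring(node) or ""}
                visit(node, full_name)

    for name, source in modules.items():
        tree = ast.parse(source)
        system[name] = {"kind": "module", "doc": ast.get_docstring(tree) or ""}
        visit(tree, name)
    return system


def render_html(system: Mapping[str, Mapping[str, str]]) -> str:
    """Stage 2: work over the analyzed data to produce HTML."""
    items = []
    for name in sorted(system):
        entry = system[name]
        items.append("<dt>%s <em>(%s)</em></dt><dd>%s</dd>" % (
            html.escape(name), entry["kind"], html.escape(entry["doc"])))
    return "<dl>\n%s\n</dl>" % "\n".join(items)


SOURCE = '''"""Toy module."""

class Greeter:
    """Says hello."""

    def greet(self):
        """Return a greeting."""
        return "hello"
'''

system = analyze({"toy": SOURCE})
# The analysis result can be saved and reloaded before rendering.
restored = pickle.loads(pickle.dumps(system))
page = render_html(restored)
```

Keeping the analyzed data as plain dicts and strings means it pickles without any custom classes, so the slow analysis step and the rendering step can run separately.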

I hadn't thought about integrating it into distutils before; that's quite a neat idea.

# Michael Hudson