Ian Bicking: the old part of his blog

Re: What I've Been Up To

html5lib is cool too as an HTML parser, and important as a reference implementation, but I'll mostly be waiting for that work to filter back into something like libxml2.

I should stress that html5lib is not a reference implementation in the traditional sense of the word. It is the first public implementation of the HTML5 spec and has a pretty valuable set of tests, but it has no other status at all; any disagreements with the spec are bugs in html5lib. In particular, testing implementations against html5lib output is not recommended.

There is also some work under way on a fast implementation of the HTML5 spec's parsing algorithm in C; I will be first in line to make Python bindings when that work is complete (and no doubt others will come up with other bindings too).

Comment on What I've Been Up To
by jgraham