Atom has an extension for threading (via). This is great. When I was building an importer for this blog (from another blog package, where I wrote the export by accessing the database directly) I used Atom, but had to invent something to match comments up to their respective posts. Relating comments to their posts was the only place where there seemed to be information loss in dumping the entire blog to Atom. So I made
So, the extension seems to look like this:
<feed xmlns:thr="http://purl.org/syndication/thread/1.0">
  <entry>
    <thr:in-reply-to ref="post-id" />
    ....
  </entry>
</feed>
With this I think most blogs can be 100% represented by an Atom model.
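To make the comment-to-post matching concrete, here is a minimal Python sketch of how an importer might group comment entries under their posts using `thr:in-reply-to`. The sample feed, the entry ids, and the `group_comments` helper are all made up for illustration; a real importer would also want to handle missing ids and replies to other comments.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

ATOM = "http://www.w3.org/2005/Atom"
THR = "http://purl.org/syndication/thread/1.0"

# Hypothetical sample feed: one post and two comments that reply to it.
SAMPLE_FEED = """\
<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:thr="http://purl.org/syndication/thread/1.0">
  <entry>
    <id>post-id</id>
    <title>A post</title>
  </entry>
  <entry>
    <id>comment-1</id>
    <title>First comment</title>
    <thr:in-reply-to ref="post-id" />
  </entry>
  <entry>
    <id>comment-2</id>
    <title>Second comment</title>
    <thr:in-reply-to ref="post-id" />
  </entry>
</feed>
"""


def group_comments(feed_xml):
    """Split entries into posts and comments, keyed by the post they reply to."""
    root = ET.fromstring(feed_xml)
    posts = {}
    comments = defaultdict(list)
    for entry in root.findall(f"{{{ATOM}}}entry"):
        entry_id = entry.findtext(f"{{{ATOM}}}id")
        reply = entry.find(f"{{{THR}}}in-reply-to")
        if reply is None:
            # No in-reply-to element: treat the entry as a top-level post.
            posts[entry_id] = entry
        else:
            # The ref attribute holds the id of the entry being replied to.
            comments[reply.get("ref")].append(entry_id)
    return posts, dict(comments)


posts, comments = group_comments(SAMPLE_FEED)
print(comments)  # {'post-id': ['comment-1', 'comment-2']}
```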
Well... another problem is that many blogs use something other than HTML for their content: reStructuredText, Textile, Markdown, etc. To be an accurate representation, the <content> should contain that format, or multiple forms of the content should be presented, with one marked as editable. Maybe the simplest solution is to store only the canonical representation (as, say, text/x-restructured-text) and render the feed differently for actual readers. But it would be nicer if this weren't necessary.
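As a rough sketch of the "store only the canonical representation" idea, the snippet below builds an Atom entry whose <content> carries the source markup and labels it with a type attribute. The `make_entry` helper and the sample text are hypothetical, and text/x-restructured-text is just the type guessed at above, not a registered media type that I know of.

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"
# Serialize Atom elements with a default namespace rather than an ns0: prefix.
ET.register_namespace("", ATOM)


def make_entry(entry_id, title, source_text, source_type):
    """Build an Atom entry whose content is the unrendered source markup."""
    entry = ET.Element(f"{{{ATOM}}}entry")
    ET.SubElement(entry, f"{{{ATOM}}}id").text = entry_id
    ET.SubElement(entry, f"{{{ATOM}}}title").text = title
    content = ET.SubElement(entry, f"{{{ATOM}}}content")
    # The type attribute tells a consumer which markup the text is written in.
    content.set("type", source_type)
    content.text = source_text
    return entry


entry = make_entry(
    "post-id",
    "A post",
    "Some *emphasized* text.",
    "text/x-restructured-text",  # assumed type name, taken from the text above
)
print(ET.tostring(entry, encoding="unicode"))
```

A feed meant for actual readers would run the same source through a renderer and emit type="html" instead.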
There's also an extension in that spec for giving the comment feed URL:
<link rel="replies" href="comment-feed.xml" />
This doesn't really do anything for blog publishing software, just for readers, but it could be nice. Having per-entry comment feeds isn't very interesting right now because the subscription process is not well automated -- the typical blog subscription process is way too heavy when you are just curious about follow-up comments that might last for a week at most. This gives smart readers an opportunity to treat this case specially.
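For what a "smart reader" might do with this, here is a small Python sketch that pulls the comment feed URL out of an entry by looking for the link with rel="replies". The sample entry, the base URL, and the `replies_url` helper are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

ATOM = "http://www.w3.org/2005/Atom"

# Hypothetical entry carrying both a normal link and a replies link.
SAMPLE_ENTRY = """\
<entry xmlns="http://www.w3.org/2005/Atom">
  <id>post-id</id>
  <title>A post</title>
  <link rel="alternate" href="post.html" />
  <link rel="replies" href="comment-feed.xml" />
</entry>
"""


def replies_url(entry_xml, base_url):
    """Return the absolute URL of the entry's comment feed, or None."""
    entry = ET.fromstring(entry_xml)
    for link in entry.findall(f"{{{ATOM}}}link"):
        if link.get("rel") == "replies":
            # The href may be relative, so resolve it against the feed's URL.
            return urljoin(base_url, link.get("href"))
    return None


print(replies_url(SAMPLE_ENTRY, "http://example.com/blog/"))
# http://example.com/blog/comment-feed.xml
```

A reader could then poll that one URL for a week or so and quietly drop it afterwards, instead of making the user go through a full subscription.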
A while ago there was some talk on the Pylons list about blog publishing software, and I brainstormed a kind of distributed set of pieces to implement a blog. The Atom Publishing Protocol and an Atom store could potentially be all the persistence you need, though I'm unclear on how you'd do general querying and reporting with the APP. Querying is probably best implemented with ad hoc systems for now.
Have you looked at demokritos?
having per-entry comment feeds isn’t very interesting right now because the subscription process is not well automated
Another problem is that it scales very poorly if a reader wants to stay informed of all comments on a weblog. You don’t know when someone will post a comment on a long-silent entry, so you would usually need to poll them all.
The APP offers enough introspection to guide the posting of Atom entries and collections, but it is silent on querying. As you say, anyone can implement an ad hoc query interface against their Atom store. No surprise that Google has offered a query and authentication interface: GData.