Ian Bicking: the old part of his blog

Distributing dependencies

When I cut a release for WSGIKit, I want the experience to be pretty easy -- you download the package, run a command, and you've got a server you can play with. It's pretty close to that right now; you can run:

./server.py --webkit-dir=../examples/todo --server=wsgiutils \
    --reload -v -D

To do that I have to ship some dependencies, like a server (WSGIUtils is small, but now with TwistedWeb 2 it's pretty reasonable as well). Maybe even a couple of servers. And if I want to make use of other people's middleware or libraries, I don't want to import them into my repository and namespace. I just want to distribute the files.

One option is to use a zip file, which Python can import from. You just add the zip file to sys.path. For pure Python packages this works okay -- and if it's not pure Python it's not a simple install anyway and I don't mind leaving it out and forcing the user to install it on their own. There are some problems, however -- Eggs try to solve this, but there's still a little work to do there.
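As a rough illustration of the zip approach, here is a minimal sketch that builds a throwaway zip holding one pure-Python module and imports from it. The names deps.zip and bundled_demo are made up for the example and have nothing to do with WSGIKit.

```python
import os
import sys
import tempfile
import zipfile

# Build a throwaway zip containing a single pure-Python module.
# "deps.zip" and "bundled_demo" are hypothetical names for illustration.
tmpdir = tempfile.mkdtemp()
zip_path = os.path.join(tmpdir, "deps.zip")
with zipfile.ZipFile(zip_path, "w") as zf:
    zf.writestr("bundled_demo.py", "GREETING = 'hello from the zip'\n")

# Putting the zip file itself on sys.path is all the import system needs.
sys.path.insert(0, zip_path)
import bundled_demo  # found inside deps.zip

print(bundled_demo.GREETING)
```

This only works for pure Python code; extension modules cannot be imported straight out of a zip this way.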

Then I realized I didn't really like zip files. Zip files nested in zip files, what does that buy? And anyway, I want users to have access to all the source unpacked, regardless of its origin -- this is the transparency we like in an open source framework. They shouldn't need to look at the source. But I actually want them to, since then they are more likely to contribute, and to generally feel comfortable with the infrastructure.

So, my plan is to put all the external packages in a 3rd-party directory. I named it so it's an invalid Python package, and people can't import from it directly. Instead, for each package there's a subdirectory that is added if we need to load the package, so 3rd-party/wsgiutils-files/ is added to sys.path if wsgiutils can't be loaded the normal way. I'm 100% okay with adding 100K to the download to make the experience more pleasant -- heck, I'm probably okay with something much larger than 100K, it's only disk space and bandwidth, and both of those are cheap these days. I made a little module to support this -- no magic, nothing very fancy at all.
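A support module along these lines could look roughly like the sketch below. This is not the actual WSGIKit module: the function name require, its signature, and the assumption that every bundled package lives in a "<package>-files" subdirectory are all made up here to match the description above.

```python
import importlib
import os
import sys


def require(package, third_party_dir="3rd-party"):
    """Import `package`, falling back to a bundled copy if it is missing.

    Hypothetical helper: the name, signature, and "<package>-files"
    layout are assumptions based on the description in the post.
    """
    try:
        # Prefer whatever the user already has installed.
        return importlib.import_module(package)
    except ImportError:
        bundled = os.path.join(third_party_dir, package + "-files")
        if not os.path.isdir(bundled):
            # No bundled copy either, so re-raise the original error.
            raise
        # Append rather than prepend so normally installed packages
        # keep priority over the bundled ones.
        sys.path.append(bundled)
        importlib.invalidate_caches()
        return importlib.import_module(package)
```

With that in place, startup code would call something like require("wsgiutils") instead of a plain import, and 3rd-party/wsgiutils-files/ would only land on sys.path when the normal import fails.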

Created 30 Mar '05

Comments:

I would rather see a separate tarball labeled wsgikit-demo-A.B.C.tar.gz. If I understand WSGIKit correctly, it's a framework, not an application. If you were distributing a fully independent, batteries-included application for Windows, then including all of the dependencies in a convenience package might be worth it. However, as a framework library, I would be disappointed to see cruft in the tarball. It would simply take up mirror space and filesystem space for those who wish to create binary packages of your library.

There was recently a discussion on the gnu-prog list about trying to formalize dependency requirements for applications in the form of some XML or RFC-822 document in the top-level directory of the tarball. Most of the ideas were shot down over implementation details, complaints about duplication of effort, and concerns about making the build process unnecessarily complicated.

Perhaps if you could leverage PyPI information regarding package versions and locations, you could bootstrap the retrieval and installation of the necessary software for your demo; something you could optionally tie in to setup.py.

Really, what it boils down to is that you need to inform your user of the necessary information rather than trying to hold his/her hand throughout the process.

# Chad Walstrom

I want a pleasing immediate experience, so informing the user isn't really what I'm looking to do. OTOH, it probably makes sense to package two tarballs. Either way the package includes just one setup.py, and it doesn't install any third-party packages; it would only make a difference when you run the server out of the unpacked directory.

OTOH, Twisted is looking pretty large -- 800K compressed, to include twisted core, twisted.web, and Zope interfaces. Which is disappointing. WSGIUtils and a couple other servers are only a file, so it's easier.

# Ian Bicking

Eggs aren't tightly bound to zip files. If it's a directory instead of a zip file, it works identically.

# Bob Ippolito