Ian Bicking: the old part of his blog

Re: site-packages Considered Harmful

The more I've worked with installation and deployment and setuptools, the more I think site-packages (as in having /usr/lib/python2.4/site-packages on the path) is a bad idea,

setuptools is the problem, not site-packages. You're trying to do the work of a distribution -- managing having compatible libraries and applications all on one system. I was quite literally angry when I installed recent versions of SQLObject and saw that config.py was all of a sudden deciding to download setuptools, then setuptools downloaded formencode. It gave me a 15 second window, but that's no excuse for doing something you should never ever do. It was almost infuriating enough to stop using SQLObject, and I do think that setuptools should be banned.

Here's the thing: Python tools shouldn't address problems already solved. The issue of handling multiple versions of libraries on a system is one that is far outside the scope of Python, and exists for pretty much every library and system. There are also solutions outside of the scope of Python -- the biggest being (1) naming major version differences, like pysqlite2, as modules themselves, and (2) using a distribution. If your library isn't backwards compatible, put the version in its name -- problem solved.
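To illustrate option (1): when a major version ships under its own module name, calling code can simply pick whichever name is installed. This is only a minimal sketch. The helper `import_first_available` is a hypothetical name, and falling back to the standard-library `sqlite3` module is an assumption made so the example runs without pysqlite2 installed.

```python
import importlib


def import_first_available(*module_names):
    """Return the first module in module_names that can be imported.

    Hypothetical helper: each name stands for one incompatible major
    version that was published under its own module name.
    """
    for name in module_names:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue  # this version is not installed, try the next name
    raise ImportError("none of %s could be imported" % ", ".join(module_names))


# Prefer the separately named pysqlite2 package; fall back to the
# standard-library sqlite3 module (an assumption for this sketch).
dbapi = import_first_available("pysqlite2.dbapi2", "sqlite3")
print("using", dbapi.__name__)
```

Because the two versions have different names, both can sit in the same site-packages directory without conflict, which is the point being made above.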

As for the rest of it, apt-get, RPM, emerge -- all these tools manage dependencies and conflicts very well, and they work for every application, not just Python ones. They'll always do a better job than setuptools, and they handle all development platforms, unlike setuptools. Setuptools is a horrible, categorically bad idea. Stop using it right now.

Two cents.


Comment on site-packages Considered Harmful
by Ken Kinder


Distributions aren't doing this work, they aren't trying to do this work, and they are so far from doing this work that it is laughable to expect any resolution to come from that direction. They aren't even trying!

What major packaging system allows for multiple parallel installations of the same library? Not just in Python, in anything? Certainly Debian does not. The only way they do it is by changing the name of the package itself, and then because of site-packages they not only have to change the name of the distribution package, but the actual Python package. That's not something you do incrementally or lightly.

Current packaging systems are good at maintaining the global state and coupled software we've already created. That is good, because that software can be hard to maintain otherwise. But in fixing that problem they introduce many limitations, limitations that anyone who has used those systems knows about. And honestly, they don't offer that much in return. How often do you find the library you want available in a current version? In my experience it is quite uncommon.

Maybe I would be more optimistic about current distribution formats if I felt people involved in these packaging projects -- Red Hat, Debian, Ubuntu developers, and so forth -- were trying a little harder to address these problems. (Well, credit to Ubuntu for trying harder than the others.) But I only find out when a library I write shows up in a distribution indirectly, and I don't get any feedback about synchronizing build processes, creating consistent metadata, or anything else. I don't see them trying to push for better standards about how packages get installed, or providing general feedback on these issues.

And distributions themselves would save a lot of work by going down exactly the path I describe. When there's a Python app you want to install, just package it up as one big bundle, and don't try to separate it out into its respective libraries. That saves everyone time and effort.

# Ian Bicking

Plus, waiting for distributions to solve it introduces a massive time lag. I'm perpetually annoyed with installation of packages, because I really don't like putting non-distribution packages into distribution directories like /usr. Putting things in /usr/local still means, essentially, that you need to be root, and giving one particular project a Python module that it needs is a massive pain. I too use virtual-python.py, and it's a solution, but (and do please take this the way it's meant) it's a nasty hack rather than a proper way to solve the problem. It ought to be easy to, say, drop TurboGears stuff in a directory of my project and have it get used, and it completely is not easy. Setuptools does not help here one little bit, because it thinks it's managing the One Central Python and has to be contorted into a per-user basis; it can't do per-project installation of extra packages without further contortions.
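For illustration, the "drop it in a directory of my project" idea can be approximated by hand by putting a project-local directory at the front of `sys.path`. This is only a rough sketch. The function name `use_project_libs` and the `lib` directory name are assumptions, and it handles plain modules and packages only (no `.pth` files or dependency resolution).

```python
import os
import sys


def use_project_libs(project_root, dirname="lib"):
    """Put <project_root>/<dirname> at the front of sys.path.

    Hypothetical helper: anything dropped into that directory then
    shadows the copy (if any) installed in site-packages.
    """
    lib_dir = os.path.join(os.path.abspath(project_root), dirname)
    if os.path.isdir(lib_dir) and lib_dir not in sys.path:
        sys.path.insert(0, lib_dir)
    return lib_dir


# Example: treat the current working directory as the project root.
# If there is no "lib" directory there, sys.path is left untouched.
use_project_libs(os.getcwd())
```

Every script in the project has to make this call before its other imports, which is part of why it feels like a workaround rather than a proper per-project installation.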
# Stuart Langridge

What major packaging system allows for multiple parallel installations of the same library? Not just in Python, in anything?

I'm not sure if you would consider it a "major packaging system," but GoboLinux does this fairly well. It installs each package in its own directory and links to the current 'default' version. But, other than that, I haven't seen anything.
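The layout described here, one directory per version plus a link naming the default, can be sketched in a few lines. This is a toy model in a temporary directory, not GoboLinux's actual tooling. The function `install_version`, the `Current` link name, and the package and version strings are illustrative assumptions.

```python
import os
import tempfile


def install_version(root, package, version, make_default=False):
    """Create root/package/version; optionally point root/package/Current at it."""
    version_dir = os.path.join(root, package, version)
    os.makedirs(version_dir, exist_ok=True)
    if make_default:
        current = os.path.join(root, package, "Current")
        if os.path.islink(current):
            os.remove(current)  # drop the old default link first
        os.symlink(version, current)  # relative link, so the tree can be moved
    return version_dir


root = tempfile.mkdtemp()
install_version(root, "FreeType", "1.3", make_default=True)
install_version(root, "FreeType", "2.1", make_default=True)

# Both versions stay on disk; only the default link changes.
installed = sorted(
    entry for entry in os.listdir(os.path.join(root, "FreeType"))
    if entry != "Current"
)
default = os.readlink(os.path.join(root, "FreeType", "Current"))
print(installed, "default:", default)
```

Installing a new version never overwrites an old one, and switching the default is a single link update.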


# Matthew Marshall

What major packaging system allows for multiple parallel installations of the same library? Not just in Python, in anything?

Gentoo's portage has SLOTs. AFAIK this is exactly it. I have many standard libs in multiple versions on my system.

With Portage different versions of a single package can coexist on a system. While other distributions tend to name their package to those versions (like freetype and freetype2) Portage uses a technology called SLOTs. An ebuild declares a certain SLOT for its version. Ebuilds with different SLOTs can coexist on the same system. For instance, the freetype package has ebuilds with SLOT="1" and SLOT="2" [1].
# Fabian

I don't know if it counts as a "major packaging system" (I guess not if you're referring to OS systems), but RubyGems seems to be designed to do just this:


But it of course requires doing things with require since it's not built into the language (at this point). I can imagine a similar tack in Python giving people fits.

# ToddG

Setuptools has a very similar scope to Gems, with very similar motivations, and probably several similar techniques.

# Ian Bicking

I should also note that there is work on this kind of isolation in a general way, in the form of people setting up full OS virtualization. This is kind of heavy handed, but then that's where the work is happening, to make this a reasonably efficient and manageable way to deploy things. It still can't be as granular as a per-script setup. But it provides many of the same advantages on a larger scale.

# Ian Bicking

The idea is less to have every single version of a library installed that each package might have been developed with. The entire purpose of shared libraries is that the entire system uses one library. You can always just bundle your own private libraries with your application, which is kind of wasteful on disk space, but useful when you don't know the destination environment...

# Ken Kinder

Debian tries to limit the number of separate versions of a library in order to avoid the need to update many different versions when (I daren't say if) there is a need for a security patch. If the upstream developer keeps breaking source compatibility between versions, the library probably doesn't belong in Debian (or in any non-toy application).

# Ben Hutchings