Ian Bicking: the old part of his blog

Bundles Schmundles

There's another packaging article up on Slasdot, this time about the ROX Zero Install system. Zero Install is kind of like MacOS X application bundles, though with an important added feature: instead of explicitly copying bundles into your application folder, you only run them remotely (NFS?), with heavy caching so that effectively you are running them locally (but updates can happen magically). As usual there's lots of people talking about how wonderfully simpler MacOS packages are compared to Linux.

In some ways this seems better than Mac's system, because it saves a lot of management issues, including dependency (since one bundle can refer to another bundle, and installation is implicit). Though it opens up a lot of issues as well -- what happens when different updates conflict, e.g., package A and B both depend on package C, but A and B both depend on different versions. This will be quite common as the packages are incrementally updated by their respective authors. Explicit updating of installed software gives an opportunity to manage these conflicts.

Ultimately, I think bundles are the wrong solution to packaging. They handle the simple case, and that's not very interesting. But part of why they appear to be a good solution on MacOS X is because that is largely a proprietary platform. Proprietary applications tend to be more self-contained, and include all their dependencies, because proprietary software makers can't cooperate with each other (there's just little incentive). Open source applications, however, are part of a whole community of software, and have extensive dependencies and interactions. Because there's no commercial overhead, the factoring of applications also tends to be more fine-grained.

But even in the proprietary world, MacOS's system is deficient. Installers are still fairly common -- perhaps those installers just place the bundle in the right location, but that is impossible for me to know. And there's no common way to choose or understand important configuration settings: Which application is the default handler (for a mime type, email, etc)? If I install a plugin, how does it know which browsers I have installed? If I install a service, how do I know it doesn't conflict with other services? (Only one service per port, after all)

Obviously much of Linux software can't be put in bundles currently. Part of the ideal of bundles is that they encourage decoupled application design, so even if we can't do it now, maybe the discipline would make our software better. But decoupling simply isn't possible, and the entire point of a packaging system is to robustly deal with the times when decoupled applications aren't possible: to deal with the system. Naive systems like bundles simply aren't robust with respect to conflicts and application interactions, even as simple as they seem from the outside. Software that isn't robust isn't user friendly, especially when it comes to system management (a task every user has and no user wants).

If packaging is seen as too complicated, I think that's because we pay too much attention to the filesystem. The user has no place in the modern filesystem. There's nothing useful for them to do there. There are some select places where they can do something -- /etc, some parts of /var (e.g., /var/www), and /home. That's it. (And while paths differ, the same is largely true for Windows and MacOS as well.) All the other parts of the filesystem are managed by specific applications (and even /etc should really be managed as well). Even inside /home there are managed areas -- Library on MacOS, or a pile of .whatever files in Linux.

The filesystem isn't even any good for users where they should have control. Files that users manage need to be versioned and well indexed -- things that aren't particular important for application files. The dumb pile-of-bits filesystems that we use are perfectly fine for storing program files and associated data, and there's no reason we should forgo that. But we shouldn't ask more of that pile of bits than it can provide. Bundles turn the file system into a package manager. But it should be fairly obvious that the file system (or the Finder or ROX filer) isn't going to be the kind of rich interface necessary for this most essential of system administration tasks.

Created 04 Apr '04
Modified 14 Dec '04

Comments:

An important feature of application packages is the separation of authority between the OS and the application. The installer needs to protect the system from the vendor. Most of the problems with Windows applications arise because the vendor-provided application installers have always been allowed to walk all over the operating system. RPM was a great leap forward, since the executable code was provided by the system and the package contained only information. Essentially, packages are interpreted by the RPM runtime. This leaves control in the hands of the root user, not the app provider. The Amiga installer was actually a lisp dialect interpreter! One could argue the the usefulness of the the Gentoo Portage system comes from layering a standard install api on code from varied sources and allowing the system to set common options across all installs easily. I'm sure Debian users would preach the joys of that system as well.

Dependencies are clearly a hard problem, especially with binary packages, since the unix idiom is that all options and dependencies need to be configured at compile-time. That problem really centers around code design, not packaging. The best that a packaging system can do is NOT MAKE IT WORSE. A good packaging system makes a system manageable by putting the authority in the hands of the operator. A weak or nonexistant packaging system leads to dll hell, packages breaking other packages and brittle, unstable systems.
# Jeff Sacksteder

Note that RPM and dpkg both allow for arbitrary code to be run on package installation or removal -- but because a lot of the process is automated by the tool, these scripts can be more limited in scope. This doesn't tend to be a problem, because package authors don't go too crazy -- it's used for things like byte compiling code, doing database upgrades, updating configuration scripts, etc. But if, say, Gator (or whatever they're called now) created a Linux package of their product, they could be nearly as evil as they are on Windows.
# Ian Bicking

I want a packaging system that relies on the features of a transactional filesystem. OK, I want a transactional filesystem too. ;-) This would make it possible to install packages without fear of fucking things up too badly on a running system, as I could just roll the back the transaction implied by the install. I can dream, anwyay.

Also, I think current Linux packaging systems like RPM and dpkg are too centered around the "one instance of this piece of software installed in a canonical location" theme. This is really nice for lots of "singleton" things that you don't really want to think too hard about (for instance, bash), but it kind of sucks if you want to install more than one instance of a piece of software on a machine or want to install as a nonroot user. Granted, it's possible to create "retargetable" RPMs and spell explicit installation locations in different ways, but most dpkg and RPM packages aren't created in ways that allow you to do this easily. I also don't want to need to install packages as the root user just to maintain the shared package metadata database... I'd rather install as a "normal" user and have per-user metadata databases that allowed me to choose the files on which the packages' dependencies relied interactively. I'll stop now, because if I don't, I'll have to do something about it. ;-)
# Chris McDonough