Patrick Logan reminded me of immediates, and how interesting they can get when you have 64 bits.
As a quick overview of an immediate -- typically an object in a dynamically-typed, reference-based language (like Python or Smalltalk) will be made up of a memory location. In memory you have information about the object type, instance variables, etc. So on every operation you get this pointer, find the location in memory, and do stuff.
With an immediate you put the entire object into the "pointer". To do this, you might reserve the high bit to signal that this is an immediate. If it's set, you have 31 bits left to encode the rest of the object -- more than enough for a significant set of integers, and potentially other objects (like True, False, and None). It gets exciting with 64 bits, because you now have 63 bits of information left, which gives a lot of room for storing different objects.
For instance, let's say you have a four-bit header for all Python references (which is enough to distinguish a string), and 30 bits to give the string's address (I have no idea how long addresses really have to be, but anyway...); that gives you 30 bits to do whatever you want with. You could store a hash value in there, which means that in many cases you could compare and use strings without going into the heap (memory) at all. You could store the string length (maybe only using some of the bits, and limited to strings under a certain length).
Immediates can generally only be used for immutable types, because an immediate isn't passed around by reference -- since the "reference" stores the object directly, it gets copied everytime it is re-references (e.g., by assigning it to a variable). This is fine for immutables, because multiple instances are indistinguishable from a single instance, but for mutable values it would cause problems.
Some of this can already be done with 32 bits -- this sort of optimization is (I believe) common in Smalltalk and Scheme interpreters, typically for encoding small integers. And I don't believe this is being done at all in the Python interpreter (though I don't know for sure -- someone can correct me in the comments if I'm wrong). Maybe something else for the thesis idea list
I'm pretty sure immediates would make the C code of Python pretty fugly (you'd have to catch them in all sorts of places), but it might be fun to try it out in something like PyPy
Someone tried to modify the C source to Python to use the least significant bit of a PyObect* and have immediate integers. It didn't make much difference to performance.
Maybe it was Jeff Epler? He's nuts enough to have tried something like this :-)
Also, note that not every architecture in the world is byte addressed (though this is less of an issue if you use high bits as tags, but then you need more control over how your address space is layed out than you get from C).
That's a good point. The AltiVec unit in the PPC is 128 bit aligned, it completely ignores the last four bits. malloc() on OS X is guaranteed to give 128 bit aligned pointers for this reason.
Don't forget Psyco. Psyco doesn't use tag bits, but it does "virtualize" objects as simpler datatypes.# Phillip J. Eby