Kevin: I don't see how state machines will help. Without threads the GIL doesn't matter, but you'll still be wasting that second processor. Removing the GIL is about letting threads utilize the second processor; the only other way to use it is with multiple processes.
Steven: I don't really know what your data set looks like -- if there's a large piece of data that your worker threads/processes all need access to, then I imagine it could be a problem. But I'd expect there isn't a big data set; you simply need the worker threads/processes to search for dependencies, which they do by searching for specific patterns. The actual data communicated isn't very large -- filenames go out to the workers, and lists of the files they depend on come back.
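A minimal sketch of the kind of pattern search described above, assuming C-style `#include` lines as the dependency pattern (an illustrative assumption, not SCons' actual scanner):

```python
import re

# Hypothetical example: find the files a C-style source file depends on
# by matching #include "..." lines. The pattern is an assumption for
# illustration only.
INCLUDE_RE = re.compile(r'^\s*#\s*include\s+"([^"]+)"', re.MULTILINE)

def scan_dependencies(text):
    """Return the list of filenames referenced by #include "..." lines."""
    return INCLUDE_RE.findall(text)
```

The point is that the worker's result is just a short list of filenames, so very little data needs to cross the thread or process boundary.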
And even if you can't use fork, you can still spawn new processes (if not as quickly). Instead of a thread pool you use a process pool, where worker processes wait around for something to do. If you have a persistent server of some sort, this shouldn't be a problem.
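A process pool along those lines can be sketched with the standard multiprocessing module; the `#include` pattern here is again an illustrative assumption, and the files to scan are taken from the command line:

```python
import re
import sys
from multiprocessing import Pool

# Assumed dependency pattern, for illustration only.
INCLUDE_RE = re.compile(r'^\s*#\s*include\s+"([^"]+)"', re.MULTILINE)

def scan_file(path):
    """Worker task: read one file, return (path, list of its dependencies)."""
    with open(path) as f:
        return path, INCLUDE_RE.findall(f.read())

if __name__ == "__main__":
    # Filenames go out to the worker processes; dependency lists come
    # back -- only small amounts of data cross the process boundary.
    with Pool() as pool:
        for path, deps in pool.map(scan_file, sys.argv[1:]):
            print(path, deps)
```

`Pool()` keeps a set of worker processes alive and hands them tasks as they become free, which is exactly the "workers wait around for something to do" arrangement.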
But the problem persists if you want the entire process to start up quickly. Python doesn't start up quickly, and the harder you work the longer it takes -- building up process pools and the like only makes it worse. And since you're probably concerned about latency, not throughput, to a degree this may describe SCons'... anyway, I suppose my boo hoo comment was a little callous ;) But there's always a clever solution, and it's easier to implement those solutions in Python than in the interpreter (and probably in every C extension for Python).