Ian Bicking: the old part of his blog

Re: Towards PHP

A bunch of random thoughts.

Comment on Towards PHP
by Harry Fuecks

Comments:

On the DB front, why not forget ORM for a moment and try something else?

Yes, I can see the benefit of that. The DB-API is not too bad, but it can also be needlessly verbose for many cases, and even just a thin wrapper would help. Especially for people who are already very comfortable with SQL. Not that a super-simple ORM would be without purpose; I still think it just feels much nicer in many cases. But a super-simple ORM would also not be entirely sufficient for even fairly simple situations, and such a thing should also work comfortably alongside explicit SQL.
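To make "a thin wrapper" a bit more concrete, here is a minimal sketch of what one might look like over a DB-API connection. The `DB` class and its `query`/`execute` methods are hypothetical names made up for illustration, and it uses `sqlite3` only because it ships with Python. Note that the `?` placeholder style is what `sqlite3` uses; other DB-API drivers may use a different paramstyle.

```python
import sqlite3


class DB:
    """Hypothetical thin wrapper over a DB-API connection (names are illustrative)."""

    def __init__(self, conn):
        self.conn = conn

    def query(self, sql, params=()):
        """Run a SELECT and return a list of dicts keyed by column name."""
        cur = self.conn.cursor()
        try:
            cur.execute(sql, params)
            names = [d[0] for d in cur.description]
            return [dict(zip(names, row)) for row in cur.fetchall()]
        finally:
            cur.close()

    def execute(self, sql, params=()):
        """Run a statement that changes data, commit, and return the row count."""
        cur = self.conn.cursor()
        try:
            cur.execute(sql, params)
            self.conn.commit()
            return cur.rowcount
        finally:
            cur.close()


# Explicit SQL stays front and center; the wrapper only removes cursor boilerplate.
db = DB(sqlite3.connect(":memory:"))
db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
db.execute("INSERT INTO users (name) VALUES (?)", ("harry",))
rows = db.query("SELECT id, name FROM users WHERE name = ?", ("harry",))
print(rows)
```

Something this small would sit comfortably next to a simple ORM, since both would share the same underlying connection.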

Security is an interesting problem. I think it would be nice to have a trusted (high-privilege) parent process that creates lower-privilege subprocesses. It would be adaptive, so if user X had a lot of activity and several active processes, but then user Y's scripts got some use, we'd just kill the user X processes. You can even go down to zero processes, so that really low-usage environments have zero overhead until they are woken up. Similarly you cull old processes, and processes that are not responding in a timely manner, or maybe those that are taking up too much memory. Though memory seems hard to track -- since PHP throws everything away after each request, memory isn't as big a deal there. Simply culling all processes after some set number of requests would help. Processes would still have to "warm up" -- responding to their first request rather slowly. But in terms of scaling that doesn't seem too bad, and in terms of user experience an occasional slow request isn't the end of the world.
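A rough sketch of the bookkeeping such a parent process might do. This only models the decisions (who gets a worker, who gets culled); actually spawning and killing processes is left out. The `WorkerPool` name, the one-worker-per-user simplification, and the least-recently-used eviction rule are all assumptions made for illustration, not a description of any existing server.

```python
from collections import OrderedDict


class WorkerPool:
    """Bookkeeping only: decides which per-user workers to keep or cull.

    Hypothetical sketch; one worker per user, no real processes involved.
    """

    def __init__(self, max_workers=2, max_requests=3):
        self.max_workers = max_workers    # total workers allowed across users
        self.max_requests = max_requests  # cull a worker after this many requests
        self.workers = OrderedDict()      # user -> request count, oldest use first
        self.culled = []                  # (user, reason) log, for illustration

    def handle(self, user):
        """Route one request for `user`; return True if a cold start was needed."""
        cold = user not in self.workers
        if cold:
            if len(self.workers) >= self.max_workers:
                # Assumed policy: evict the least recently used worker.
                victim, _ = self.workers.popitem(last=False)
                self.culled.append((victim, "evicted"))
            self.workers[user] = 0
        self.workers[user] += 1
        self.workers.move_to_end(user)  # mark as most recently used
        if self.workers[user] >= self.max_requests:
            # Recycle after a fixed number of requests to bound memory growth.
            del self.workers[user]
            self.culled.append((user, "max_requests"))
        return cold


pool = WorkerPool(max_workers=2, max_requests=3)
starts = [pool.handle(u) for u in ["x", "x", "y", "z", "x"]]
print(starts, pool.culled)
```

Each `True` in `starts` is a "warm up" request; an idle user simply ends up with no entry in `workers`, which is the zero-overhead case.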

With threads, you simply don't have this kind of control. Dealing with dead threads and other nuisances is a real pain in the butt, and there's simply no good solution. At least with multiple processes you have a chance. There's always the Windows issue -- I really don't know what the process situation is like there. There's no fork, but is it unworkable to simply spawn processes the hard way?
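For what it's worth, spawning "the hard way" can be sketched without fork at all. The example below starts a fresh interpreter with `subprocess.Popen`, which does not depend on `fork` being available, and talks to it over pipes. The child program and the `spawn_worker` helper are made up for illustration; a real worker would speak some actual protocol instead of upper-casing lines.

```python
import subprocess
import sys

# Illustrative child program: echoes each stdin line back upper-cased.
CHILD_CODE = (
    "import sys\n"
    "for line in sys.stdin:\n"
    "    sys.stdout.write(line.upper())\n"
    "    sys.stdout.flush()\n"
)


def spawn_worker():
    """Start a fresh interpreter as a worker, with pipes for requests and replies."""
    return subprocess.Popen(
        [sys.executable, "-c", CHILD_CODE],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
    )


worker = spawn_worker()
# communicate() sends the input, closes stdin, and waits for the child to exit.
reply, _ = worker.communicate("hello\n", timeout=30)
print(reply.strip(), worker.returncode)
```

The cost is a full interpreter startup per worker instead of a cheap fork, so the "warm up" penalty is larger, but the parent keeps the same control: it can kill or replace the child at any time.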

It also seems like keeping it separate from Apache opens up a lot of flexibility -- if for no other reason than it's easier to do this work in Python than in C. (Though I suppose you could implement it in mod_python, even though mod_python wouldn't actually be running any user code... I'm not sure what the benefits would be, but it seems like an interesting strategy).

As I think more about this, I realize all of this is what FastCGI does. Every single feature I describe here, I think. Maybe what is needed is simply a really good, well-specified implementation of the full FastCGI feature set.

# Ian Bicking

re: FASTCGI

You would be better served to consider SCGI. Go to http://www.fastcgi.com/dist/ and check out the last-modified dates. The most recent stable release was Jan 19, 2003. The most recent SNAPSHOT is April 14, 2004. Contrast this with SCGI (last release Feb 2, 2006; see http://www.mems-exchange.org/software/scgi/scgi-1.10.tar.gz/scgi-1.10/CHANGES ). SCGI is vibrant and Python-centric. mod_scgi is robust and stable for Apache 1.3x and 2.0x. scgi_server.py is small, well-written, and easily extensible.

re: threads

You are absolutely right.

re: per-user processes

This is problematic. You will need a master process with root privileges to fork/setuid all these lower-privileged processes. This is a big security hole unless you do it exactly right. (Note the apparent dead end of the Apache 2 perchild MPM.) This program will be talking to the internet. Running as root is hubris. It would be better to serve all requests for all hosts as one unprivileged user.

re: PHP

I loathe PHP, but I believe that one of PHP's greatest virtues is as follows: Consider an Apache server with 200 virtual hosts. Each virtual host has 100 PHP scripts. The run-time cost of these 20,000 PHP scripts is only that of whatever pages the server is currently serving. If the request rate is 1 page/second or less, the load average is 0-0-0. It doesn't matter whether there are 100 virtual hosts or 1. It doesn't matter whether there are 20,000 distinct PHP scripts available or 1.

A Tomcat installation with 20,000 distinct .jsp pages, by comparison, would be in a swap storm while standing still.

Most Python web-application frameworks tend to favor the Tomcat model. They work great for complex apps on a dedicated machine.

Providing simple and robust dynamic web-pages for scores of unrelated virtual-hosts with low resource utilization, not so much.

# Christopher Mulcahy
