Ian Bicking: the old part of his blog

My Tripod of Web Testing

So, now that I'm using Selenium (well, planning to), I'm thinking about our all-around testing system. There's unit testing -- but that's not web-specific so I won't talk about that here.

Selenium seems good for acceptance and cross-browser testing. Hopefully with the right toolset even non-programmers can start building and running testing scripts on applications.

There's also a "Driven" mode for Selenium -- basically a process drives the Selenium Javascript code. I'm not 100% clear on how the communication is done, but anyway. If your Selenium test looks like:

open http://myhost.com/app  
click //a[text()='Next']  
verifyTextPresent Page 2  

Then your drive code might look like:

selenium.open('http://myhost.com/app')
selenium.click("//a[text()='Next']")
selenium.verifyTextPresent('Page 2')

And you could put that in unittest-based tests or whatever. I'm not that excited about that -- driving browsers on an unattended basis seems like a bit of a pain. Which perhaps is where Twill fits in -- in fact, Twill could even take something looking very much like Selenium's commands, and provide extra value like HTML validation, link checking, performance testing, etc., all in a more controlled environment. More than just a place for both, I see them working in concert.
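
To make the "same commands, different driver" idea concrete, here is a minimal sketch of a runner that reads Selenium-style command lines and dispatches each one to a driver object. Everything here is an assumption for illustration: `run_script` and `RecordingDriver` are hypothetical names, the recording driver only logs calls instead of touching a browser, and this is not how Selenium or Twill actually parse their scripts.

```python
class RecordingDriver:
    """Hypothetical stand-in for a real driver; it only records calls."""

    def __init__(self):
        self.calls = []

    def __getattr__(self, name):
        # Any unknown attribute becomes a command that logs its arguments.
        def command(*args):
            self.calls.append((name, args))
        return command


def run_script(script, driver):
    """Run one command per line: the command name, then its single argument."""
    for line in script.splitlines():
        line = line.strip()
        if not line:
            continue
        name, _, argument = line.partition(" ")
        getattr(driver, name)(argument.strip())


SCRIPT = """
open http://myhost.com/app
click //a[text()='Next']
verifyTextPresent Page 2
"""

driver = RecordingDriver()
run_script(SCRIPT, driver)
```

Any object with `open`, `click`, and `verifyTextPresent` methods could be passed in place of the recording driver, which is the property that would let a browser-driving backend and a browserless one share scripts.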

But two testing strategies don't provide a tripod of stability, and metaphorical stability is top of my list.

So the third testing strategy is to make sure the applications are happily humming away when they are live and no one is looking at them. I don't think it's reasonable to run acceptance tests on live applications -- too much valuable state. So instead I want to start adding self-diagnostic code to those applications. This code will check database connections, make sure files are all in place, etc. Even just a "yes I'm alive" response is a big improvement over nothing. Each of these will be a page, in an out-of-the-way location.
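
A self-diagnostic page along these lines could look roughly like the sketch below. It is only an illustration under stated assumptions: `sqlite3` stands in for whatever database the application really uses, the file list is a placeholder, the names `run_checks`, `CHECKS`, and `status_app` are hypothetical, and the page is written as a bare WSGI callable rather than a Webware or Zope page.

```python
import os
import sqlite3


def check_database(path=":memory:"):
    # Assumption: sqlite3 stands in for the application's real database.
    conn = sqlite3.connect(path)
    try:
        conn.execute("SELECT 1").fetchone()
    finally:
        conn.close()


def check_files(paths):
    # Fail if any file the application depends on has gone missing.
    missing = [p for p in paths if not os.path.exists(p)]
    if missing:
        raise RuntimeError("missing files: " + ", ".join(missing))


def run_checks(checks):
    """Run each (name, callable) pair; return (all_ok, report_lines)."""
    all_ok = True
    lines = []
    for name, check in checks:
        try:
            check()
            lines.append(f"OK {name}")
        except Exception as exc:
            all_ok = False
            lines.append(f"FAIL {name}: {exc}")
    return all_ok, lines


# Placeholder checks; a real application would list its own resources.
CHECKS = [
    ("database", check_database),
    ("files", lambda: check_files([os.curdir])),
]


def status_app(environ, start_response):
    """WSGI callable for the out-of-the-way status page."""
    ok, lines = run_checks(CHECKS)
    status = "200 OK" if ok else "500 Internal Server Error"
    body = ("\n".join(lines) + "\n").encode("utf-8")
    headers = [("Content-Type", "text/plain"),
               ("Content-Length", str(len(body)))]
    start_response(status, headers)
    return [body]
```

Returning a non-200 status when any check fails means a monitor only needs to look at the status code, while a human visiting the page still sees which check broke.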

Then we point a monitoring daemon at it. I like the look of Mon, but our sysadmin has experience with Big Brother. So I'm not sure exactly what we'll use (opinions welcome). Either way, I want all of our applications to be pinged on a regular basis, with notification of problems. This will turn into a total system check -- if the network is down, if Apache is down, or if our application server is down, we'll get a report.
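
Whatever daemon ends up doing the job, the core of the check is small. The following is a rough sketch using only the Python standard library, not a replacement for Mon or Big Brother; `fetch_status` and `poll_once` are hypothetical names, and treating anything other than HTTP 200 as a problem is an assumption about how the status pages behave.

```python
import urllib.error
import urllib.request


def fetch_status(url, timeout=10):
    """Return the HTTP status code for url; raise if the host is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        # The server answered, just not with a success code.
        return exc.code


def poll_once(urls, fetch=fetch_status):
    """Return a list of (url, problem) for every URL that is not healthy."""
    problems = []
    for url in urls:
        try:
            status = fetch(url)
        except Exception as exc:
            # Network down, web server down, DNS failure, timeout, ...
            problems.append((url, f"unreachable: {exc}"))
        else:
            if status != 200:
                problems.append((url, f"HTTP {status}"))
    return problems
```

A scheduler would call `poll_once` with the list of status-page URLs every few minutes and send a notification whenever the returned list is non-empty. The `fetch` parameter exists so the loop can be exercised without a network.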

But wait, that is not all! No tripod is complete without a fourth leg! The last part is handling all exceptions, no matter who triggers them. Webware can email all exceptions to you -- nice, and easy to configure. We have something ad hoc that does the same in Zope -- a bit ugly and not so easy (what's up with that?!), but it's there. I'm working on a refinement here as well; it won't apply to every system we have -- it can't be in place on a PHP application, for instance -- but it's an exception reporter that can be used outside of a web context, while providing all the contextual information that both Webware and Zope's error reports have. It can also work on background tasks, like cron jobs and scheduled tasks. Extra information (beyond the normal traceback) is important for post-mortem debugging, and so that no error -- no matter how small! -- will go forgotten.
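
As a rough illustration of an exception reporter that works outside a web context, the sketch below wraps any function (a cron job, say) so that an uncaught exception produces a report containing the traceback plus extra context. This is not the reporter described above, just one possible shape for the idea: `format_report` and `reported` are hypothetical names, `send` is whatever delivery function you supply (email, log file), and the context fields are placeholders.

```python
import functools
import os
import sys
import time
import traceback


def format_report(exc, context=None):
    """Build a plain-text report: context fields first, then the traceback."""
    details = {
        "time": time.strftime("%Y-%m-%d %H:%M:%S"),
        "pid": os.getpid(),
        "argv": " ".join(sys.argv),
    }
    details.update(context or {})
    header = "\n".join(f"{key}: {value}"
                       for key, value in sorted(details.items()))
    trace = "".join(
        traceback.format_exception(type(exc), exc, exc.__traceback__))
    return header + "\n\n" + trace


def reported(send, **context):
    """Decorator: pass a report to send() for any exception, then re-raise."""
    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            try:
                return func(*args, **kwargs)
            except Exception as exc:
                send(format_report(exc, context))
                raise
        return wrapper
    return decorate
```

Decorating a job with `@reported(send_mail, job="nightly-import")`, where `send_mail` is your own function, would report the failure and still let the exception propagate, so the job's exit status is unchanged.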

Created 05 Mar '05
Modified 16 Mar '05

Comments:

I'd suggest taking a hard, hard look at Nagios for monitoring. I'll admit to not using Mon since 2000, but the Nagios (well, it was called Netsaint then) installation that replaced Mon proved itself to be insanely more configurable, more scalable, and more flexible. Mon's probably gotten better in the last 4 years, but then so has Nagios.
# Petro

I concur with the previous poster about Nagios. I've been using it for a long time and it's extremely stable and non-obtrusive. It only sends mail when there really is something going on with the server. I'll send you a URL you can check out for yourself.

As for "holistic" testing for your Web site, I suggest a 5th leg :-) ... performance/load/stress testing. One approach that worked for me was to blast the Web site via some load tool (I've used siege, httperf, and more recently openload), while actively monitoring the servers (Web/DB) via either 1) command line utilities such as top, vmstat, iostat, etc. or 2) SNMP (I configured net-snmp on each server and then queried the servers for all kinds of things like memory, CPU, disk, number of processes etc.). This way you can see where to tweak your servers for better performance. Of course, you can also run your Web app through a profiler while this is going on, to detect bottlenecks in your code; same thing for your DB.

# Grig Gheorghiu

Actually a 6th leg is security testing. You can use a general-purpose security scanner such as nessus, or an HTTP-specific scanner such as N-Stealth, nikto, whisker and many others. In fact, you gave me an idea for a future post :-)
# Grig Gheorghiu

I use the term "operational testing" for the test suite that checks the health of the running app.
# Ken MacLeod