Ian Bicking: the old part of his blog

Web application acceptance testing

Grig Gheorghiu wrote up a description of using MaxQ to test web pages in Web app testing with Python part 1: MaxQ. I've recently been looking into web acceptance testing for my job, though my progress has stalled lately as I've been distracted with other things.

I haven't actually used MaxQ -- mostly because I find it hard to set up Java programs on my development systems. Sun's unfriendly attitude towards Free systems is incredibly stupid and shitty, but I digress...

Anyway, I also came upon MaxQ via Titus Brown, though indirectly: I was looking at PBP and saw that Titus had also written some code for MaxQ that generated PBP scripts instead of Jython/unittest-based scripts.

After writing some PBP scripts, I found it got boring and error-prone quickly. I think PBP-style scripts would be considerably easier to edit than Jython unittests, but MaxQ's recording feature is a better way to start the testing process. But like I said, I didn't want to install MaxQ...

Anyway, instead I installed a wonderful little Python program TCPWatch -- it's a small HTTP proxy that shows you the exact stuff that goes over the wire (good for learning and debugging lower-level HTTP stuff), and also has an option to record the transactions. It just dumps every request and response in a directory.
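
Conceptually it's a simple trick. Here's a sketch of the idea -- not TCPWatch's actual code, and the file naming is my own invention -- as a forwarding proxy that records GET transactions to a directory:

import os
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

RECORD_DIR = "recording"  # hypothetical output directory

class RecordingProxy(BaseHTTPRequestHandler):
    counter = 0

    def do_GET(self):
        RecordingProxy.counter += 1
        n = RecordingProxy.counter
        # Dump the raw request line and headers, exactly as received.
        with open(os.path.join(RECORD_DIR, "%04d.request" % n), "w") as f:
            f.write(self.requestline + "\n" + str(self.headers))
        # In a proxy-style request, self.path is the full upstream URL.
        with urllib.request.urlopen(self.path) as upstream:
            body = upstream.read()
            status = upstream.status
            headers = upstream.getheaders()
        # Dump the response body, then relay everything to the client.
        with open(os.path.join(RECORD_DIR, "%04d.response" % n), "wb") as f:
            f.write(body)
        self.send_response(status)
        for name, value in headers:
            if name.lower() not in ("transfer-encoding", "content-length", "connection"):
                self.send_header(name, value)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

os.makedirs(RECORD_DIR, exist_ok=True)
HTTPServer(("localhost", 8080), RecordingProxy).serve_forever()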

This suited me nicely, because I could get an accurate recording without having to worry about the accuracy of my "interpretation" of the recording. The PBP or Jython script is a heuristic of sorts -- an attempt to abstract out the important parts of the transaction, not simply an exact recording. So you might see that the user went to /show_items first, then /view?id=1 second. But it's smarter if you figure out that they went to /show_items and then followed the first link labelled view. That translation isn't guaranteed correct, though; it's just a best effort -- while tcpwatch is pretty much "correct", in that it doesn't interpret or add any commentary to its recording.
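
To make that concrete, here's a toy version of the "interpretation" step (all the names are mine, and a real generator would use a proper HTML parser rather than a regex):

import re

def interpret(transactions):
    """Turn a recorded list of (url, response_body) pairs into
    browser-level steps, preferring link-following over raw URLs."""
    steps = ["b.open(%r)" % transactions[0][0]]
    for (_, prev_body), (url, _) in zip(transactions, transactions[1:]):
        # Did the next URL appear as a link on the previous page?
        links = re.findall(r'<a\s+href="([^"]+)"[^>]*>(.*?)</a>', prev_body, re.I)
        for href, label in links:
            if href == url:
                steps.append("b.follow_link(text=%r)" % label.strip())
                break
        else:
            # No matching link: fall back to the literal recording.
            steps.append("b.open(%r)" % url)
    return steps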

Anyway, I had issues with PBP, and I don't know if it's really the best thing for what I want. While interactive use is useful and important (I like the basic feel), you also need good support for rerunning scripts, getting concise and accurate output from those automated runs, and rerunning or undoing portions of a script as you adapt it when the application changes. Now I'm thinking that ipython offers a good compromise between ease of interactive use and translatability to a more automated environment.

TCPWatch could also use a couple more features. The only real problem I found with the core was an inability to proxy HTTPS, but I imagine that's hard (since making that hard is the whole point of HTTPS). There were some other problems -- for instance, I couldn't for the life of me parse gzip responses from Google and other sources -- the gzip module barfed on them no matter what I did; I think it's pickier than other gzip implementations. So it would be nice if TCPWatch would correct the request by removing the Accept-Encoding: gzip header. It would also be nice to extend the Tk frontend so you could annotate the requests (probably just a textbox that would produce a text file alongside the rest of the recording). Then you could paste in strings that were important to you (and that the generated test script should check for), or otherwise make notes for the script generator to use. These features would probably be easy to add -- TCPWatch is delightfully small, at just under 1000 lines of code, and my first run at a script generator was just 300 lines, with an abbreviated but reasonable feature set.
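
One thing that might be worth trying on the gzip front: the zlib module can be told to expect gzip framing directly, which sidesteps the file-oriented gzip module entirely. A sketch:

import zlib

def decode_gzip_body(body):
    # wbits = 16 + MAX_WBITS tells zlib to expect a gzip header and
    # trailer rather than a raw zlib stream.
    d = zlib.decompressobj(16 + zlib.MAX_WBITS)
    return d.decompress(body) + d.flush()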

I hope to be able to revisit this stack soon as a testing framework: TCPWatch, Mechanize, ipython, and maybe unittest or py.test (or maybe even doctest). But if anyone else finds this idea interesting and wants to kick off an effort, I'm more than happy to jump on board.

Created 25 Feb '05
Modified 16 Mar '05

Comments:

The Zope 3 project uses something called "functional doctests". The idea is that you run tcpwatch and use your browser to create a functional test for your Zope 3 application. Then you run a small Python script that converts the record files produced by tcpwatch into .txt files containing embedded doctests, and you include those files in your functional test suite -- roughly like the sketch below.
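
This isn't the actual Zope 3 script, just a sketch of the conversion it performs (the record-file naming is assumed):

import glob

def records_to_doctest(record_dir):
    out = []
    for req_file in sorted(glob.glob(record_dir + "/*.request")):
        request = open(req_file).read()
        response = open(req_file.replace(".request", ".response")).read()
        # Each transaction becomes a call to the http() doctest helper,
        # with the recorded response as the expected output.
        out.append('>>> print http(r"""')
        out.extend("... " + line for line in request.splitlines())
        out.append('... """)')
        out.extend(response.splitlines())
        out.append("")
    return "\n".join(out)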

We started using functional doctests for the SchoolTool/SchoolBell project. Here's an example test.

By the way, I've found it easier to write new tests with vim by copying and modifying existing tests, rather than starting up tcpwatch and doing it with a browser, but I suppose that depends on how geeky you are...

# Marius Gedminas

Cool, that's just the kind of thing I was thinking of. It seems a tad crude -- what with the raw requests right in the code. It seems like with a little care (plus something based on mechanize's stateful browser object) you could improve the narrative quality nicely. For instance (from your example):

>>> print http("""
... POST /frogpond/resources/lily/edit.html HTTP/1.1
... Authorization: Basic mgr:mgrpw
... Content-Length: 29
... Content-Type: application/x-www-form-urlencoded
...
... field.title=Foo&CANCEL=Cancel
... """)
HTTP/1.1 303 See Other
...
Location: http://localhost/frogpond/resources/lily
...

Might instead be:

>>> b.click('CANCEL')
>>> b.response.status
303 See Other
>>> b.response.headers.location
'http://localhost/frogpond/resources/lily'

Note that b (the browser object) specifically has knowledge of the context and the last request, so in this case you are submitting a form that you just received. I think this statefulness better represents the thing you are testing, plus doctests are sequential and narrative anyway -- might as well take advantage of it. In general I've thought that the Zope 3 doctests I've seen seemed like they didn't have enough scaffolding; there should be more helper functions written solely for the purpose of doctesting.
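
Here's a sketch of the kind of scaffolding I have in mind, on top of mechanize's stateful Browser (the wrapper and its method names are hypothetical, not mechanize's own API):

import mechanize

class DoctestBrowser:
    """Hypothetical doctest scaffolding over mechanize's stateful Browser."""

    def __init__(self):
        self._b = mechanize.Browser()
        self.response = None

    def open(self, url):
        self.response = self._b.open(url)
        return self.response

    def click(self, name):
        # mechanize remembers the page we are on, so submitting the form
        # we just received needs no URL at all.
        self._b.select_form(nr=0)
        self.response = self._b.submit(name=name)
        return self.response

    @property
    def status(self):
        # Note: mechanize follows redirects by default, so after a 303
        # this reports the final response; seeing the 303 itself would
        # take set_handle_redirect(False) and catching the HTTPError.
        return self.response.code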

Anyway, it's still cool that it's useful and works for you. The XML doctest extension I wrote might also be useful for this sort of thing -- though it still needs wildcards.

# Ian Bicking

Due to some code restructuring in SchoolTool/SchoolBell, the example test link is now broken. Here's a working link to the same example test. This time I included a revision number as well, so this link should be permanent.

# Marius Gedminas

I would note that the kind of testing you're referring to (presumably making sure nothing is broken) is not what one might think of as "acceptance testing". To me, acceptance testing is when, after you write the program, the user makes sure it is fast enough, usable enough, and actually solves the problem -- which is larger in scope than automated testing to find bugs.

At my work, we've considered this sort of automated testing for some time, but there is one increasingly important part of web development that these tools almost completely ignore: JavaScript. What would be nice is a Firefox plugin for acceptance testing that works with the JavaScript interpreter -- almost like an old-school macro recorder. Unfortunately, that brings up the issue of testing across browsers.

Ultimately, for my money, on normal-size apps there's no true replacement for spending a few hours each release trying everything on your site from a to-do list. We each do that with separate browsers and usually end up catching most problems.

-Ken

# Ken Kinder

True... I tend to use "acceptance testing" for everything that isn't "unit testing", but that's probably not very accurate; functional testing or system testing is probably a better term. Though Grig pointed me to this: http://atomicobject.com/systir.page (a Ruby test system), which really is about acceptance tests (specifying requirements at a fairly low, but not obscenely low, level), and that could fit into this model. It's not a comprehensive acceptance test, but it isn't too far off. Automated acceptance testing is possible -- FIT is an example of that sort of thing.

Selenium or Pamie are options for Javascript-driven tests (from Grig's article). John Lee also has Javascript on the to-do list for his stack (ClientForm, ClientCookie, Mechanize, etc.) -- using Mozilla's interpreter or something, I think. But who knows about that...

Generally I do manual Javascript testing, but that still leaves a place for automated tests. By avoiding dynamic Javascript and making sure your app degrades properly in the absence of Javascript, you can avoid regressions in your Javascript and make the application accessible to an automated testing tool.

Also, by doing this as a script you can ignore what a sensible user would do entirely -- you can jump to another page based on any calculation you choose, possibly derived from regular-expression-based parsing or whatever works. You aren't restricted to following links and filling in forms that appear on the page.
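
For example (a sketch; the URLs are the made-up ones from earlier in the post):

import re
import urllib.request

# Fetch a page and compute the next URL instead of following a link.
page = urllib.request.urlopen("http://localhost/show_items").read().decode()
item_id = max(int(m) for m in re.findall(r"id=(\d+)", page))
urllib.request.urlopen("http://localhost/view?id=%d" % item_id)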

# Ian Bicking

Have you seen webunit? It's not the nicest API, but it works (I've used it at three employers now).

Sure beats raw HTTP requests...

# Richard Jones

I have looked at it before, but my natural laziness with respect to functional testing kept me from using it. PBP was the first thing that really felt right. Though now I'm also thinking positively about doctest, and I can imagine webunit and doctest being a good combination (maybe also building on the XML doctest extension I mentioned earlier: https://ianbicking.org/xml-and-doctest.html).

I guess webunit and Mechanize have similar scopes, but I haven't really thought about how they compare.

# Ian Bicking