Ian Bicking: the old part of his blog

Firsttryatgenericfunctions comment 000

Regarding the benchmark, I tried it out this evening and noticed a couple of things. First, the timing figures you quote above show a 23.4X slowdown, not 30X.

Second, I sped up the test by a factor of three by changing one rather simple thing. Ian's rule checking for hasattr(obj,"__json__") is unnecessary; I simply made jsonify_explicit the default and let it raise an error otherwise, since that's what happens anyway. (Making the same change to your version doesn't result in as much of a speedup, but I did it anyway so that it's an apples to apples comparison.)

Also, by doing a bit of profiling, I saw that much of the remaining difference in time comes from the fact that RuleDispatch's equivalent to isinstance() is still in Python, not C. So, I spent another 30 minutes or so whipping up a Pyrex version, and thus sped up the test by another factor of 2.

My final stats, having modified both your version and Ian's version to have the simpler logic, and using the C isinstance code, are 1.58 seconds for your hand-tuned version, vs. 2.88 seconds for the generic function version. (I used time.clock(), which is more accurate for timing purposes, and for both measurements I took the best of 3 runs.) So, the end result is that the generic function version takes about 1.37 times as long to run, which is not bad for stacking up against your hand-optimized code.

But this still isn't a fair comparison, since you proposed to extend the function via monkeypatching. So, I tested a simple version using only one level of monkeypatching:

def jsonify(obj):
    # @@: Should this match all iterables?
    if isinstance(obj, (list, tuple)):
        return jsonify_list(obj)
    return jsonify2(obj)

def jsonify2(obj):
    if isinstance(obj, (int, float, str, unicode)):
        return jsonify_simple(obj)
    elif isinstance(obj, dict):
        return jsonify_dict(obj)
    elif isinstance(obj, sqlobject.SQLObject):
        return jsonify_sqlobject(obj)
    elif isinstance(obj, sqlobject.SQLObject.SelectResultsClass):
        return jsonify_select_results(obj)
    return jsonify_explicit(obj)

This code runs in 2.91 seconds (best of 3), so it's already at breakeven compared to the generic function! So, I then added a Decimal case wrapper to both scripts, and reran the timings. Result: the monkeypatching version takes 3.88 seconds (best of 3), and the generic function version took 2.88 seconds - exactly the same as the case without Decimal. (I did not modify the test to actually include any Decimal values, but of course the monkeypatching approach slowed down anyway.)

Anyway, it appears that as soon as you monkeypatch more than once or twice for something like this, generic functions will start coming out ahead. And if they don't, you can always post the results on a blog somewhere and I'll tune up the implementation so that they do come out ahead. ;)

(The C speedup for isinstance() checks is now in the RuleDispatch SVN version, by the way.)

Comment on Re: Firsttryatgenericfunctions comment 000
by Phillip J. Eby

Comments:

I am glad you have sped up your generic functions. If they can be within 5-10x of the if/else method, generic functions become much more worthwhile imho. At 1.3x the speed they become awesome!

Can you please post the code to your changes so that I can verify them?

btw, my version was not 'hand optimized'. I copy-pasted the rules from the other version. Your version is hand optimized... changing the rules, and adding a C version for some rules. By changing the order of the __json__ changed the behaviour of the code. Possibly breaking it in circumstances(when an sql object sqlobject.SQLObject.SelectResultsClass has a __json__ method).

Or did I get the way the rules work wrong? I assumed they would be tested from first declared to last declared.

Is there a way you can make that particular type of rule faster? That is the hasattr(obj, "__json__") type condition. It would seem to be a very common one for this use with types. I have seen it used elsewhere eg, __str__, __html__, etc. Or maybe that is not why the original generic function was slower.

The interesting thing to note here, is that the slowness of one condition had more effect on the generic function version than the if/else version. Is that right? I can't understand why that condition accounts for such a slowdown with generic functions.

To be clear, my proposed approach is to keep things in one file, with the rules in the one place. Only if absolutely necessary do you override functionality in different files. I think it is better to refactor than to patch with conditions hidden in different files.

I also see the useful things of generic functions for when there are lots of rules spread across lots of files you may not own.

If you can get them as fast or faster than if/else rules, then I can definitely see myself using them for game programming. Especially if you can get it to work with psyco.

ps. Are you aiming to have these in python at some point? Is it not released yet? I can't find the documentation for it on your page. I did find http://peak.telecommunity.com/PyCon05Talk/ though. Pretty good that I could figure it out(I think) without documentation, and only a blog post to go by.

# Rene Dudfield