Ian Bicking: the old part of his blog

Firsttryatgenericfunctions comment 000

Would you mind using reStructuredText for your comments? Whatever you're doing instead is very hard to read.

You seem to be missing the point, by the way. If a library is trying to provide an extensible function, it's silly to make everyone monkeypatch it and disallow anybody from importing it. And, with the linear reduction in speed caused by monkeypatching, there will sooner or later be a breakeven point where the generic function is faster.

Right now, very little of RuleDispatch is in C, so reaching that breakeven point might take several people adding their own type lookups. But as time goes on and RuleDispatch gets faster, the breakeven point will come sooner. Also, with tests more expensive than simple isinstance() checks, the repeated checking will reach the breakeven point faster. And import time won't matter for code that uses dozens of generic functions, not just one.
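
To see where the linear slowdown comes from, here is a sketch of extending a function by monkeypatching; all the names are made up for illustration, not taken from any actual library. Each extender wraps whatever function it found, so every call walks the whole chain of wrappers before reaching the base implementation::

    # Illustrative only: extending a function by monkeypatching.
    def jsonify(obj):
        # stand-in base implementation
        if isinstance(obj, (int, float, str)):
            return obj
        raise TypeError("can't jsonify %r" % (obj,))

    # Extender A, in some other package:
    _prev_a = jsonify
    def _jsonify_a(obj):
        if isinstance(obj, set):
            return sorted(obj)
        return _prev_a(obj)      # fall through to the older version
    jsonify = _jsonify_a

    # Extender B, in yet another package:
    _prev_b = jsonify
    def _jsonify_b(obj):
        if isinstance(obj, complex):
            return [obj.real, obj.imag]
        return _prev_b(obj)      # now two hops before an int is handled
    jsonify = _jsonify_b

With N such extensions, even jsonify(1) performs N failed isinstance() checks before reaching the base implementation, while an indexed generic function's lookup cost stays roughly constant in the number of rules.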

You also are wrong about not needing generic functions in cases where you can change the source. There are still reasons why you'd use them:

  1. Writing optimal if-then trees for large and complex rule sets is difficult to do correctly.
  2. A collection of rules is more likely to have a 1-to-1 correspondence with business rules or requirements, and is therefore easier to track and verify.
  3. An application that gets customized rules for different customers needs a modularly extensible rule system that doesn't require all the rules to live in a single hand-tuned tree (see the sketch after this list).
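
A minimal sketch of the kind of modular rule system meant here. The registry below is hypothetical, written from scratch for illustration; it is not RuleDispatch's actual API::

    # Hypothetical predicate-based rule registry (not RuleDispatch).
    # Rules can be registered from any module, one per business rule,
    # instead of being merged into a single hand-tuned if/elif tree.
    _rules = []

    def rule(predicate):
        # register a handler that applies when predicate(obj) is true
        def register(handler):
            _rules.append((predicate, handler))
            return handler
        return register

    def apply_rules(obj):
        # naive linear scan; a real engine would compile an indexed
        # decision tree from the predicates
        for predicate, handler in _rules:
            if predicate(obj):
                return handler(obj)
        raise LookupError("no rule matches %r" % (obj,))

    @rule(lambda order: order.total > 1000)
    def big_order_discount(order):
        return order.total * 0.9

    @rule(lambda order: order.total <= 1000)
    def regular_price(order):
        return order.total

    class Order(object):
        def __init__(self, total):
            self.total = total

    print(apply_rules(Order(1500)))   # -> 1350.0

The point of an engine like RuleDispatch is to take exactly this kind of rule collection and compile it into an efficient decision tree automatically, which is the part that's hard to hand-write.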

In other words, generic functions are useful for the same reason Python is more useful than assembly language. You can always write the same code in assembly, but it's easier with a high-level specification that's automatically translated. The key is simply that the performance be "good enough", and for complex code with expensive tests, generic functions can already beat a human at optimization, in terms of the time it takes to produce a correct and efficient decision tree.

Eventually, they'll beat even your example here, which by the way is a bit like saying, "see, I can write C code that runs faster than Python". Well, duh. Try the test with a dozen extra types added by monkeypatching (not by-hand inlining as shown), and see what the speed difference is. I don't know what the breakeven point for this example is, but I can guarantee there'll be one. Past that point, monkeypatching will keep getting slower and slower, but the generic function will continue to have basically the same performance. And the more complex your ruleset and tests, the sooner that point will be reached.
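
For concreteness, a sketch of what "a dozen extra types added by monkeypatching" looks like; everything here is hypothetical stand-in code, not the actual benchmark::

    # Simulate twelve independent packages each monkeypatching jsonify.
    def jsonify(obj):
        return str(obj)              # stand-in base implementation

    def make_wrapper(cls, prev):
        def wrapper(obj):
            if isinstance(obj, cls):
                return {"type": cls.__name__}
            return prev(obj)         # delegate down the chain
        return wrapper

    extra_types = [type("Extra%d" % i, (), {}) for i in range(12)]
    for cls in extra_types:
        jsonify = make_wrapper(cls, jsonify)

    # A value handled by the base implementation now pays for all
    # twelve failed isinstance() checks first:
    print(jsonify(42))               # -> '42'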

Comment on Re: Firsttryatgenericfunctions comment 000
by Phillip J. Eby

Comments:

'''You seem to be missing the point, by the way. If a library is trying to provide an extensible function, it's silly to make everyone monkeypatch it and disallow anybody from importing it. And, with the linear reduction in speed caused by monkeypatching, there will sooner or later be a breakeven point where the generic function is faster.'''

No. Other modules can still import and use a function after you change it at runtime.
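
A sketch of that point, using a stand-in module built with types.ModuleType so the example is self-contained; in real code it would be an ordinary imported module::

    import types

    # Stand-in for an ordinary module containing the function:
    somemodule = types.ModuleType("somemodule")
    somemodule.greet = lambda: "hello"

    # Change it at runtime:
    _original = somemodule.greet
    somemodule.greet = lambda: _original() + "!"

    # Every caller that goes through the module attribute sees the
    # new version:
    print(somemodule.greet())   # -> hello!

The caveat is that code which did "from somemodule import greet" before the patch still holds a reference to the old function.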

'''Eventually, they'll beat even your example here, which by the way is a bit like saying, "see, I can write C code that runs faster than Python". Well, duh. Try the test with a dozen extra types added by monkeypatching (not by-hand inlining as shown), and see what the speed difference is.'''

Performance for generic functions seems to be slower until you have 40 different source files for if/else. However, that's 40 source files, not 40 different types. If you put the rules in fewer source files, if/else is much quicker still. From my tests, the breakeven point when the rules are all in the same file would be about 1000 rules. With Psyco, the breakeven point for if/else would be around 120 source files... not that I tested it with 120, just that the performance behaviour at 40 is consistent with that.
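
For reference, the shape of such a measurement with timeit; the two functions below are stand-ins for the real benchmark code, not taken from it::

    import timeit

    def flat_ifelse(obj):
        # hand-written chain, as if all rules lived in one source file
        if isinstance(obj, int):
            return "int"
        elif isinstance(obj, str):
            return "str"
        elif isinstance(obj, list):
            return "list"
        return "other"

    _table = {int: "int", str: "str", list: "list"}
    def table_dispatch(obj):
        # single lookup, the shape an indexed generic function aims for
        return _table.get(type(obj), "other")

    for fn in (flat_ifelse, table_dispatch):
        t = timeit.timeit(lambda: fn([1, 2]), number=1000000)
        print(fn.__name__, round(t, 3))

The if/else chain's cost grows with the number of rules it has to fall through, while the table lookup stays flat, which is why the breakeven point moves with rule count.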

So generic functions need to be optimized for the cases where you do not have 40 different source files; otherwise the claim of better performance cannot be made. I can't think of a real-life example where people actually have 40 different source files for these kinds of rules, but I'm sure there are cases. The other place where generic functions are faster is when you have lots of duplicated slow rules.

For simplicity of reading, sit ten random Python programmers in front of the two examples, and see which one is understood more easily.

For simplicity of debugging, try debugging your C code compared to some if/else rules written in Python.

I think that if people understand generic functions, and there are lots of rules that cannot be kept in the same place, then generic functions would be easier to maintain.

Thanks for pointing out why generic functions are useful. It has been interesting playing around with them. I hope I have helped you see some reasons why they are not useful.

# Rene Dudfield

"""Performance for generic functions seems to be slower until you have 40 different source files for if/else."""

Not true - see my test results below. Adding just two monkeypatches to your hand-tuned jsonify makes your benchmark 50% slower than the generic function version.

# Phillip J. Eby

Oops. I meant 35% slower. I was looking at 2.xx vs 3.xx without including the 'xx's in the division. :)

# Phillip J. Eby