Ian Bicking: the old part of his blog

Re: Reducing boilerplate code in __init__

These ideas about monkeying with __init__ are Just Plain Wrong, because __init__ is the source of the problem in the first place. The existence of __init__ is largely a hack to work around the absence of descriptors in earlier versions of Python, so compounding the hack with more hackery is a bad idea.

If you are defining attributes, the sane place to do so is in the class body, not in an __init__ signature. Your "class Foo" is the One Obvious Way to do it, defining attributes using default values in the class, or properties, or other kinds of descriptors. But don't define __init__ methods in your subclasses, just define the attributes. This is the One True Way to solve this problem, because trying to "fix" __init__ will get you nowhere: the very idea of __init__ is what's broken. Pinning wings on the pig won't make it fly.

All of the problems that make you discard this solution are actually quite trivial to fix, compared to trying to "fix" __init__. For example, mutable defaults are easily handled with a descriptor that creates a new mutable value when its __get__ is called and no value is defined. In PEAK, for example, bar = binding.Make(list) defines a bar property that will be a new list object when first accessed, unless you set it to something else via the constructor or some other way. Such descriptors are easily created, and far less magical than stack-tampering decorators.
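To make that concrete, here is a minimal sketch of such a descriptor. It is not PEAK's implementation: `make` is a hypothetical name, the keyword-setting `__init__` is assumed, and it leans on `__set_name__` (Python 3.6+) so the descriptor knows which attribute it is bound to.

```python
class make:
    """Hypothetical non-data descriptor: builds a fresh default on first access."""

    def __init__(self, factory):
        self.factory = factory

    def __set_name__(self, owner, name):
        # Python 3.6+ tells the descriptor which attribute name it was bound to.
        self.name = name

    def __get__(self, obj, objtype=None):
        if obj is None:  # accessed on the class itself
            return self
        value = self.factory()
        # Cache in the instance dict; this descriptor defines no __set__,
        # so the cached value shadows it on every later lookup.
        obj.__dict__[self.name] = value
        return value


class Foo:
    bar = make(list)

    def __init__(self, **kw):  # assumed keyword-setting constructor
        for name, value in kw.items():
            setattr(self, name, value)


a, b = Foo(), Foo()
a.bar.append(1)           # only a's list changes
explicit = Foo(bar=[9])   # an explicit value wins over the factory
```

Each instance gets its own list, and nothing is created for instances that never touch `bar`.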

Similarly, positional arguments are also easy. You should never have more than a few of those anyway, so just write an __init__ that calls the base __init__ with them as keywords. Again, it's much simpler and more straightforward than the alternatives.
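A rough sketch of that pattern, assuming a base class whose `__init__` takes only keywords and sets each one as an attribute (`Base` and `Point` are made-up names for illustration):

```python
class Base:
    """Assumed base class: accepts only keywords and sets each as an attribute."""

    def __init__(self, **kw):
        for name, value in kw.items():
            setattr(self, name, value)


class Point(Base):
    # Attributes and their defaults live in the class body.
    x = 0
    y = 0
    label = None

    def __init__(self, x, y, **kw):
        # The positional arguments are simply forwarded as keywords.
        super().__init__(x=x, y=y, **kw)


p = Point(1, 2, label="corner")
```

The subclass `__init__` does no attribute setup of its own; it only maps positions to names.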

I wish that the object base class provided this behavior; it sure would make it the "obvious" way to do things. Unfortunately, I don't think it can be added without breaking code that works now, so it likely won't be until Python 3.0 that it'll appear, assuming that Guido agrees this is the One Obvious Way. (Jython already believes this, and interprets keyword arguments to Java object constructors as a request to set properties on the created object.)

Comment on Reducing boilerplate code in __init__
by Phillip J. Eby

Comments:

I'm not sure exactly what you are proposing or backing; if you want to, say, follow this up with a blog post of your own on the issue, I'm sure that would be interesting.

Anyway, I think what you mean is that the default __init__ should always act like this:

class Foo(object):
    def __init__(self, **kw):
        for name, value in kw.items():
            setattr(self, name, value)

And, of course, self.__dict__.update(kw) is a horrible hack that bypasses descriptors and shouldn't be seen as a decent way to implement any of this (I note this because it's often offered up -- an example of one of the Bad Ideas about how to deal with this problem).
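To see why the `__dict__` shortcut is a problem, consider a small sketch with a property that normalizes its value (the class names here are invented for the example):

```python
class Temperature:
    def __init__(self, **kw):
        for name, value in kw.items():
            setattr(self, name, value)  # goes through descriptors

    @property
    def celsius(self):
        return self._celsius

    @celsius.setter
    def celsius(self, value):
        self._celsius = float(value)  # normalizes on the way in


class SloppyTemperature(Temperature):
    def __init__(self, **kw):
        self.__dict__.update(kw)  # writes straight into the instance dict


good = Temperature(celsius="21")       # setter runs, stores 21.0 in _celsius
bad = SloppyTemperature(celsius="21")  # setter never runs; _celsius is never set
```

With `setattr` the property's setter runs. With `__dict__.update` the raw string lands in the instance dict, where the property (a data descriptor) hides it, so reading `bad.celsius` raises `AttributeError`.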

Of course, there are many places where you really need clever descriptors to make this work right, if you propose that it replace all of __init__ (at least all the attribute-setup portion of that function -- it's overboard to say that all possible initialization code could be removed).

For an instance of where this becomes complicated, I was just writing some code today:

class Site(object):
    def __init__(self, htdocs, conf_dir=None):
        self.htdocs = htdocs
        self.conf_dir = conf_dir or htdocs

I.e., conf_dir defaults to htdocs, and of course you can't resolve that default in the class definition. So you need a special descriptor that forwards .conf_dir to .htdocs when you haven't given an explicit conf_dir. This is slightly different semantics, but in most cases it's equivalent. I'd actually be surprised if PEAK didn't already have some binding option to do this. But some patterns of initialization won't already exist as bindings. So when that happens you'd have to write something like this:

class forwarding_attr(object):
    def __init__(self, forwarded): # I want a positional argument here,
                                   # so I need __init__ anyway!
        self.forwarded = forwarded
        self.private_attr = forwarded + '_explicit'
    def __get__(self, obj, type=None):
        if obj is None:  # accessed on the class, not an instance
            return self
        return getattr(obj, self.private_attr, getattr(obj, self.forwarded))
    def __set__(self, obj, value):
        setattr(obj, self.private_attr, value)

class Site(object):
    htdocs = Attr()  # Attr stands for some plain attribute descriptor; not defined here
    conf_dir = forwarding_attr('htdocs')

The private_attr thing is dumb, and I wrote about that before -- I think PEAK has an answer to that, but there's no convention for it, and it regularly annoys me.

Anyway, I'll know what forwarding_attr is. And I can describe it easily enough in a docstring or whatever. But it's not transparent; I have a hard time justifying how it is better than self.conf_dir = conf_dir or htdocs. It's not even significantly shorter, though it is declarative and introspectable.

I think this sort of thing would appeal to me if I committed to the style. But I have a hard time committing. I also have a hard time asking other people to commit to this style -- and I am highly reluctant to commit to something that I'm not willing to advocate that everyone use. And everyone has to understand this, because it's not a bit of implementation you can encapsulate -- it's a whole different way of thinking about object instantiation, and affects the language significantly. So, these are the reasons that high-level declarative extensions to Python worry me, even though I understand the motivation, and I can certainly see how it could be useful.

# Ian Bicking

"""This is slightly different semantics, but in most cases it's equivalent. I'd actually be surprised if PEAK didn't already have some binding option to do this."""

Yep. conf_dir = binding.Obtain('htdocs') is the simplest way to do what you described in PEAK; if conf_dir is used without being set, it will take on the value in htdocs at the moment it is looked up (and stay set to that value thereafter, unless changed). There's no private_attr because PEAK is smart enough to be able to ensure that the descriptors know their attribute names. If conf_dir is set directly or via the constructor, then it won't try to "obtain" the htdocs attribute.
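The behavior described here can be approximated in plain Python. The sketch below is not PEAK's code: `obtain` is a hypothetical stand-in that follows the semantics described in this comment, and it uses `__set_name__` (Python 3.6+) in place of whatever mechanism PEAK uses to learn attribute names.

```python
class obtain:
    """Hypothetical descriptor: default to another attribute, cached on first use."""

    def __init__(self, source):
        self.source = source

    def __set_name__(self, owner, name):
        self.name = name  # the descriptor knows its own name; no private_attr

    def __get__(self, obj, objtype=None):
        if obj is None:  # accessed on the class itself
            return self
        value = getattr(obj, self.source)
        obj.__dict__[self.name] = value  # stays set from now on
        return value


class Site:
    conf_dir = obtain("htdocs")

    def __init__(self, **kw):  # assumed keyword-setting constructor
        for name, value in kw.items():
            setattr(self, name, value)


site = Site(htdocs="/var/www")
first = site.conf_dir      # looked up now, so it takes htdocs' current value
site.htdocs = "/srv/www"   # later changes to htdocs no longer affect conf_dir
explicit = Site(htdocs="/var/www", conf_dir="/etc/site")
```

Because the sketch defines no `__set__`, a value set directly or through the constructor lands in the instance dict and the descriptor is never consulted for that instance.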

"""it's overboard to say that all possible initialization code could be removed"""

Maybe. Maybe not. Fewer than 10% of PEAK's classes even have an __init__ method, and most of those are in very "low level" classes. If you look at "component" and "entity" classes only (high-level constructs used to model services or business objects), you'll find almost no __init__ methods at all. (And I only say "almost" so that I won't look silly if somebody happens to find one or two out of PEAK's 2200+ classes.)

"""it's a whole different way of thinking about object instantiation, and affects the language significantly. So, these are the reasons that high-level declarative extensions to Python worry me, even though I understand the motivation, and I can certainly see how it could be useful."""

I can understand that. In a sense, something like this "wants" to be part of the language; you should really be able to do the equivalents of binding.Make, binding.Obtain, and binding.Delegate with syntax or builtins. But, if what you're looking for is the best technical solution to the problem, PEAK's descriptor patterns are the way to go, whether you use PEAK's implementation or not. If you're trying to deal with the educational/political issues, then you should just write the boilerplate __init__ methods and quit trying to "fix" them, because the fixes will just have a different set of educational/political issues, in addition to being a technically inferior solution.

# Phillip J. Eby