Ian Bicking: the old part of his blog

Reducing boilerplate code in __init__

In Dr. Dobb's Python-URL I came upon this thread proposing a change to Python (either syntax or builtin) to facilitate this general pattern:

class Foo(object):
    def __init__(self, a, b=5, c=None):
        self.a = a
        self.b = b
        self.c = c or {}

There's two proposals, basically. One is syntax:

class Foo(object):
    def __init__(self, .a, .b=5, c=None):
        self.c = c or {}

By putting (that very little character) . in front of an argument, you imply that it should be assigned to self. The other proposal is a builtin:

class Foo(object):
    def __init__(self, a, b=5, c=None):
        adapt_init_args(self, locals())
        # reassignment fixup:
        self.c = c or {}

Using frame hacks, you could easily allow adapt_init_args() to work without arguments. Someone additionally noted a recipe where putting _ in front of a and b hints to adapt_init_args which arguments should become instance variables (though that messes up the signature).
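adapt_init_args is not an existing builtin; a minimal sketch of the explicit (no-frame-hacks) variant might be:

```python
def adapt_init_args(obj, args):
    # 'args' is expected to be the locals() of __init__;
    # assign everything except 'self' as an instance attribute
    for name, value in args.items():
        if name != 'self':
            setattr(obj, name, value)

class Foo(object):
    def __init__(self, a, b=5, c=None):
        adapt_init_args(self, locals())
        # reassignment fixup:
        self.c = c or {}

foo = Foo(1)
```

After this, foo.a is 1, foo.b is 5, and foo.c is a fresh empty dict.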

There's lots of ways to do this. For instance, I regularly use:

class Foo(object):
    def __init__(self, **kw):
        for name, value in kw.items():
            setattr(self, name, value)

But that can allow typos too easily. So sometimes I set defaults as class variables, test with hasattr(), and raise an error if an extraneous variable is found. It also doesn't allow for positional arguments. Or mutable defaults -- since the defaults end up as class variables, unwittingly mutating a value will affect all instances and mess you up.
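The safer variant described here might look something like this sketch (the class and its attributes are illustrative, not from the post):

```python
class Foo(object):
    # defaults live on the class; only known names may be set.
    # Per the caveat above, these defaults must be immutable values.
    a = None
    b = 5

    def __init__(self, **kw):
        for name, value in kw.items():
            if not hasattr(self.__class__, name):
                raise TypeError("unexpected keyword argument: %r" % name)
            setattr(self, name, value)
```

Foo(a=1, b=2) works, while a typo like Foo(bee=2) raises TypeError instead of silently creating a stray attribute.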

This is one of those There's More Than One Way To Do It cases. Though I'm not excited about the syntax proposal, it addresses a real problem. Clever hacks don't cut it: there's lots of them, they work differently, you can't necessarily recognize each one right away, and they each have flaws -- some quite significant.

So far the only hack that seems decent to me (a hack that does not exist to my knowledge) would be a decorator that works with the _ prefixes on variables (using something like **_kw to replicate the setattr thing I show above). It would have to be a fairly clever decorator.

A decorator for this is okay -- better than most of the things people have been proposing -- but not terribly pleasing. It's magical, but a kind of hidden and mysterious magic. It bridges an implementation and an external interface, but if you don't know how the decorator works then you won't understand the disconnect.

I'm not sure what syntax I'd want. I'm not sure if syntax is feasible here -- there's limited room in a function signature. But I think people who say that syntax is always the wrong answer, or that this isn't an issue, are just too set in their ways to see the issue. Python's strength -- particularly compared to the other dynamic languages -- is in the consistency with which people use it. This is how Python is different from Lisp and Ruby (and actually how it is similar to Smalltalk). I think many people (especially the perennial skeptics in comp.lang.python) underappreciate this.

Created 05 Jul '05

Comments:

What I don't like about the . syntax is that the comma and the dot overlap visually. OTOH, something like this:

class Foo:
    def bar(self, self.baz, self.xyzzy):
        pass

is what I'd like to see in Python.

# Baczek

Hmmm, this seems to make sense (so much so that I actually went and tried it out to see if it would work... which it didn't, of course ;-)

# polaar

I took a crack at writing a simple "autoinit" decorator. It's not "complete" in that it doesn't do all the minor tasks like setting the wrapping functions name, doc string, etc. nor does it preserve the function signature of the wrapped function, but it does do everything else. One minor "issue" is that it will assign every keyword argument to the instance, not just declared ones:

from itertools import izip

def autoinit(f):
    # get arg names minus 'self'
    argnames = f.func_code.co_varnames[1:f.func_code.co_argcount]
    # get argument defaults
    argdefs = f.func_defaults
    # build defaults dictionary
    if argdefs:
        defdict = dict(izip(argnames[-len(argdefs):], argdefs))
    else:
        defdict = {}
    def newf(self, *args, **kwargs):
        selfdict = self.__dict__
        # start with argument defaults
        selfdict.update(defdict)
        # add positional arguments
        selfdict.update(izip(argnames, args))
        # add keyword arguments
        selfdict.update(kwargs)
        # call the original function
        return f(self, *args, **kwargs)
    return newf
# Shahms King
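On current Python, the same idea can be written with inspect.signature, which also closes the "issue" above: binding the call against the real signature rejects undeclared keywords instead of assigning them. A sketch:

```python
import functools
import inspect

def autoinit(f):
    sig = inspect.signature(f)
    @functools.wraps(f)
    def newf(self, *args, **kwargs):
        # bind() checks the call against the declared signature, so an
        # unknown keyword raises TypeError rather than becoming an attribute
        bound = sig.bind(self, *args, **kwargs)
        bound.apply_defaults()
        for name, value in list(bound.arguments.items())[1:]:  # skip 'self'
            setattr(self, name, value)
        return f(self, *args, **kwargs)
    return newf

class Foo(object):
    @autoinit
    def __init__(self, a, b=5, c=None):
        self.c = c or {}
```

Foo(1) yields a=1, b=5, and c={} (after the fixup in the wrapped __init__), while Foo(1, d=2) raises TypeError.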

That consistency is both a strength and a weakness. It doesn't have only good points. Couldn't the argument be made the other way around? That people who think it's very good are just too caught up in their ways to realize the expressivity and power that comes from more freedom?

Actually Lisp has a much smaller core than Python. I imagine you're talking about Common Lisp. But consider Scheme for example.

# Mike

Yes, I was thinking of Common Lisp. Or any of the Lisps that have macros, really. Though really a small core is exactly the thing that leads to TMTOWTDI -- it's based on the idea that it is okay if things are simply possible, regardless of whether they are easy, obvious, readable, or reproducible; that a small core is a virtue in itself, often ignoring the size of the working set of functionality a programmer must use to actually get anything done. Common Lisp includes all that functionality in duplicate in its core, while Scheme includes none of that functionality. Both end up in the same place -- TMTOWTDI. For example, there are many object systems for Scheme. And there's many ways to bind a variable in Common Lisp. Neither places great value in consistency for practical programs.

And maybe Smalltalk isn't a good counterexample -- as a language it's more like Lisp than Python. But culturally it's more like Python in this regard, with a rich library and set of conventions.

Re: expressivity; in this case, there is a fine edge. If you try to handle every case, you've recreated all of Python in a set of weird rules that can be embedded in a function signature. That's bad. This is something frameworks often get wrong -- they write piles of code to save framework users a couple of if statements or a loop, confusing everyone in the process. Certainly __init__, a procedure which can do anything it wants with the object, shouldn't and can't go away. But really, the choice here isn't between expressivity/freedom and TOOWTDI. The status quo has simplicity and transparency on its side, while a change has conciseness and uniformity on its side. This isn't a question of consistency vs. expressivity.

# Ian Bicking

These ideas about monkeying with __init__ are Just Plain Wrong, because __init__ is the source of the problem in the first place. The existence of __init__ is largely a hack to work around the absence of descriptors in earlier versions of Python, so compounding the hack with more hackery is a bad idea.

If you are defining attributes, the sane place to do so is in the class body, not in an __init__ signature. Your "class Foo" is the One Obvious Way to do it, defining attributes using default values in the class, or properties, or other kinds of descriptors. But don't define __init__ methods in your subclasses, just define the attributes. This is the One True Way to solve this problem, because trying to "fix" __init__ will get you nowhere: the very idea of __init__ is what's broken. Pinning wings on the pig won't make it fly.

All of the problems that make you discard this solution are actually quite trivial to fix, compared to trying to "fix" __init__. For example, mutable defaults are easily handled with a descriptor that creates a new mutable value when its __get__ is called and no value is defined. In PEAK, for example, bar = binding.Make(list) defines a bar property that will be a new list object when first accessed, unless you set it to something else via the constructor or some other way. Such descriptors are easily created, and far less magical than stack-tampering decorators.
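binding.Make is PEAK's; outside PEAK, such a descriptor can be sketched in plain Python (Make here is a stand-in name, and __set_name__ requires Python 3.6+):

```python
class Make(object):
    """Descriptor: call 'factory' to build a fresh default on first access."""
    def __init__(self, factory):
        self.factory = factory
        self.name = None

    def __set_name__(self, owner, name):
        # Python 3.6+: the descriptor learns its own attribute name
        self.name = name

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        if self.name not in obj.__dict__:
            # no value set yet: create a new mutable default per instance
            obj.__dict__[self.name] = self.factory()
        return obj.__dict__[self.name]

    def __set__(self, obj, value):
        obj.__dict__[self.name] = value

class Foo(object):
    bar = Make(list)
```

Each Foo instance gets its own list on first access to bar, and the shared-mutable-class-variable trap goes away; assigning bar simply overrides the default.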

Similarly, positional arguments are also easy. You should never have more than a few of those anyway, so just write an __init__ that calls the base __init__ with them as keywords. Again, it's much simpler and more straightforward than the alternatives.

I wish that the object base class provided this behavior; it sure would make it the "obvious" way to do things. Unfortunately, I don't think it can be added without breaking code that works now, so it likely won't be until Python 3.0 that it'll appear, assuming that Guido agrees this is the One Obvious Way. (Jython already believes this, and interprets keyword arguments to Java object constructors as a request to set properties on the created object.)

# Phillip J. Eby

I'm not sure exactly what you are proposing or backing; if you want to, say, follow this up with a blog post of your own on the issue, I'm sure that would be interesting.

Anyway, I think what you mean is that the default __init__ should always act like this:

class Foo(object):
    def __init__(self, **kw):
        for name, value in kw.items(): setattr(self, name, value)

And, of course, self.__dict__.update(kw) is a horrible hack that bypasses descriptors and shouldn't be seen as a decent way to implement any of this (I note this because it's often offered up -- an example of one of the Bad Ideas about how to deal with this problem).

Of course, there are many places where you really need clever descriptors to make this work right, if you propose that it replace all of __init__ (at least all the attribute-setup portion of that function -- it's overboard to say that all possible initialization code could be removed).

For an instance of where this becomes complicated, I was just writing some code today:

class Site(object):
    def __init__(self, htdocs, conf_dir=None):
        self.htdocs = htdocs
        self.conf_dir = conf_dir or htdocs

I.e., conf_dir defaults to htdocs, and of course you can't resolve that default in the class definition. So you need a special descriptor that forwards .conf_dir to .htdocs when you haven't given an explicit conf_dir. This is slightly different semantics, but in most cases it's equivalent. I'd actually be surprised if PEAK didn't already have some binding option to do this. But prebuilt bindings won't exist for every pattern of initialization. When that happens you'd have to write something like this:

class forwarding_attr(object):
    def __init__(self, forwarded): # I want a positional argument here,
                                   # so I need __init__ anyway!
        self.forwarded = forwarded
        self.private_attr = forwarded + '_explicit'
    def __get__(self, obj, type=None):
        return getattr(obj, self.private_attr, getattr(obj, self.forwarded))
    def __set__(self, obj, value):
        setattr(obj, self.private_attr, value)

class Site(object):
    htdocs = Attr()  # some plain-attribute descriptor; Attr is not defined here
    conf_dir = forwarding_attr('htdocs')

The private_attr thing is dumb, and I wrote about that before -- I think PEAK has an answer to that, but there's no convention for it, and it regularly annoys me.
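For what it's worth, later Pythons grew a convention for exactly this: __set_name__ (Python 3.6+) tells a descriptor its own attribute name, which removes the hand-rolled private_attr scheme. A sketch of forwarding_attr rewritten that way (the Site class here sets htdocs as a plain attribute for simplicity):

```python
class forwarding_attr(object):
    def __init__(self, forwarded):
        self.forwarded = forwarded

    def __set_name__(self, owner, name):
        # Python 3.6+: no need to invent a naming convention by hand
        self.private_attr = '_%s_explicit' % name

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        # fall back to the forwarded attribute when no explicit value is set
        return getattr(obj, self.private_attr,
                       getattr(obj, self.forwarded))

    def __set__(self, obj, value):
        setattr(obj, self.private_attr, value)

class Site(object):
    conf_dir = forwarding_attr('htdocs')

    def __init__(self, htdocs, conf_dir=None):
        self.htdocs = htdocs
        if conf_dir is not None:
            self.conf_dir = conf_dir
```

Site('/www').conf_dir is '/www', while Site('/www', '/etc').conf_dir is '/etc'.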

Anyway, I'll know what forwarding_attr is. And I can describe it easily enough in a docstring or whatever. But it's not transparent; I have a hard time justifying how it is better than self.conf_dir = conf_dir or htdocs. It's not even significantly shorter, though it is declarative and introspectable.

I think this sort of thing would appeal to me if I committed to the style. But I have a hard time committing. I also have a hard time asking other people to commit to this style -- and I am highly reluctant to commit to something that I'm not willing to advocate that everyone use. And everyone has to understand this, because it's not a bit of implementation you can encapsulate -- it's a whole different way of thinking about object instantiation, and it affects the language significantly. So, these are the reasons that high-level declarative extensions to Python worry me, even though I understand the motivation, and I can certainly see how it could be useful.

# Ian Bicking

"""This is slightly different semantics, but in most cases it's equivalent. I'd actually be surprised if PEAK didn't already have some binding option to do this."""

Yep. conf_dir = binding.Obtain('htdocs') is the simplest way to do what you described in PEAK; if conf_dir is used without being set, it will take on the value in htdocs at the moment it was looked up (and stays set to it thereafter, unless changed). There's no private_attr because PEAK is smart enough to be able to ensure that the descriptors know their attribute names. If conf_dir is set directly or via the constructor, then it won't try to "obtain" the htdocs attribute.

"""it's overboard to say that all possible initialization code could be removed"""

Maybe. Maybe not. Fewer than 10% of PEAK's classes even have an __init__ method, and most of those are in very "low level" classes. If you look at "component" and "entity" classes only (high-level constructs used to model services or business objects), you'll find almost no __init__ methods at all. (And I only say "almost" so that I won't look silly if somebody happens to find one or two out of PEAK's 2200+ classes.)

"""it's a whole different way of thinking about object instantiation, and effects the language significantly. So, these are the reasons that high-level declarative extensions to Python worry me, even though I understand the motivation, and I can certainly see how it could be useful."""

I can understand that. In a sense, something like this "wants" to be part of the language; you should really be able to do the equivalents of binding.Make, binding.Obtain, and binding.Delegate with syntax or builtins. But, if what you're looking for is the best technical solution to the problem, PEAK's descriptor patterns are the way to go, whether you use PEAK's implementation or not. If you're trying to deal with the educational/political issues, then you should just write the boilerplate __init__ methods and quit trying to "fix" them, because the fixes will just have a different set of educational/political issues, in addition to being a technically inferior solution.

# Phillip J. Eby

>>> class Foo(object):
...     def __init__(self, a, b=10):
...             for name, value in locals().items():
...                     setattr(self, name, value)
...
>>> foo = Foo(1)
>>> foo.__dict__
{'a': 1, 'self': <__main__.Foo object at 0xb7d7396c>, 'b': 10}
# EduardoPadoan
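One wrinkle visible in the __dict__ above: the loop also assigns self.self = self. Skipping it is a one-line change:

```python
class Foo(object):
    def __init__(self, a, b=10):
        for name, value in locals().items():
            if name != 'self':  # avoid the circular self.self attribute
                setattr(self, name, value)

foo = Foo(1)
```

Now foo.__dict__ contains only 'a' and 'b'.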
