Tuesday, June 11st, 2013

The overuse of functions

A programming quandry (related to some thoughts I’ve had on locality):

The prevailing wisdom says that you should keep your functions small and concise, refactoring and extracting functions as necessary. But this hurts the locality of expectations that I have been thinking about. Consider:

function updateUserStatus(user) {
  if (user.status == "active") {
    $("<li />").appendTo($("#userlist")
      .text(user.name)
      .attr("id", "user-" + user.id);
  } else {
    $("#user-" + user.id).remove();
  }
}

Code like this is generally considered to be terrible – there’s logic for users and their status, mixed in with a bunch of very specific UI-related code. (Which is all tied to a DOM state that is defined somewhere else entirely — but I digress.)

So a typical refactoring would be:

function updateUserStatus() {
  if (user.status == "active") {
    displayUserInList(user);
  } else {
    removeUserFromList(user);
  }
}

With the obvious definition of displayUserInList() and removeUserFromList(). But the first approach had certain invariants that the second does not. Assuming you don’t mess with the UI/DOM directly, and assuming that updateUserStatus() is called when it needs to be called, the user will be in the list or not based strictly on the value of user.status. After refactoring there are functions that could be called in other contexts (e.g., displayUserInList()). You can look at the code and see that particular things happen when updateUserStatus() is called, but it’s not as easy to determine what is going to happen when inspect the code from the bottom up. For instance, you want to understand why things end up in <ul id="userlist"></ul> — you search for #userlist but you now get two functions instead of one, and to understand the logic you have to trace that back to the calling function, and you have to wonder if now or in the future anyone else will call those functions.

The advantage of the first function is that blocks of code are strict. You execute from the top to the bottom, with clear control structures. When GOTO existed you couldn’t reason so well, but we’ve gotten rid of that! (Of course there are still other exceptions.) It’s not entirely clear what intention drives the refactoring (besides adherence to conventional standards of code beauty), but it’s probably more about code organization than about making the control flow more flexible. Extracting those functions means that you now have the power to make the UI inconsistent with the model, and that hardly seems like a feature.

And I have to wonder: are some of these basic patterns of “good” code there because we have poor tools for code organization? We express too many things with functions and methods and classes (and perhaps modules) because that’s all we have. But those are full of unintended semantic meaning.

Anyone have examples of languages that have found novel ways of keeping code organized?

(Also there are comments on G+)

Comments

Peter BengtssonTue, 11 Jun 2013

I agree. I'd sum it up as, let it be a full-fat function until something within is needed too by another full-fat function.

Ionel Cristian MărieșTue, 11 Jun 2013

For such small amount of code you wouldn't bother but suppose it's 10 times longer ? I think making better abstractions for the ui would do it. But be generic, not specific - eg: implement a widget for a list of labels instead of a UpdateUserStatusWidget

buggyfunbunnyWed, 12 Jun 2013

The result of this (not ready for prime time) OO meme (for the real deal read Meyer or Holub) is that one ends up with lots and lots o state in instance variables (by various names in various languages). Keeping track of all that in one's head is tough. Those with photographic memories probably do OK. The functional languages exist in contraindication, but have a performance burden.

When we asked our high school English teacher how long a paper was supposed to be, he said, "organic length". Too bad for us, since we hadn't a clue.

The result is that most OO style code ends up being written in the debugger, since that's the only way to keep track of the spaghetti!!

asbjornengeThu, 13 Jun 2013

I've seen this on many an occasion. A few years back I inherited quite a large java (liferay) application where the previous engineer had really taken this to heart. Every class contained a large number of 1 line functions with increasingly complicated naming. Trying to stitch together what was actually going on when a request hit a particular endpoint was a nightmare.

In our current object oriented languages and frameworks I think it is about finding a middle ground with what structurally makes sense with the concepts of the application and what is practical to maintain.

I'd also be interested to hear and learn about other language approaches and their (hopefully better) ways of organising logic.

asbjornengeWed, 10 Jul 2013

Would I be wrong to argue that this makes for a particular mess regarding procedural functions? I've recently started digging into functional programming (avoiding state, thinking in values, designing pure and chainable functions) and find it helps reduce complexity and ease control flow so it actually makes sense keeping procedures in single functions. Did that make sense? Anyway, Rich Hickey has more - http://www.youtube.com/watc...

collinmandersonThu, 11 Jul 2013

Wow. I agree. I don't enjoy needing to skip through different functions to figure out what's going on. I find huge functions really easy to debug and follow.

interstarSat, 03 Aug 2013

Very good question. My immediate thought :

If your refactored code was like this :

function updateUserStatus() {
var id = "user-"+user.id;
if (user.status == "active") {
addToList("#userlist",user.name,id);
} else {
removeFromList("#userlist",id);
}
}

it would solve most of the problems. You aren't fussily creating extra functions for tiny fragments of functionality which are only relevant to narrow situations (ie. users, userlists). Now your new functions are more generic and widely applicable. They're doing enough that it's worth the overhead of creating them. They're still usefully hiding the bit of complexity you DON'T want to think about here - the actual jQuery / HTML details of how lists are constructed - while keeping the important details - WHICH list you're updating, what bits of a user you show - in this locality rather than allowing them to become diffuse across your program.

Of course, you CAN'T prevent another bit of code updating the list itself somewhere. (That's more a quirk of the HTML environment you're working in where the DOM is global. In many analogous cases you could prevent most of the code having unauthorized access to a list simply by making it private within a class.)

This is a great question though. It's still a classic modularity problem : how to keep closely related stuff together in one place, and unconnected stuff at arms length / in a substitutable form. But we tend not to have much explicit thinking down at such small granularities (compared to all the patterns for classes etc.)

This is the personal site of Ian Bicking. The opinions expressed here are my own.