Archive for the 'Imperative' Category

Lorem Ipsum

Lorem Ipsum is a sample phrase used as a filler in typesetting, to illustrate how some text would look. Here’s a sample paragraph:

Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.

This approach to filler text is superior in many ways to most alternatives:

  • It’s long, certainly longer than “Test” or “AAA”, so it can fill several lines (and test whether there is an unexpected length limit when saving the data).
  • Unlike random strings of characters copy-pasted several times, it is split into words of uneven length, so spaces between words do not align horizontally or vertically.
  • It is readily recognizable as random text by any typesetter (or developer) worth their salt.

A typical testing strategy, when filling forms by hand, is to copy-paste one or two Lorem Ipsum paragraphs to test such things as how the text area reacts, whether it is saved correctly, and so on.

Lorem Ipsum does have some limitations:

  • It’s written in Latin, so it fits nicely in the ASCII range of characters. As such, it does not test for Unicode support.
  • It contains no quotes of any kind, so no testing of database escaping processing either.
  • It contains no HTML-specific characters like < or &, so HTML character escaping is not tested either.
  • For that matter, it does not contain exceedingly long words that would overflow a single line, so you cannot test for this kind of overflow either.
  • Sometimes, you want to auto-linkify links and URLs, and it contains none.
  • Sometimes, Skype turns numbers into … clickable numbers, and it contains no digits either.

I need to test these things on my web applications, so I’ve developed my own version of a “Modern Lorem Ipsum”:

Lorem <a href=”javascript:document.write(”)”>ipsum</a> dòlor sit àmet, consectetur adipisicing élit, sèd do eiusmod tempor incididunt ut labore & dolore magna aliqua. <hr/>Ut enim@minim.com veniam, quis nostrud exercitation `”ullamco laboris nisi & aliquip ex æ commodo consequat. Duis aute irure dolor 01 23 45 67 89 in reprehenderit in voluptate velit esse cillum dolore `eu fugiat https://nulla.biz/pariatur. aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa

Excepteur [u]sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit ‘anim id est laborum.[u] Lorem” ipsum dòlor sit àmet, consectetur adipisicing élit, sèd do eiusmod tempor http://incididunt.ut.com/labore & dolore magna aliqua. <b>Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex æ commodo consequat.</b> “Duis aute irure dolor in ‘reprehenderit in voluptate velit — esse cillum dolore eu fugiat nulla pariatur.

Feel free to copy-paste it away. WordPress certainly does seem to have a hard time with these long lines in post bodies. I wonder if it happens in comments as well :)

How to build a page client-side?

The basic philosophy of jQuery is to start with some existing HTML sent over vanilla HTTP by the server. That HTML should be all you need (so that people without a JavaScript-enabled browser can still use the web site). Then, jQuery enhances that HTML by adding new behavior (usually changing the properties of existing elements, sometimes adding new elements).

This is very useful for small pieces of behavior, but writing complete and complex components is hard for several reasons:

  • A partial view strategy is required on the server side to insert the appropriate HTML in the appropriate location (as opposed to leaving an empty hole and having the component generate its own HTML).
  • If the behavior of your component is complex, then there will be a lot of parsing going on. A typical example would be sorting a table by a “date” column: the displayed date cannot be parsed reliably, since its format is culture-dependent and may contain “Yesterday”, “13 seconds ago” and similar shortcuts.
  • Sometimes, the server needs to add information that is not visible, but is needed by the JavaScript. The format for sending this data (attribute, hidden field…) is difficult to document and type-check.
  • Selecting precisely the right fields in a blob of HTML, without hitting any others, is hard, especially for components that may later contain sub-components. Class-based selection is slow, id-based selection involves heavy logistics to move the identifiers around, and complete traversal takes a while and breaks if the HTML changes.

My preferred approach to JavaScript components is to receive JSON-formatted data from the server (easy to parse) from which I construct the DOM elements I need and capture them at the same time.

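// Builds the DOM for one comment from JSON data and returns handles
// to every element of interest.
function renderComment(data)
{
  // Explicit closing tags: jQuery 3.5+ no longer expands <span/> or <div/>.
  var $comment = $('<div><img><span></span><div></div></div>')
    .addClass("comment");

  var obj =
  {
    $self : $comment,

    $img  : $comment.children('img')
            .attr('src',data.imgUrl),

    $name : $comment.children('span')
            .text(data.authorName)
            .addClass('authorName'),

    $body : $comment.children('div')
  };

  // One paragraph per entry of data.text, escaped by text().
  $.each(data.text,function(k,t){
    $('<p/>').text(t).appendTo(obj.$body);
  });

  return obj;
}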

The point is that you then have access, through the returned object, to all the relevant elements within the comment, so that you may target them with effects without any risky selector-based magic. Besides, if the HTML format of comments changes, you will only have to change the code above and nothing else.

And of course, using text() escapes any dangerous HTML you might have.

To make the above appear in your code, all you have to do is:

var $commentsList = $('#my-comments-list');
var $comments = [];   // rendered comment objects, by index

$.each(comments, function(i,c){
  var obj = $comments[i] = renderComment(c);
  obj.$self.appendTo($commentsList);
});

This is usually where you hit a performance wall, because this is one of the slowest ways of using jQuery on a web page.

I’ve been in this situation recently on a smallish website that basically displays a list of contacts invited to various events as a 10-column/300-row table that includes additional functionality such as:

  • Dynamically add or remove new rows (with server-side confirms)
  • Rows are grouped together, and groups can be collapsed and expanded
  • Clicking on rows opens a modal editor, modifications are propagated back to the table
  • The data and formatting for certain rows depend on some other rows

The initial approach was exactly as described above: every cell was constructed as $('<td/>'), classes and attributes were applied to it, then all cells were inserted into rows constructed as $('<tr/>'), and these in turn were appended to the table tbody. Since some parts of the table were clickable to achieve various effects, jQuery’s click() function was used to add the appropriate event handlers, and the event handlers were closures that contained all relevant information about what row had to be collapsed or what element had to be removed.

The average time for rendering all of this was a solid 2200ms on Firefox 3.5, which felt about as dynamic as a dead tortoise nailed to a slab of concrete. For comparison purposes, rendering the data server-side and sending it to the client took about 390ms on average (arguably, the server would have scaling issues as it would have to render the HTML for all clients, but still).

2200ms means about 7ms per row. The problem here isn’t that the jQuery code is slow, but rather that it’s executed so many times that it adds up to a pretty large number.

My first attempt to improve performance was to avoid constructing rows cell by cell, instead building the final HTML of the row in one shot and then selecting clickable elements inside the row through their class to apply event handlers. Rows were then inserted into the table body using jQuery’s DOM functions. The new rendering time was 1800ms, which was not as much of an improvement as I had hoped.

The second step was to move away from selecting clickable elements to apply event handlers. This meant that I could either insert the event handler code in the HTML (but this meant no closures, so I would have to rely on global, non-garbage-collected behavior) or add a click event to the entire table and determine what element had been clicked (then parse the DOM for information about what to do with the click, which was annoying).

I went with the first way, rewriting my code as global handlers and eliminating all the select-child-with-class overhead. Rows were still constructed independently and inserted independently. The improvement was noticeable, as the rendering time was then 980ms.

The last wave of optimizations consisted in making sure the HTML for the entire table body was generated in one shot, with the pieces collected in an array and concatenated at the end (using [a,b,c].join('') instead of a+b+c). This creates a 5223-element array, concatenated into a string containing 72357 characters, which is then inserted into the table body using jQuery’s html() function. The entire process, including preliminary processing of the data to be displayed, takes about 160ms (a 13.7× performance increase).

The change was mostly moving from this design pattern:

function renderRow(data)
{
  var $tr = $('<tr/>');

  $('<td/>')
    .addClass('name')
    .append($('<a/>')
      .text(data.name)
      .click(function(){ frobnicate(data.id); }))
    .appendTo($tr);

  // ...

  return $tr;
}

To this one:

function renderRow(data,html)
{
  html.push(
    '<tr><td>',
    '<a href="javascript:frobnicate(',
    data.id,
    ')">',
    esc(data.name),
    '</a></td>',
    // ...
    '</tr>'
  );
}
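
The snippet above calls esc() without showing it and leaves the final concatenation implicit. Here is a minimal sketch of how the pieces could fit together, assuming numeric row identifiers; the body of esc(), the renderTableBody() function and the '#contacts' selector are my own illustrations, not the code from the actual project:

```javascript
// Hypothetical helper: escapes the characters that are unsafe in HTML
// text and in quoted attribute values.
function esc(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// Appends the HTML for one row to the shared `html` array.
function renderRow(data, html) {
  html.push(
    '<tr><td class="name">',
    '<a href="javascript:frobnicate(',
    Number(data.id),   // assumes numeric ids; anything else becomes NaN
    ')">',
    esc(data.name),
    '</a></td>',
    '</tr>'
  );
}

// Builds the whole table body as a single string.
function renderTableBody(rows) {
  var html = [];
  for (var i = 0; i < rows.length; i++) {
    renderRow(rows[i], html);
  }
  return html.join('');
}

var body = renderTableBody([
  { id: 1, name: 'Alice & Bob' },
  { id: 2, name: '<script>' }
]);

// In the browser, the result would be inserted in one call:
//   $('#contacts tbody').html(body);
```

Since the HTML is now assembled by hand, every piece of user-provided text has to go through esc(): the protection that text() gave for free is gone.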

Again, this is an extreme situation where page generation gets way out of hand because a lot of rows are generated; the net benefit, as far as rendering a single row is concerned, is around 6ms. If your page contains only a small number of complex components, you can ignore the performance issues to get the components done, and only optimize if the slowness turns out to be noticeable.

Shouldn’t Happen…

Design and development is turning the great unknown chaos into tiny bits of controlled functionality with promises about what the result will be, and expectations about what the input should be.

There is an interesting duality between two categories of expectations, depending on whether they are the responsibility of the user, or of the programmer.

User errors are classic mistakes involving incorrect input, such as attempting to load a file that does not have the right format, or visiting a web site that does not exist, or entering an incorrect email address. A program is expected to, at the very least, gracefully handle these situations (because nobody likes errors) and the best programs are actively designed to reduce the possibility of error through appropriate user interface choices.

Programmer errors are the most frequent ones, but most of them are luckily caught by a compiler (or, in the case of the less lucky interpreted languages, the parser). The basic idea is that if you expect a function parameter to be an integer, and you tell your compiler, then static analysis will determine whether you could receive a string argument instead, and if so the universe will collapse (well, the build will fail).

Static Analysis

Static analysis can be very smart. It can prove beyond any doubt complex properties about complex software written in obscenely low-level code (such as C with inline assembly). The problem is that working with a static analysis tool can add unusual constraints on the developers themselves: the halting problem dictates that no tool can exactly predict the behavior of every program, so any given tool will either have false negatives (undetected bugs) or false positives (safe code reported as dangerous), and the general trend for static analysis tools is to avoid any false negatives at the cost of false positives.

The quality of a static analysis tool is determined by how hard it is to write code without false positives (usually done by manually coding around the blind spots of the tool).

Static analysis tools have two problems. One, they’re not available for every single language and platform out there. Some of us are still using languages with eval(), throwing Java exception-safety out the window because we find it too constraining, doing without those pesky type systems and generally making a childish fuss about those “warning” thingies. Two, static analysis tools can only check constraints that are described by the developer in some form, such as assertions, preconditions, postconditions, type annotations or some other kind of attribute added to the code.

So, if you forget to “assert” it, nobody is going to check it for you. For instance, no tool is going to warn you that you unwittingly leak a credit card number to a third party.

The Elephant Statue

In a sense, predicting user errors is the mirror activity of gathering specifications. Both force you to think about all possible situations your software will face, and decide what should happen: maybe you have to display an error, maybe you will have to treat the input in a clever but predictable way, or maybe you will have to rework your process to prevent that situation from happening.

This is akin to creating an elephant statue by starting with a block of stone and carving out everything but the elephant. Deciding what your users can do implicitly defines what your users cannot do. Depending on the situation, you may guide your design with either approach.

Can references be null?

I often hear people insist that C++ references can be null.

You can easily create a null reference like this:

int *ptr = 0;
int &ref = *ptr;
// ref is a null reference

First, there is no such thing as a null reference in standard C++.

Like all programming languages, C++ is purely a construction of the human mind, and the concepts of the language have been given arbitrary but meaningful names as part of the standardization process so that the users of the language could understand each other. These could be existing words such as class, aligned, statement, pointer or new ones like rvalue or cv-qualified.

In particular, the C++ standard does define a null pointer, but it does not define a null reference because that concept is not part of the language.

This means that instead of having a standard definition (like a null pointer), null references have a commonly accepted definition that follows a similar schema: a null reference is a reference r such that &r is a null pointer. For the rest of this article, I will follow this definition.

Second, a null reference cannot exist in a well-behaved program.

Since the concept does not appear in the standard, it’s pretty obvious that no place in the standard explicitly mentions an operation resulting in a null reference being constructed. In fact, the description of how references are constructed explains that references are bound to existing objects or functions, and by definition an object cannot have a null pointer for an address.

In particular, the typical construction of a null reference by dereferencing a null pointer is explicitly mentioned in the standard, and described as undefined behavior.

In short, while many modern implementations will let you create a null reference, this will necessarily involve some form of undefined behavior along the way.

I tend to evaluate programmers on a “would I hire this person?” basis. So, what does the “C++ references can be null” statement tell me about someone?

  • They might be confusing C++ references with C# or Java references, which can be null. Certainly a NO HIRE for a senior C++ programmer.
  • They might have misread their textbook and genuinely think references can be null in the same way as pointers (except, well, the lack of syntactic sugar for checking if a reference is null or creating a null reference strikes me as odd if that were the case). Again, NO HIRE for a senior C++ programmer.
  • They know it’s not standard, but they do not care about writing standard programs. This is an immediate NO HIRE even for a junior C++ programmer.

Then, there are those who know about it, and would not use it because they care about writing standard programs. The convention when discussing C++ is to assume that the program is well-behaved, so most people would parse “C++ references can be null” as “C++ references can be null in a well-behaved program”, the latter being obviously incorrect. I don’t really mind a small misunderstanding, the subject being as complex as it is, as long as it’s corrected quickly.

A person who does not correct it is either someone without much experience discussing C++ with others, or a troll. A definite NO HIRE on a team.

Heterogeneity

John is a fairly adept PHP developer. He is familiar with object-oriented features from PHP 5, has experimented with some PHP 6 features, and is quite skilled at bending the Zend or Symfony frameworks to his will.

But John is not really an SQL expert—sure, he might have written some simple queries and he can fight his way around a normalized database, but he’d rather use a mapping layer on the PHP side. He is no fan of JavaScript either, although he can sometimes hack together a quick solution based on his limited knowledge and online tutorials. And John is in trouble, because web development is ultimately a heterogeneous environment where you have to know three languages to get things going.

There have been many efforts to help out programmers like John by eliminating as many languages as possible from the process. Database mapping tools provide a protective layer that shields SQL-averse programmers from the unfathomable Lovecraftian horror of INNER JOIN. Ready-made components encapsulate clever JavaScript so that server-side developers don’t have to muck in the demeaning task of keeping browsers in line.

I’ve had the pleasure of working on both sides of the fence. Some of my projects were beautifully streamlined 98% PHP – 1% JS – 1% SQL works of art where the various pieces of non-PHP code were carefully hidden away from the prying eyes and trembling hands of PHP developers. Others had a complete architecture designed for each of the three languages, with team members that specialized in certain areas only, and strong conventions on how data had to cross the borders. These were not toy projects, but rather large websites that had to support the brunt of thousands of visits.

The bottom line is that when you’re running a website with the intent to get money out of it, you want as many daily hits as possible, and so the software must be able to handle all of them smoothly. If you are writing your own web software, the burden of optimizing that software is yours as well. This involves identifying bottlenecks and reimplementing them to do less work, so that you will eventually need:

  1. Developers that are familiar enough with the software and any third party elements involved.
  2. Profiling tools that help identify what parts of the software take the most time.
  3. A software model that is flexible enough to allow reimplementing critical pieces.

It is generally observed that [weasel words] the layers of PHP/C#/Java code stacked to hide away the SQL/JS/CSS/HTML underneath will decrease the performance of the software, because databases are queried with SQL and web pages are presented in JS/CSS/HTML regardless of what one-language programmers would like to believe, so the layers end up generating that code themselves, often with hilarious results.

A classic example would be server-side code for displaying a list of objects (displayed here as PHP):

$user_id = Controller::getCurrentUser();
$user    = UserFactory::getById($user_id);
$friends = $user -> getFriendsList();

foreach ($friends as $friend_id) {
  $friend = UserFactory::getById($friend_id);
  View::renderUser($friend);
}

This is an actual excerpt from a piece of code I wrote, with only slight rewording of certain components. A naive implementation would result in a first query reading from the database the data for the current user (with a list of 200 friends), then 200 more queries reading the individual users from the friend list. This results in a slow-loading page, a dead database and an unhappy customer (believe me, I’ve tried). The PHP-only programmer answers with a blank stare, because the code is properly written and well-encapsulated.

Now, here’s the million dollar question: can your mapping layer be configured so that the above code can get all the data in one, two or three queries?

The project that code comes from relied on Zend_Db for database work, which could hardly be called anything but naive. The optimization approach was to place a caching layer between the user factory and the database, and configure that layer with rules such as “if the developer calls getFriendsList, the next time UserFactory::getById is called, precache the data for all the users returned in the list of friends”. This meant that only two queries were made, which happened to save the day on that particular project.
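
To make the precaching rule concrete, here is a rough sketch of the idea. It is written in JavaScript rather than PHP so that it can run on its own, and everything in it is a stand-in I made up for illustration (the in-memory db object, its selectUsers function, the precache method and the sample data), not the actual caching layer from that project:

```javascript
// Stand-in for the database: counts queries so the effect is visible.
var db = {
  queries: 0,
  rows: {
    1: { id: 1, name: 'John', friends: [2, 3, 4] },
    2: { id: 2, name: 'Ann', friends: [] },
    3: { id: 3, name: 'Bob', friends: [] },
    4: { id: 4, name: 'Cid', friends: [] }
  },
  // One query for any number of ids (think: WHERE id IN (...)).
  selectUsers: function (ids) {
    this.queries++;
    var rows = this.rows;
    return ids.map(function (id) { return rows[id]; });
  }
};

var UserFactory = {
  cache: {},
  pending: [],   // ids to load along with the next cache miss

  precache: function (ids) {
    this.pending = this.pending.concat(ids);
  },

  getById: function (id) {
    if (!this.cache.hasOwnProperty(id)) {
      // Cache miss: fetch this id and every pending id in one query.
      var cache = this.cache;
      var ids = [id].concat(this.pending).filter(function (x, i, all) {
        return all.indexOf(x) === i && !cache.hasOwnProperty(x);
      });
      this.pending = [];
      db.selectUsers(ids).forEach(function (row) { cache[row.id] = row; });
    }
    return this.cache[id];
  },

  getFriendsList: function (user) {
    this.precache(user.friends);   // the rule quoted above
    return user.friends;
  }
};

// Same shape as the naive loop: one lookup per friend.
var user = UserFactory.getById(1);
var names = UserFactory.getFriendsList(user).map(function (friendId) {
  return UserFactory.getById(friendId).name;
});
```

The calling code keeps its one-lookup-per-friend shape, yet db.queries ends up at 2: one query for the current user, one for all the friends.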

Still, my point is not whether your favourite ORM can achieve the same performance as hand-written SQL code. Some of them certainly can.

My point is that to write software that has database interaction as a bottleneck, you need programmers that understand the database interaction layer thoroughly. Whether that layer is a PHP/C#/Java ORM or plain old SQL requests is irrelevant—without knowledge of how data is pulled from the database, there will be no way to prevent or eliminate bottlenecks reliably.

The ORM system Foo can eliminate the need for SQL experts, but it creates the need for Foo experts instead. What is important, then, is whether it’s easier to find Foo experts or SQL experts.

AJAX is Hard

Seen from the outside, AJAX has become an easy technology:

$('#container').load('http://domain/path/to/page');

Even if you’re doing smarter things, like updating server-side values with asynchronous POST requests, it’s still easy:

$.post(
  'http://domain/path/to/action',
  { user_id       : $('#user').val(),
    new_user_name : $('#name').val() }
);

And, of course, it’s also easy to make mistakes in AJAX.

Not taking errors into account

In an ideal world, the AJAX request is sent to the server, completed successfully, and the response is propagated back and applied.

In the real world, the AJAX request might never reach the server because the network cable was pulled, or it could carry stale data that cannot be processed, or the user session might have expired, or something else altogether.

This “request cannot be completed successfully” issue has been solved for years in the traditional HTTP world by both servers and browsers: when you try to get to a page and that page can’t be reached, you will either get an error message from your browser or be redirected to another page by the server.

In the AJAX world, a failed request times out silently without anything happening. You have to actually implement that small “Your session has expired, click here to log in again” message box yourself, just like so many other websites did. And, of course, you need to take into account in all of your workflows that the user may be logged out of their session at any point.

Don’t forget to include cable-pulling as part of your testing protocols!
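
As a rough sketch of what handling these failures could look like, the function below turns the HTTP status of a failed request into a message for the user. The status groupings and the wording are assumptions of mine: in particular, it assumes the server answers 401 or 403 when the session has expired, which your server may not do. Browsers do report a status of 0 when no response was received at all.

```javascript
// Maps the HTTP status of a failed request to a message for the user.
// The wording and the status groupings are illustrative assumptions.
function describeAjaxFailure(status) {
  if (status === 0) {
    // No response at all: cable pulled, server down, request aborted...
    return 'The server could not be reached. Check your connection and try again.';
  }
  if (status === 401 || status === 403) {
    return 'Your session has expired, click here to log in again.';
  }
  if (status >= 500) {
    return 'The server encountered an error. Please try again later.';
  }
  return 'The request could not be completed (status ' + status + ').';
}

// With jQuery, this could be wired into a request through the error
// callback of $.ajax, for example:
//   $.ajax({
//     type: 'POST',
//     url: 'http://domain/path/to/action',
//     data: { user_id: $('#user').val(), new_user_name: $('#name').val() },
//     error: function (xhr) { showMessage(describeAjaxFailure(xhr.status)); }
//   });
// where showMessage is a hypothetical function that displays the message box.
```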

Forgetting to refresh parts

When you post some modifications to the server asynchronously, you need to refresh some parts according to the new state of the server. Which parts do you refresh?

While the answer might seem easy in every single specific case (I’m updating this list/object/grid, so I’ll just refresh it), the general answer is not so simple: your server-side modification might have an impact on other parts of your system.

Consider a typical Facebook-like interface: you have a menu with an inbox, and to the right of that inbox there’s the number of unread messages. On the inbox page, you have a list of messages with a little cross on each message that deletes it through AJAX. The naive thing to do is have that cross update the list of messages, but then deleting an unread message wouldn’t update the menu.

Inevitably, a developer working on an AJAX feature will forget to take into account some other part of the page that needs to be updated. Or a developer will add some information to every page and forget that some pages need to update that information.
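
One possible way to make this harder to forget is to have every part of the page declare what kind of data it displays, and have AJAX handlers announce what they changed instead of refreshing specific parts themselves. A minimal sketch, where the function names and the 'messages' key are made up for the example:

```javascript
// Registry of refresh functions, keyed by the kind of data displayed.
var refreshers = {};

// A page part declares which kind of data it depends on.
function onDataChanged(kind, refresh) {
  (refreshers[kind] = refreshers[kind] || []).push(refresh);
}

// AJAX success handlers announce what changed; every part that
// registered for that kind of data gets refreshed.
function dataChanged(kind) {
  var list = refreshers[kind] || [];
  for (var i = 0; i < list.length; i++) {
    list[i]();
  }
}

// Both the message list and the menu counter depend on messages.
// Real refresh functions would reload or update the DOM.
var refreshed = [];
onDataChanged('messages', function () { refreshed.push('message list'); });
onDataChanged('messages', function () { refreshed.push('unread counter'); });

// Called from the success handler of the "delete message" request.
dataChanged('messages');
```

The delete handler no longer needs to know that a counter exists in the menu; whoever adds the counter registers it, and every existing handler picks it up.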

Repeating yourself

JavaScript does not benefit from the same clean separation of features into classes, files, packages and namespaces as other languages. Also, IDE quality is lacking in comparison. This makes it hard to refactor JS code when duplicate functionality starts to appear.

Let’s consider the asynchronous post situation. In order for that code to work, you need to have fields with identifiers ‘user’ and ‘name’ and some element to initiate the post through an event. This is not encapsulated: if another page needs similar “post user name” functionality, the code will have to be rewritten. In fact, when the code is that small, it’s actually faster to rewrite it than it is to find and call an existing function (not to mention writing that function in the first place).

No refactoring means the code repeats itself. Having two user-name-change pieces of JavaScript on two different pages means twice as much work to do when you eventually change how that part really works.

Allowing complex behavior to be written in two or three lines is no excuse for letting your code get out of hand: stand firm by the “once, twice, refactor” motto and do not hesitate to turn a three-liner into a ten-line reusable function with appropriate documentation.
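
For instance, the asynchronous post from earlier could become something like the function below. This is only a sketch: the changeUserName name, passing the posting function as a parameter and skipping empty names are my own choices for the example, and fakePost merely stands in for $.post so that the code runs outside a browser.

```javascript
/**
 * Changes the display name of a user.
 *
 * post    - function (url, data) that sends an asynchronous POST request;
 *           passed in so the function is not tied to a specific page
 * userId  - identifier of the user to rename
 * newName - the new name; surrounding whitespace is removed
 *
 * Returns whatever `post` returns, or null if the name is empty.
 */
function changeUserName(post, userId, newName) {
  var name = String(newName).replace(/^\s+|\s+$/g, '');
  if (name === '') {
    return null;   // nothing to send
  }
  return post('http://domain/path/to/action', {
    user_id: userId,
    new_user_name: name
  });
}

// On any page, with jQuery:
//   changeUserName(
//     function (url, data) { return $.post(url, data); },
//     $('#user').val(), $('#name').val());

// Stand-in for $.post, so the sketch runs without a browser.
var sent = [];
function fakePost(url, data) {
  sent.push({ url: url, data: data });
  return true;
}

changeUserName(fakePost, '42', '  John  ');
changeUserName(fakePost, '42', '   ');   // ignored: empty name
```

When the URL or the parameter names change, there is now exactly one place to edit.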

You’re not a person

WEEK 1

In this application, every person belongs to exactly one team.

WEEK 4

We need to manage external contractors. We could use the “person” object.

WEEK 5

Hey, we need to assign a team to every person. Let’s create an “external” team.

WEEK 127

Did you see that newspaper article about our company? They say we have an average of 30 people on every team. Do we even have 30-people teams?

Names are short. They can only convey a very limited amount of information. Even worse, that information tends to be different from its meaning in standard English: by declaring in week one that every person belongs to a team, the project designers separated the Application::Person (always in a team) from the English::Person (might be in zero, one or more teams). By week four, this separation vanished from the minds of most of the team. A developer noticed that “English::Contractor is-a English::Person” and mistakenly translated it to “Application::Contractor is-a Application::Person”.

This was the first mistake. Why didn’t he notice?

A positive property is what you can do with a thing. With the Person object, you can store a name, login, password and phone number! This is exactly what you needed! Those positive properties you need that the object doesn’t provide, you can always add them through inheritance or composition, and that’s still less work than implementing everything or having to refactor the code. A negative property is what you cannot do with an object no matter how hard you try. With the Person object, you cannot remain on your own without a team! But our brains are biased to look for positive properties first, and passively ignore negative properties until it’s too late. Positive properties are about the solution solving the problem. Negative properties are about the solution not being applicable.

The second mistake was, by far, the worst. So they finally noticed that negative property that blasted all their model away. And they went on with it, patching the issue by altering the meaning of Application::Team. It originally represented a project team within the company; it then represented a named group of people that could be a project team or the group of external contractors. This is refactoring: no matter how you look at it, you change the behavior of an object and let it propagate throughout the project, so you had better be careful about where it propagates! In this case, they weren’t careful about propagating the change of meaning to the documentation and user interaction parts of the project, which mistakenly kept the old meaning of Application::Team. This led to a naive PR team issuing a statement that included the “external” group as if it were a project team.

It’s always helpful to have an anal-retentive person in a group, preferably in a position of authority that lets them veto such changes, and who is vigilant enough to spot that “external” team early on in the design.

The real mistake was allowing a negative property to slip into the design. Negative properties hinder reuse, by definition. Sure, allowing a person to belong to zero-one-many teams is hard on every piece of code that must work on teams, because the writers have to remember to check whether the person has a team in the first place. But it has to be done. Doing it may even bring to light some issues in the original requirements (“So what happens when a person changes teams between the moment team bonuses are computed and the moment they are paid out?”) that would become annoying later on.
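
To illustrate, here is a small sketch of the zero-one-many model, with made-up people and team names. External contractors simply have no team, so a statistic about teams cannot pick up a fake “external” team by accident:

```javascript
// Each person lists the teams they belong to: zero, one or many.
// All names and teams here are invented for the example.
var people = [
  { name: 'Ann', teams: ['billing'] },
  { name: 'Bob', teams: ['billing', 'search'] },
  { name: 'Cid', teams: [] },   // external contractor: no team
  { name: 'Dee', teams: ['search'] }
];

// Average number of members per team, counting only teams that exist.
function averageTeamSize(people) {
  var seen = {};
  var teamCount = 0;
  var memberships = 0;
  people.forEach(function (person) {
    person.teams.forEach(function (team) {
      if (!seen.hasOwnProperty(team)) {
        seen[team] = true;
        teamCount++;
      }
      memberships++;
    });
  });
  return teamCount === 0 ? 0 : memberships / teamCount;
}

var average = averageTeamSize(people);
```

The price is visible in the code: anything that works on teams has to cope with an empty list, which is exactly the check mentioned above.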

Best Practices

There are hundreds of things that can go wrong even in the simplest situations. I’ve already explained why the real value of a domain expert is precisely to identify in advance everything that could go wrong with a project, so that it can be avoided.

Consider a comment form on a website. Nothing too fancy: the user fills in the “Name”, “Website (optional)” and “Comment” areas on a form, clicks the “Submit” button, and the page reloads with the comment on the page. No login required, no AJAX, no special effects. There are many things that can go wrong with this setup, and will go wrong if left in the hands of an inexperienced developer. They can be inconvenient, annoying or outright dangerous.

For example,

  • Double-posting. When the submit button is clicked, the form sends a request to the server with the comment to be added. The server responds with the new list of comments. The user clicks the “refresh” button while on that page, or navigates to another page and presses the “back” button. This causes the browser to send the request again, so the comment appears twice in the comment list. If using POST, this is slightly less dangerous: the user might get an annoying “Submit again?” window instead of double-posting.
  • SQL Injection. It is highly probable that the comments will be stored in an SQL-accessible database. If the code constructing the SQL query is not properly written, an appropriately chosen value for the comment fields can result in nasty things happening to the database.
  • Cross-Site Request Forgery. Suppose that posting the form creates a GET request like:
    http://yourdomain/postcomment?name={name}&text={text}

    Knowing this, I can include an image tag in a forum, with a source attribute that matches the posting of a spam comment on your website. Every visitor of that forum page will send that request automatically (browsers auto-fetch images by default) and spam your comment list.

  • Script Injection. The text entered by users must be displayed back to the visitors. If that text is not escaped before being output, a malicious attacker can submit a comment containing a dangerous script like:
    document.location = "http://www.youtube.com/watch?v=f2b1D5w82yU";
  • Encoding Issues. What happens if the page is encoded in UTF-8 but I send you ISO-8859-1 text? Conversely, what happens if the page is encoded in ISO-8859-1 and I copy-paste my comment from Microsoft Word? For that matter, what is the encoding of the database? What is the encoding of your string literals?
  • No Validation. User forgets to enter a name or a comment. No server-side check is made to determine whether the posted comment is valid and you get a mix of ugly empty comments and/or server error messages.
  • Lossy Validation. You have to prevent people from posting with no name or no comment body. This means errors will be displayed on the page and, if the detection of such errors happens on the server after the initial post, it’s easy to forget to display back the text the user entered in the first place. “Sorry, you forgot to enter a name so I’ve thrown your ten-line comment away” [#]
  • Does not work in Internet Explorer. There are many possible causes for it, such as respecting W3C specifications.
  • Legal Issues. If a malicious commenter uses your page as a soapbox for illegal activities, some countries will hold you responsible. For instance, in France, you can be condemned if anonymous posters engage in holocaust denial on your website.

That’s nine, just thinking about the obvious problems that would happen if following the simplest approach to this, and I have seen many of them happen in three situations: novice programmers (such as interns), freelancers and low-wage programmers. The worst offender is by far the code written in naive PHP, which has the peculiarity of “the simplest thing” being almost always “the incorrect thing” as well.

Still, if you can’t let an intern write a simple user comments page, what are you going to let interns do?

All of the above issues are easy to correct once you know about them. Always send data as POST, check the referrer, convert everything to UTF-8, validate your data, use prepared statements instead of inline SQL, respond with a 303 redirect to a GET page, include the posted data and any errors in the session and display them back in the form if present, take all your dynamically generated text through an HTML escaping function, add “type=submit” to buttons, and add a quick moderation tool to hide unwanted messages quickly.
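Two of these fixes, prepared statements and HTML escaping, can be sketched in a few lines. This is only an illustration in Python with an in-memory SQLite database; the table layout and the function names save_comment and render_comments are assumptions made for the example, not part of any real application.

```python
import html
import sqlite3


def save_comment(db, name, text):
    # Placeholders ("?") keep user input out of the SQL text entirely,
    # so quotes in the input cannot change the query.
    db.execute("INSERT INTO comments (name, body) VALUES (?, ?)", (name, text))


def render_comments(db):
    # Every user-supplied value goes through html.escape before output.
    rows = db.execute("SELECT name, body FROM comments ORDER BY id").fetchall()
    return "\n".join(
        "<p><b>{}</b>: {}</p>".format(html.escape(name), html.escape(body))
        for name, body in rows
    )


# Hypothetical schema, for illustration only.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE comments (id INTEGER PRIMARY KEY, name TEXT, body TEXT)")

# A hostile name and a hostile comment body are stored and shown as plain text.
save_comment(db, "Bobby'); DROP TABLE comments;--", "<script>alert(1)</script>")
page = render_comments(db)
print(page)
```

The hostile name ends up as an ordinary row, and the script tag reaches the page as inert text rather than as markup.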

Knowing about the issues and acting to prevent them is the hard part, which is why every project should have at least one experienced developer who knows about the errors, or use a framework that prevents such errors from happening in the first place (then again, if the documentation for Zend_Form has a “user refreshes page, double-posts by mistake” error, who can we trust?)

Although it has been taken over by marketing folks, there are still good things to be said about “best practices”. The basic idea is to have a set of practices available for the less experienced developers to follow. Such practices are usually very simple to understand and follow (never display data in a POST controller, never change the model significantly in a GET controller), reasonably simple to verify automatically (assert that no output happened as part of a POST controller response) and have the immediate effect of preventing a classic mistake (no re-post on a page refresh).
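As a rough sketch of that practice, here is a framework-free version in Python. Everything in it is hypothetical: the controllers return plain (status, headers, body) tuples, the session is a dictionary, and the names post_comment, get_comments and assert_post_has_no_output are invented for the example.

```python
comments = []  # stand-in for the real model (a database table, say)


def validate(form):
    # Collect every problem instead of stopping at the first one.
    errors = {}
    if not form.get("name", "").strip():
        errors["name"] = "Please enter a name."
    if not form.get("text", "").strip():
        errors["text"] = "Please enter a comment."
    return errors


def post_comment(form, session):
    """POST controller: changes state, never renders a page."""
    errors = validate(form)
    if errors:
        # Keep what the user typed so the form can show it again.
        session["draft"] = dict(form)
        session["errors"] = errors
    else:
        comments.append((form["name"], form["text"]))
        session.pop("draft", None)
        session.pop("errors", None)
    # Always answer with a redirect; refreshing the target page is harmless.
    return 303, {"Location": "/comments"}, ""


def get_comments(session):
    """GET controller: reads the model and the session, changes nothing."""
    body = {
        "comments": list(comments),
        "draft": session.get("draft", {}),
        "errors": session.get("errors", {}),
    }
    return 200, {}, body


def assert_post_has_no_output(response):
    # The automated check: a POST response is a redirect with an empty body.
    status, headers, body = response
    assert status == 303 and "Location" in headers and body == ""
```

A test suite could run assert_post_has_no_output on the response of every POST controller, which is the kind of cheap automatic verification described above.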

I’m a big proponent of enforcing good code through practices first, and then code-based contraptions if developers insist on ignoring them. The problem with going for the contraptions first is you have to explain how to use the contraptions anyway, and people will be tempted to work around the contraptions and still write bad code.

If your code is reviewed by a compiler or an automatic code analysis tool, you can learn how to game the system. This results in code that does not trigger the alarms, while still being bad. Compare with having your code reviewed by a live person, who is experienced and anal-retentive about respecting practices and makes it horribly clear that if you don’t follow them, you will be forced to follow them, in your free time, before you can commit your code. Such reviews leave no room for wiggling and, as long as the judgment of the reviewer is fair, will actually motivate the team to respect the standards.


[#] Viadeo actually did even worse things to me (“Sorry, I forgot to tell you that you were only allowed 255 characters in this box, so I’ve deleted everything for you so you can try again. Oh, and don’t try the back button of your browser, I have also deleted your input on the previous page.”) so I suspect it has been written by Java rookies with close oversight by non-technical management.

Team Naming

Names. We programmers see more names in a single session than a phone directory editor will see in their entire career, yet we prove worse at finding names than a fift

Naming is a two-way approach: the name must accurately convey what the thing is, and the name should be easily guessed for that thing. The two sides of the equation are not always of equal importance: guessing the name of a local variable is less useful than guessing the name of a class in a library.

Humans always use context to understand what names mean, in order to disambiguate the many possible meanings of a name. For instance, ‘window’ could refer to the ubiquitous user interface concept or it could refer to the glass-paned house building block. A sentence like “I open a window” needs a minimum level of context to disambiguate between the two interpretations.

On the other hand, the information must not be made redundant either. For instance, take a class named “OpponentTimer” defined within an “Opponent” namespace: it’s fairly obvious that the timer is related to an opponent both within the namespace (you’re dealing with opponents, so the timer should have something to do with it) and outside the namespace (as it’s being referred to as “Opponent.Timer” or something like that). The same goes for file paths, such as ‘/scripts/invaderScript.py’, which could just as well have been named ‘/scripts/invader.py’ with no loss of information due to the context.

This is what I used to think about this issue :

One thing I have noticed time and time again is that the vast majority of people I work with (or see on the internet, for that matter) are very bad at finding names. So bad, in fact, that I can usually propose better names within seconds of reading them for the first time. At least they agree that the new names are better.

The reason is, in retrospect, quite obvious : two brains are better than one, especially when it comes to looking at things in different contexts to determine if there are any ambiguities. These programmers must have been thinking the same thing when looking at my code.

By now, you should have noticed that “team naming” refers to “working as a team to name things” as opposed to “naming a team”—lack of context does tend to create such misunderstandings :)

So, that would be why pair programming with at least one -ansi -pedantic -Wall programmer in the team tends to create code that is much cleaner than one-programmer code written by either participant.

Short of acquiring some sort of split personality, there’s no easy way to achieve that alone : no matter how hard you try, your brain can only hold one context at a time. Some programmers might be able to switch contexts faster than others when they think about it, but you generally don’t switch contexts when naming a variable. Maybe we should?

Even then, noticing an ambiguity involves thinking about two contexts where the name has different meanings. Merely having two contexts in mind (or minds, when working as a team) doesn’t mean you actually found two incompatible contexts. You have to think about all the contexts in which the element can be used. The good news is, all of these are nested and you can reach them by removing information progressively from the innermost context that you have in mind. If your code was laid out correctly, these should match scopes, classes and namespaces/packages.

Expect the Unexpected

When looking at a function declaration, there are several levels of abstraction one can use to describe what that function does.

The actual action of that function is what really happens. This includes any bugs the function may contain and any undocumented behavior that is subject to change in later versions.

The documented action of the function is what the author of the function intended to do with that function. This includes a complete description of what the function should reasonably be expected to do, what conditions may trigger an error, and what external factors may affect the outcome.

The expected action of the function is what the user of the function expects the function to do. This is the action that matters most of the time, since there are often many users for every function.

In an ideal world, all three actions would be identical: the author implemented the function to do exactly what was documented and the documentation covers all behavior and explicitly marks all unspecified elements, the user has read the documentation and understands it completely.

In the real world, those actions are all different. The difference between the actual action and the documented action is either a bug (the function does not behave as documented) or the documentation being too vague and leaving things implicitly unspecified. The difference between the expected action and the documented action happens because the user has not read, or understood, all the nuances of the function’s behavior as described in the documentation.

Breaking the Mental Model

The classic example of the latter difference in understanding is the strtolower function:

When we convert the string “integer” to upper and lower case in the Turkish locale, we get some strange characters back:

"INTEGER".ToLower() = "ınteger"
"integer".ToUpper() = "İNTEGER"

The user is not aware that strtolower depends on the current locale, because their mental model of the strtolower function turns every uppercase letter of the occidental latin alphabet into its corresponding lowercase letter in that same alphabet. Of course, this is not what happens, and there is no way of “getting” this fact straight without thoroughly reading and remembering the entire documentation of the strtolower function.

The best we can do, as function authors, is to make it painfully obvious to users of that function when they misunderstand the function.

But, you say, the only way to detect most non-trivial function misuses is through complete testing, and it’s quite probable that the user will not think of the test cases that would break their mental model!

This is correct, and this is precisely why I said misunderstand and not misuse. Determining whether or not a function is used correctly is something that the user can do quite easily once they get a correct mental model of that function, so we’ll let them do exactly that. The point here is to make the function as hard to use as possible when you don’t understand it completely.

Consider the strtolower function. If you don’t understand that locale can affect the operation performed by that function, then you are going to get things wrong. A nice way to ensure you understand this is to make the locale a mandatory argument of the function. By telling the user “you need to specify a locale before using this function” you are breaking the mental model of any user that expected the function to be locale-independent, and that is a good thing.
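A minimal sketch of such a signature, written in Python for illustration. The function to_lower and its locale codes are hypothetical, and only the Turkish and Azerbaijani dotted/dotless i rule is handled, so this is nowhere near a complete locale-aware implementation.

```python
def to_lower(text, *, locale):
    """Lowercase `text` using the rules of `locale`, a required keyword argument."""
    if locale in ("tr", "az"):
        # In Turkish and Azerbaijani, dotless I/ı and dotted İ/i are
        # separate letters, so map them by hand before the generic step.
        text = text.replace("I", "ı").replace("İ", "i")
    return text.lower()


print(to_lower("INTEGER", locale="en"))  # integer
print(to_lower("INTEGER", locale="tr"))  # ınteger
```

Calling to_lower("INTEGER") with no locale raises a TypeError, so a user who believed the function was locale-independent finds out on the very first call instead of in production.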

Exceptional Situations

There is an interesting gradient of mental-model-breaking in the handling of exceptional situations:

Handling Method                             Always   When fails
No handling (ASM, C++ undefined behavior)   No       No
Return codes (C APIs)                       Weak     Weak
Exceptions                                  Weak     Strong
Java Exceptions                             Medium   Strong
Type System                                 Strong   N/A

Here, I’m discussing the ability of a given handling method to break an incorrect mental model in two situations: “always” means whenever the function is used, “when fails” means whenever the function is used incorrectly in a fashion that interrupts the normal course of execution.

When the function is used, the existence of exceptional situations is mentioned as weak (only in the documentation), medium (compiler error that is not very specific) or strong (specific, reliable compiler error). When a failure occurs, the result is weak (depends on user action) or strong (independent of user action).

As such, using the type system appears to be the strongest means of describing the existence of exceptional situations. How?

In a functional language, every function returns a result. There is no point in computing a result unless that result is used, which means every function result is used somewhere in the code. As such, having functions that may encounter errors return an “Error or Success” type forces the user of the function to handle the possibility of an error before they get the result.

This is precisely how Objective Caml avoids the very possibility of a “null reference” runtime error : the option type has to be explicitly turned into a value, which means that pattern matching must be used and therefore the null case has to be handled as well:

let frobnicate option =
  match option with
    | Some value -> work_with value
    | None -> work_without_value ()
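
The same “Error or Success” idea can be imitated in a language without that kind of type system, though only as a convention. The sketch below is in Python; the Result class, its handle method and the parse_age function are all invented for the example, and nothing stops a determined user from reading the private fields, so the guarantee is much weaker than what a compiler gives.

```python
class Result:
    """Holds either a value or an error; the only way in is handle()."""

    def __init__(self, ok, payload):
        self._ok = ok
        self._payload = payload

    @classmethod
    def success(cls, value):
        return cls(True, value)

    @classmethod
    def failure(cls, error):
        return cls(False, error)

    def handle(self, on_success, on_error):
        # Both handlers are mandatory, so the caller has to decide
        # what happens in the error case before getting a value.
        if self._ok:
            return on_success(self._payload)
        return on_error(self._payload)


def parse_age(text):
    # Hypothetical function that can fail in two different ways.
    try:
        age = int(text)
    except ValueError:
        return Result.failure("not a number: {!r}".format(text))
    if age < 0:
        return Result.failure("negative age")
    return Result.success(age)


def describe(text):
    return parse_age(text).handle(
        on_success=lambda age: "age is {}".format(age),
        on_error=lambda err: "invalid ({})".format(err),
    )


print(describe("42"))
print(describe("abc"))
```

Forgetting either handler is a TypeError at the call site, which is the closest a dynamic language gets to the “always” column of the table.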

Dealing with Programmers

The problem is that programmers are humans and humans are lazy. Nobody wants to spend additional time designing the type of a function just to prevent misunderstanding of that function (unless it’s an API, of course) and nobody wants to have to type an additional argument to a function.

In fact, the entire convention over configuration philosophy relies on the idea that programmers should have to make as few decisions as possible. But adding default values for every argument is dangerous if programmers are not aware that those arguments exist—choosing a sane default value implies that such a value exists and is the one most programmers have in their own limited mental models for that behavior.

And if no consensus exists, using a default value is impossible: a programmer would expect strtolower to work in the current locale by default, while another would expect strtolower to work in an invariant locale by default. Choosing a default locale means that one of these two programmers is wrong and leads to bugs. It certainly is the programmer’s fault for not reading the documentation properly, but one could argue that a successful library is one that produces great results even in the hands of less competent programmers.