Monthly Archive for February, 2010

Reusable CSS

Woe unto CSS, for it provides no refactoring-friendly tools! The CSS beast has neither functions nor variables, and its definition of inheritance is perverted beyond words. Pain and suffering await those who hope to keep their CSS from one project to the next, or even share the CSS between pages on a single website!

Consider two simple pages: the home page has a small navigation bar (selected by #navig) at the top of the screen, while the catalog page has a larger navigation bar (still selected by #navig). Each page includes a different layout.css stylesheet, so everything’s fine. Except that now, anything defined in one layout has to be copied over by hand to the other layouts if you want to reuse it. Ouch.

Does that example sound extreme? It certainly is! But the danger of page-specific stylesheets remains: if you won’t be stepping on your own toes with something as trivial as #navig, perhaps .book will mean two different things on two different pages?

Rule Zero : Keep all your CSS Together

This might seem a bit harsh, especially if you have truckloads of CSS floating around and don’t want to slow down the initial loading time of your page, or to spend time resolving collisions. However,

  • This rule will make it easier to factor out common bits of CSS, leading to an overall smaller set of stylesheets.
  • The number of HTTP requests matters as much as the bandwidth, so delivering all your CSS as a single, minified, gzipped blob is often a good performance idea.
  • The entire point is to make it harder to create page-specific rules, so that you don’t make a rule page-specific by mistake, and strive to make most of your rules page-independent.

I usually place all of my CSS in correctly named files in a directory on my server, then have the server generate a single, all.css master file that @imports the other stylesheets by path. This means Firebug’s CSS browser will correctly identify the source file for any given rule. When the code moves to a production server, the auto-generated master file becomes a pre-generated/minified/gzipped resource, and can even be moved to a CDN for improved performance.
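The post does not show how the master file is generated. As a minimal sketch of the idea, assuming a hypothetical buildMasterCss helper and made-up stylesheet paths, the generated all.css is nothing more than a list of @import rules:

```javascript
// Hypothetical helper: turns a list of stylesheet paths into the text of
// an all.css master file made only of @import rules.
function buildMasterCss(paths) {
  return paths
    .map(function (path) {
      return '@import url("' + path + '");';
    })
    .join('\n') + '\n';
}

// Example paths (assumptions, not taken from the post).
var master = buildMasterCss(['/css/navig.css', '/css/userList.css']);
console.log(master);
```

For production, the same list of paths would instead feed whatever concatenation and minification step you use.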

On the other hand, keeping all your code in one place will only help you see collisions; it will not actually help you solve them.

Fortunately, we can look to other languages for tips and tricks on how to make code easier to reuse. The fundamental observation is that you cannot use something if you don’t give it a name. One would expect CSS identifiers and classes to serve the same function, and indeed it does work in simple cases:

a.important { font-weight: bold }

Now, you have the important «function», that you «call» on an anchor element to make it appear important. Bam! Instant reusability. Using an identifier instead of a class still allows reuse on distinct pages, but restricts reuse within a single page.

Rule One : Document your «Functions»

You cannot reuse code if you cannot find it, and even if you don’t forget about it, someone else on the team might be completely unaware that it even exists. So you should somehow document that the important class exists. My personal, PHP-friendly preference is to have a “Css” class with all those nice classes available:

class Css
{
  /* a.important : make a link important */
  const IMPORTANT = "important";
}

Then, you can reuse them when you see fit to do so:

Click <a href="<?=$url?>" class="<?=Css::IMPORTANT?>">here</a>

That’s just personal preference—any way of documenting your CSS classes is fine as long as it’s somewhere everyone can see it. In fact, I have a nice set of PHP helpers lying around to bind jQuery UI CSS effects to my code, thereby documenting what jQuery UI can do without having to dive into the stylesheets every single time.

The real problem appears when you have more than one «argument». A typical example is the list of links with a “selected” link: the graphical effect applies to the list, to the elements of that list, and to the content of those elements, which leads to several rules selecting different elements.

ul#navig { margin: 0 ; padding: 0}
ul#navig li { list-style-type: none }
ul#navig li.selected a { font-weight: bold ; color: black }

This kind of structure cannot be documented simply by stating that the ul#navig element is going to become a pretty list, because without the li.selected in there there will be no «pretty» worth mentioning.

I document this as follows:

/*
  <ul id="navig">
    <li><a>Item</a></li>
    <li class="selected"><a>Item</a></li>
    <li><a>Item</a></li>
  </ul>
*/
ul#navig { margin: 0 ; padding: 0}
ul#navig li { list-style-type: none }
ul#navig li.selected a { font-weight: bold ; color: black }

Why not document it in the PHP code, then? IMO, a CSS designer can be expected to write a quick const FOO = "bar"; line in PHP, but not an HTML helper that turns an array of links into pretty list HTML. CSS designers write the CSS (with documented HTML) and PHP developers turn that into HTML helpers.

</acronym soup>

Another important element of code reuse is the notion of encapsulation, and in particular the existence of “private data” that is part of the program, but can only be accessed by some parts.

There is no such thing with CSS. There are two reasons for this. The main reason is that being sloppy with selectors is commonplace:

/*
  <div id="userList">
    <ul class="users">
      ...
    </ul>
    <a>New</a> |
    <a>Edit</a> |
    <a>Delete</a>
  </div>
*/
#userList a { color: #FF9900 ; text-decoration: none }
#userList a:hover { text-decoration: underline }

The three links in the user list component («new», «edit» and «delete») will appear in orange without underlining, as expected and documented. The unexpected and non-documented consequence of this code is that all links within the list of users will be orange without underlining as well.

Rule Two : Only Select what you Need to Select

The typical consequence of sloppy selectors is that «insert component A into component B» operations utterly destroy the formatting of component A. The typical designer reaction to such graphicalypse is «Darn, component B destroyed some property of component A, so let’s add some rules to component A to reverse the damage!»

Bad idea. It makes the code longer, and only hides the actual problem (along with any symptoms that only appear in specific cases). The real solution is to make sure selectors only select what they need to select.

One way of doing so is to use the «>» child combinator, as it restricts the selection to direct children of the initially selected element. This would work:

#userList > a { color: #FF9900 ; text-decoration: none }
#userList > a:hover { text-decoration: underline }

Of course, it wouldn’t work in IE6, but who cares about IE6 anymore?

The general approach is to use specific classes for those elements that must be affected:

/*
  <div id="userList">
    <ul>
      ...
    </ul>
    <a class="userList-link">New</a> |
    <a class="userList-link">Edit</a> |
    <a class="userList-link">Delete</a>
  </div>
*/
a.userList-link { color: #FF9900 ; text-decoration: none }
a.userList-link:hover { text-decoration: underline }

If anyone uses that userList-link class in their code (and your naming conventions were clean enough), they had it coming.

Rule Three : Choose Proper Naming Conventions

It is quite important to remain consistent in your naming practices, especially since you now need to identify, for any given identifier and/or class:

  • If it represents a «function» (#userList), or if it helps select a specific «argument» (.userList-link).
  • In the latter situation, what function the argument corresponds to (so that you can look for its definition).

My preference is to use camelCase names (classes or identifiers) for functions, and camelCase-camelCase names for arguments, where the first half is the name of the function. The CSS would then be gathered in a camelCase.css stylesheet named after the function, with a documentation of the expected HTML at the top, hence making it much easier to find and reuse.
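To make the convention concrete, here is a small sketch that takes a class or identifier name and works out which «function» it belongs to and which stylesheet should define it. The describeCssName helper is hypothetical (it is not part of any library or of my own code base); it only illustrates that the convention is mechanical enough to be checked by a script.

```javascript
// Hypothetical helper: splits a class or id name that follows the
// camelCase / camelCase-camelCase convention described above.
function describeCssName(name) {
  var match = /^([a-z][A-Za-z0-9]*)(?:-([a-z][A-Za-z0-9]*))?$/.exec(name);
  if (!match) return null;          // the name does not follow the convention
  return {
    func: match[1],                 // the «function», e.g. "userList"
    argument: match[2] || null,     // the «argument», e.g. "link", if any
    stylesheet: match[1] + '.css'   // where the definition should live
  };
}

var info = describeCssName('userList-link');
console.log(info.func, info.argument, info.stylesheet);
```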

Now that you have access to functions, you will probably want to use them to implement reusable «components» — standalone pieces of HTML and CSS that represent atoms of information.

At some point, you will have to make components interact (if only to respect each other on the page layout). All of this will be hell if component A uses normal block layout rules, component B is floating to the left and component C is positioned absolutely.

Rule Four : A Component Should only Care about its Inner Layout

As soon as a component starts to care about outer layout concepts such as margin, position, floating or clearing, you will be in a world of pain. This is because such concepts depend on where the component appears, and as such are not easy to reuse.

I split my CSS code into components and bones:

  • Components. These are reusable atoms. They do not care about their outer layout at all, so they never specify anything like margin, position, floating, clearing, display mode or anything that might cause them to interact differently with their surroundings on the page.

    They may specify a width and height if they wish, but it is discouraged (a component that can adapt to any geometry is easier to use). They can specify anything they want in terms of border, padding, font, color, background, and any inner properties they need.

  • Bones. These are elements found inside the components that handle the layout of the component contents themselves. They can and should make appropriate assumptions about what bones can be found within a component and how they should interact to result in the layout you need to see.

A nice finishing touch is to make the component overflow : hidden, because the last thing you need is a component’s skeleton sticking out from its skin and interacting with other elements.

I repeat: never allow the contents of a component to stick out of that component!

In particular, if you have a component with floating elements inside, make sure you add a clearer element at the bottom of the component to have it resize with its contents.

In practice, I assume every function argument to be a bone, and every function to be a component. The situations where a function acts as a bone are so rare, and the results so difficult to reuse (so you’ve added a float:left to an element, where are you going to put it?), that I don’t really take them into account. The Component-Bone approach tends to solve almost everything elegantly, as long as you’re clever about where a component begins and a bone ends.

For instance, if you’re laying out a list of comments for a blog, you are probably going to have a «comment list» component with «comment» bones that are laid out on top of one another with appropriate margins, borders and paddings. The contents of every «comment» bone will be a «comment» component, with bones representing the picture, name, date and comment body, laid out cleanly within that component.

Whether the .commentList-comment is placed on the same element as .comment is something you can decide for yourself. What is essential is that, in order for the comment style to be reusable independently of the comment list style, all outer layout information should be in .commentList-comment, not in .comment.
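Rule Four can also be checked mechanically. The sketch below is a hypothetical lint, not a tool I actually use: it assumes the rules have already been parsed into { selector, declarations } objects, and it treats any selector that ends in a hyphen-free class or identifier as a component (following the naming convention from Rule Three).

```javascript
// Properties treated as "outer layout": a component should not set them.
var OUTER_LAYOUT = ['margin', 'position', 'float', 'clear', 'display'];

// A selector targets a component when its last part is a plain camelCase
// class or id; hyphenated names such as .commentList-comment are bones.
function isComponentSelector(selector) {
  var last = selector.trim().split(/\s+/).pop();
  return /^[a-z0-9]*[.#][A-Za-z][A-Za-z0-9]*$/.test(last);
}

// Returns one message per outer-layout property set on a component.
function outerLayoutViolations(rules) {
  var found = [];
  rules.forEach(function (rule) {
    if (!isComponentSelector(rule.selector)) return;
    Object.keys(rule.declarations).forEach(function (prop) {
      var base = prop.split('-')[0]; // "margin-top" -> "margin"
      if (OUTER_LAYOUT.indexOf(base) !== -1) {
        found.push(rule.selector + ' sets ' + prop);
      }
    });
  });
  return found;
}

var violations = outerLayoutViolations([
  { selector: '.comment', declarations: { 'padding': '4px', 'margin-top': '8px' } },
  { selector: '.commentList-comment', declarations: { 'margin': '8px 0', 'float': 'left' } }
]);
console.log(violations);
```

Only the .comment rule is reported: the margin belongs on .commentList-comment, which is a bone of the comment list and is allowed to care about outer layout.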

Good.

Now, before I finish, do you remember when I said earlier that component B could be mangled by component A for two different reasons? The second reason happens to be inheritance. Everyone knows inheritance is bad for reuse. Right?

What happens is that, if you define a font size, color or family in a given element, then all descendants of that element will get the same font size, color and family (unless some CSS rule changes them). That’s inheritance: the value of the property in the child element is inherited from the parent element.

Rule Five : Only Change Inheritable Properties on your own Content

It’s impossible to define the entire list of inheritable properties at the root of every single component in your web site, however convenient it may be. Keeping everything in sync is very difficult, if not impossible. It is far easier, by comparison, to restrict such changes to only those areas of a component where the content is closely controlled and guaranteed not to contain any other components.

I believe there are basically three kinds of areas in any given page that are actually worth being paid attention to. These are:

  • Layout areas. These are those component-in-component-in-component places where touching an inheritable property can get you killed (or at least annoyed).
  • Text areas. Those contain no components, but they might still contain paragraphs, links, headings, images in a typical «rich text editor» fashion. If you change one property (such as the color of text), be ready to change all the related properties (the color of links) to keep a consistent appearance.
  • Line areas. These contain a short bit of text without any other tags. You don’t have to worry about changing properties here.

Every component should document, for every piece of content that should be filled from outside the component, whether it is a layout, text or line area. For instance:

/*
  <div class="comment">
    <span class="comment-author">...</span>
    <div class="comment-contents"><p>...</p></div>
    <div class="comment-reply">
      ...
    </div>
  </div>
*/

Here, a span (can only contain inline elements or text) represents a line area, a div-with-paragraph represents a text area (may contain several paragraphs, of course) and a normal div represents a layout area. This tells me, for instance, «don’t even think about putting a component in the comment contents, or I’ll clobber their stylesheet beyond recognition.»

Depending on the kind of web site you are building, other kinds of areas may be useful to you, such as forms.

JavaScript (Un)Maintenance Trick

You’re hunting your codebase for bugs, and doing some refactoring and cleanup along the way. You stumble across a classic WTF, if (x == true), and decide to replace it with the shorter if (x).

The trouble is that you’re playing with JavaScript here, and the two are not the same.

This is the story of why [] == ![] is true.

First, the logical-not operator «!» is defined quite simply by the ECMA standard (§11.4.9) as evaluating its argument, converting it to a boolean, and returning true if it was false, and false otherwise. Converting the argument to a boolean is defined (§9.2) as returning true for objects. Since arrays are objects, it makes sense that ![] evaluates to false.

Second, the comparison operator «==» is defined by the standard (§11.9.3) in a rather complex way: first, if its two operands are not of the same type, some type conversion occurs. The first step is that, if an operand is a boolean, it is turned into a number. So, ![] becomes 0.

The next step is, if one operand is a number and the other is an object, to turn the object into a primitive. This conversion is defined (§9.1) as calling the [[DefaultValue]] internal method, which in turn is defined (§8.12.8) as calling methods valueOf() and toString() until one of them returns a primitive value. In the case of an array, the former returns the array itself (§15.2.4.4) and the latter calls join() (§15.4.4.2), which will concatenate all values inside the array, separated by commas (§15.4.4.5).

In the case of an empty array, this yields the empty string.

The third and final step is, if an operand is a number and the other is a string, to turn the string into a number (through a lengthy process described in §9.3.1) and compare the two. An empty string becomes zero, so the comparison is true.
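The steps above can be replayed by hand, using explicit conversions in place of the implicit ones. This is only an illustration of the chain described here; the variable names are mine.

```javascript
// Step 1: ![] converts the array (an object, hence truthy) to a boolean
// and negates it.
var right = ![];                              // false

// Step 2: == turns the boolean operand into a number.
var rightAsNumber = Number(right);            // 0

// Step 3: the array is turned into a primitive. valueOf() returns the
// array itself, so toString() is used, which calls join().
var empty = [];
var valueOfReturnsSelf = empty.valueOf() === empty;
var leftAsPrimitive = empty.toString();       // ""

// Step 4: the string is turned into a number, and the numbers are compared.
var leftAsNumber = Number(leftAsPrimitive);   // 0

console.log(leftAsNumber === rightAsNumber);  // true
console.log([] == ![]);                       // true
```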

Does your brain hurt, yet?

Back to the original question: if([] == true) does not run, but if([]) does.
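The empty array is not the only value for which the refactoring changes behavior. A quick sketch (the disagrees helper is just a name I made up for this example) shows which values take a different branch before and after the change:

```javascript
// Returns true when `if (x == true)` and `if (x)` take different branches.
function disagrees(x) {
  var before = (x == true); // the original test
  var after = !!x;          // what `if (x)` actually checks
  return before !== after;
}

var samples = [[], [0], [1], 'a', '1', 2, {}, 0, '', null, undefined];
var changed = samples.filter(disagrees);

// Every value in `changed` is truthy, yet not loosely equal to true.
console.log(changed.length + ' of ' + samples.length + ' samples change behavior');
```

Among the samples, the refactoring is only safe for [1], '1', and the falsy values; everything else that is truthy but does not convert to the number 1 starts taking the branch it used to skip.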

How to build a page client-side?

The basic philosophy of jQuery is to start with some existing HTML sent over vanilla HTTP by the server. That HTML should be all you need (so that people without a JavaScript-enabled browser can still use the web site). Then, jQuery enhances that HTML by adding new behavior (usually changing the properties of existing elements, sometimes adding new elements).

This is very useful for small pieces of behavior, but writing complete and complex components is hard for several reasons:

  • A partial view strategy is required on the server side to insert the appropriate HTML in the appropriate location (as opposed to leaving an empty hole and having the component generate its own HTML).
  • If the behavior of your component is complex, then there will be a lot of parsing going on. A typical example would be sorting a table by a “date” column: the displayed date cannot be reliably parsed, since its format is culture-dependent and may contain “Yesterday”, “13 seconds ago” and similar shortcuts.
  • Sometimes, the server needs to add information that is not visible, but is needed by the JavaScript. The format for sending this data (attribute, hidden field…) is difficult to document and type-check.
  • Selecting precisely the right fields in a blob of HTML, without hitting any others, is hard, especially for components that may later contain sub-components. Class-based selection is slow, id-based selection involves heavy logistics to move the identifiers around, and complete traversal takes a while and breaks if the HTML changes.

My preferred approach to JavaScript components is to receive JSON-formatted data from the server (easy to parse) from which I construct the DOM elements I need and capture them at the same time.

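// Builds the DOM for one comment and returns handles to its parts.
function renderComment(data)
{
  // Explicit closing tags: newer jQuery versions no longer expand
  // self-closing <span/> or <div/> inside an HTML string.
  var $comment = $('<div><img/><span></span><div></div></div>')
    .addClass("comment");

  var obj =
  {
    $self : $comment,

    $img  : $comment.children('img')
            .attr('src',data.imgUrl),

    $name : $comment.children('span')
            .text(data.authorName)
            .addClass('authorName'),

    $body : $comment.children('div')
  };

  // data.text is a list of paragraphs
  $.each(data.text,function(k,t){
    $('<p/>').text(t).appendTo(obj.$body);
  });

  return obj;
}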

The point is that you then have access, through the returned object, to all the relevant elements within the comment, so that you may target them with effects without any risky selector-based magic. Besides, if the HTML format of comments changes, you will only have to change the code above and nothing else.

And of course, using text() escapes any dangerous HTML you might have.

To make the above appear in your code, all you have to do is:

var $commentsList = $('#my-comments-list');
var $comments = []; // rendered comment objects, by index

$.each (comments, function(i,c){
  var obj = $comments[i] = renderComment(c);
  obj.$self.appendTo($commentsList);
});

This is usually where you hit a performance wall, because this is one of the slowest ways of using jQuery on a web page.

I’ve been in this situation recently on a smallish website that basically displays a list of contacts invited to various events as a 10-column/300-row table that includes additional functionality such as:

  • Dynamically add or remove new rows (with server-side confirms)
  • Rows are grouped together, and groups can be collapsed and expanded
  • Clicking on rows opens a modal editor, modifications are propagated back to the table
  • The data and formatting for certain rows depend on some other rows

The initial approach was exactly as described above: every cell was constructed as $('<td/>'), classes and attributes were applied to it, then all cells were inserted into rows constructed as $('<tr/>'), and these in turn were appended to the table tbody. Since some parts of the table were clickable to achieve various effects, jQuery’s click() function was used to add the appropriate event handlers, and the event handlers were closures that contained all relevant information about what row had to be collapsed or what element had to be removed.

The average time for rendering all of this was a solid 2200ms on Firefox 3.5, which felt about as dynamic as a dead tortoise nailed to a slab of concrete. For comparison purposes, rendering the data server-side and sending it to the client took about 390ms on average (arguably, the server would have scaling issues as it would have to render the HTML for all clients, but still).

2200ms means about 7ms per row. The problem here isn’t that the jQuery code is slow, but rather that it’s executed so many times that it adds up to a pretty large number.

My first attempt to improve performance was to avoid constructing rows cell by cell, instead building the final HTML of the row in one shot and then selecting clickable elements inside the row through their class to apply event handlers. Rows were then inserted into the table body using jQuery’s DOM functions. The new rendering time was 1800ms, which was not as much of an improvement as I had hoped for.

The second step was to move away from selecting clickable elements to apply event handlers. This meant that I could either insert the event handler code in the HTML (but this meant no closures, so I would have to rely on global, non-garbage-collected behavior) or add a click event to the entire table and determine what element had been clicked (and parse the DOM for information about what to do with the click, which was annoying).
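For reference, the second option (a single click handler on the whole table) could look something like the sketch below. It is not the code I ended up using, and the data-row-id attribute, the contacts table id and the findRowId helper are all assumptions made for the example.

```javascript
// Hypothetical helper: walks up from the clicked node to the table and
// returns the first data-row-id attribute found on the way, or null.
function findRowId(target, table) {
  for (var node = target; node && node !== table; node = node.parentNode) {
    var id = node.getAttribute ? node.getAttribute('data-row-id') : null;
    if (id !== null) return id;
  }
  return null;
}

// Browser-only wiring: one handler for the table instead of one per row.
if (typeof document !== 'undefined') {
  var table = document.getElementById('contacts'); // assumed table id
  if (table) {
    table.addEventListener('click', function (event) {
      var id = findRowId(event.target, table);
      if (id !== null) frobnicate(id); // frobnicate() stands for the real action
    });
  }
}
```

The «parsing the DOM» part is the walk in findRowId: everything the handler needs to know has to be readable from attributes, since there is no closure to carry it.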

I went with the first way, rewriting my code as global handlers and eliminating all the select-child-with-class overhead. Rows were still constructed independently and inserted independently. The improvement was noticeable, as the rendering time was then 980ms.

The last wave of optimizations consisted of making sure the HTML for the entire table body was generated in one shot and concatenated as an array (using [a,b,c].join('') instead of a+b+c). This creates a 5223-element array, joined into a string containing 72357 characters, which is then inserted into the table body using jQuery’s html() function. The entire process, including preliminary processing of the data to be displayed, takes about 160ms (a 13.7× performance increase).

The change was mostly moving from this design pattern:

function renderRow(data)
{
  var $tr = $('<tr/>'); // var keeps $tr from leaking into the global scope

  $('<td/>')
    .addClass('name')
    .append($('<a/>')
      .text(data.name)
      .click(function(){ frobnicate(data.id); }))
    .appendTo($tr);

  // ...

  return $tr;
}

To this one:

function renderRow(data,html)
{
  html.push(
    '<tr><td>',
    '<a href="javascript:frobnicate(',
    data.id,
    ')">',
    esc(data.name),
    '</a></td>',
    // ...
    '</tr>'
  );
}
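The esc() helper and the code that drives renderRow are not shown above. A minimal sketch, assuming esc is a plain HTML-escaping function, that rows is an array of { id, name } objects with numeric ids, and that the table has a (made-up) contacts id:

```javascript
// Assumed implementation of esc(): escapes the characters that are
// significant in HTML text and attribute values.
function esc(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// Same shape as the renderRow above, reduced to a single cell.
// data.id is assumed to be a number, so it is inserted as-is.
function renderRow(data, html) {
  html.push(
    '<tr><td>',
    '<a href="javascript:frobnicate(', data.id, ')">',
    esc(data.name),
    '</a></td></tr>'
  );
}

// Builds the whole table body as one string: one array, one join.
function renderBody(rows) {
  var html = [];
  for (var i = 0; i < rows.length; i++) {
    renderRow(rows[i], html);
  }
  return html.join('');
}

var body = renderBody([{ id: 1, name: 'A & B' }, { id: 2, name: '<b>' }]);
// In the browser, with jQuery loaded: $('#contacts tbody').html(body);
```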

Again, this is an extreme situation where page generation gets way out of hand because a lot of rows are generated; the net benefit, as far as rendering a single row is concerned, is around 6ms. If your page contains only a small number of complex components, you can ignore the performance issues to get the components done, and only optimize if the slowdown turns out to be noticeable.


