Archive for the 'Functional' Category

PHP 5.3 Closures as Block Literals?

I explained earlier a few things about writing reusable CSS code, and how it interacted with PHP. Let’s start with this basic HTML for generating two columns, with the right one being flexible and resizing to fill all available space:

<div class="col2">
  <div class="col2-l">
    [Content of left column]
  </div>
  <div class="col2-r">
    <div class="col2-ri">
      [Content of right column]
    </div>
  </div>
  <div class="clearer"></div>
</div>

.col2    { }
.col2-l  { float: left ; padding: 0 ; margin: 0 ; width: 120px }
.col2-r  { padding: 0 0 0 120px ; margin: 0 ; width:auto }
.col2-ri { float: left ; width : 100% }
.clearer { clear: both }

Elementary PHP

How does this translate to PHP? Basically as a series of constants (plus documentation detailing what the column HTML should look like):

class CSS_Col2
{
  const ROOT = "col2";
  const LEFT = "col2-l";
  const RIGHT = "col2-r";
  const RIGHT_INNER = "col2-ri";
}

This serves both as documentation for the existence of this component, and as an entry in the auto-completion tool to avoid typing incorrect classes by mistake. However, you still have to get the actual code right:

<div class="<?=CSS_Col2::ROOT?>">
  <div class="<?=CSS_Col2::LEFT?>">
    [20 lines of left column here]
  </div>
  <div class="<?=CSS_Col2::RIGHT?>">
    <div class="<?=CSS_Col2::RIGHT_INNER?>">
      [40 lines of right column here]
    </div>
  </div>
  <?=CSS::CLEARER?>
</div>

Did I write everything correctly? Did I forget or misplace a clearer? Forgetting about the inner container in the right column is an easy mistake, and you won’t notice it until you put a clearing element in that column. And if your script is long enough, you won’t be able to see which opening tag matches which closing tag. Surely there must be a way to improve this.

Using HTML constants

A possibility is using constants to contain the relevant HTML:

class CSS_Col2
{
  // as above ...
  const _BEGIN_LEFT = '<div class="col2"><div class="col2-l">';
  const _BEGIN_RIGHT = '</div><div class="col2-r"><div class="col2-ri">';
  const _END = '</div></div><div class="clearer"></div>';
}

This makes code shorter, and you can’t mismatch or misplace tags as easily:

<?=CSS_Col2::_BEGIN_LEFT?>
  [20 lines of left column here]
<?=CSS_Col2::_BEGIN_RIGHT?>
  [40 lines of right column here]
<?=CSS_Col2::_END?>

However, all the benefits of a nice and clean HTML editor are lost, because the editor sees HTML constants as opaque strings rather than markup, so no validation is performed. At least Eclipse could detect mismatched opening and closing tags in raw HTML. Now, if you forget to “_END” your columns, your life is pain.

Using helpers

A common technique is to use a helper function for such rendering tasks. The function accepts some arguments that let it configure what ought to be displayed, then renders the wrapper HTML and inserts the data. Staying within the previous code:

class View_Helper_Col2
{
  static function Render($left, $right)
  {
    ?>
<div class="<?=CSS_Col2::ROOT?>">
  <div class="<?=CSS_Col2::LEFT?>">
    <?php call_user_func($left); ?>
  </div>
  <div class="<?=CSS_Col2::RIGHT?>">
    <div class="<?=CSS_Col2::RIGHT_INNER?>">
      <?php call_user_func($right); ?>
    </div>
  </div>
  <?=CSS::CLEARER?>
</div><?
  }
}

I used callbacks here to do the rendering, because they are the most versatile (it sure beats having to instantiate a “renderable” class for each column). This approach provides the obvious benefit that now the entire rendering is taken care of by a single function, so there is no risk of forgetting or misplacing a tag, and the auto-completion tool can now help check which arguments are provided and in what order.

Still, this means that one should create two functions to render the two columns, and that any necessary data should be made available to them (due to the absence of closures in PHP < 5.3, this often means calling a member function of a view object containing the appropriate data). In the Zend Framework, for instance, one would just write two helpers, and provide them as callbacks knowing that they will have access to the data of the current view:

<?php
  $this->render2col(
    array($this,'myLeftCol'),
    array($this,'myRightCol')
  );
?>

Of course, it’s questionable whether moving a three-line for-each loop to a helper of its own actually increases the readability of the code. If you define a new class for every view, you can define the columns as member functions within that same class, but it’s still somewhat awkward.

Helpers and Closures

PHP 5.3 introduces anonymous functions and closures. This means that one can now write the behavior inline:

<?php
 $self = $this;
 $this->render2col(
   function() use($self)
   {
     ?><h1><?=esc($self->user->name)?></h1><?php
   },
   function() use($self)
   {
     ?><ul><?php foreach ($self->items as $item): ?>
       <li><?php $self->render($item); ?></li>
     <?php endforeach; ?></ul><?php
   }
 );

However, making those functions inline creates a new issue: it’s not so obvious anymore what exactly each function is doing, because it’s too far away from the original call to the helper function. This can be solved by using a command pattern, while simultaneously noticing that one can get rid of the use keyword by having the helper provide the view as an argument:

<?php
 View_Col2::start($this)

 ->left(function($view){
   ?><h1><?=esc($view->user->name)?></h1><?php
 })

 ->right(function($view){
   ?><ul><?php foreach ($view->items as $item): ?>
     <li><?php $view->render($item); ?></li>
   <?php endforeach; ?></ul><?php
 })

 ->render();

Labels are now clearly mentioned, allowing empty lines to be inserted to separate the columns without forgetting what they are, so that the code looks cleaner overall.

JavaScript (Un)Maintenance Trick

You’re hunting your codebase for bugs, and doing some refactoring and cleanup along the way. You stumble across a classic WTF, if (x == true), and decide to replace it with the shorter if (x).

The trouble is that you’re playing with JavaScript here, and the two are not the same.

This is the story of why [] == ![] is true.

First, the logical-not operator «!» is defined quite simply by the ECMA standard (§11.4.9) as evaluating its argument, converting it to a boolean, and returning true if it was false, and false otherwise. Converting the argument to a boolean is defined (§9.2) as returning true for objects. Since arrays are objects, it makes sense that ![] evaluates to false.

Second, the comparison operator «==» is defined by the standard (§11.9.3) in a rather complex way: first, if its two operands are not of the same type, some type conversion occurs. The first step is that, if an operand is a boolean, it is turned into a number. So, ![] becomes 0.

The next step is, if one operand is a number and the other is an object, to turn the object into a primitive. This conversion is defined (§9.1) as calling the [[DefaultValue]] internal method, which in turn is defined (§8.12.8) as calling methods valueOf() and toString() until one of them returns a primitive value. In the case of an array, the former returns the array itself (§15.2.4.4) and the latter calls join() (§15.4.4.2), which will concatenate all values inside the array, separated by commas (§15.4.4.5).

In the case of an empty array, this yields the empty string.

The third and final step is, if an operand is a number and the other is a string, to turn the string into a number (through a lengthy process described in §9.3.1) and compare the two. An empty string becomes zero, so the comparison is true.

Does your brain hurt, yet?

Back to the original question: if([] == true) does not run, but if([]) does.
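The chain of conversions can be replayed by hand in a console. This is only a sketch: the variable names are mine, and each line calls the explicit conversion that the standard performs implicitly.

```javascript
// Each line mirrors one step of the conversion described above.
var negated = ![];                         // arrays are objects, so ![] is false
var asNumber = Number(negated);            // the boolean operand becomes 0
var asPrimitive = [].toString();           // valueOf() returns the array itself, so toString() is used
var asStringNumber = Number(asPrimitive);  // the empty string becomes 0

console.log(negated, asNumber, JSON.stringify(asPrimitive), asStringNumber);
console.log([] == ![]);   // true: 0 == 0 after all conversions
console.log([] == true);  // false: [] becomes 0, but true becomes 1

var ran = false;
if ([]) { ran = true; }   // runs: every object converts to true
console.log(ran);
```

The last two checks are the refactoring trap: replacing if (x == true) with if (x) changes behavior whenever x is an array or another object.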

DOM removal and events

Let’s try something… go to a page with jQuery enabled (such as this one), and run the following code in your Javascript debugger console (such as Firebug):

var button =
  $('<button>Click me</button>')
  .click(function(){alert('Clicked!')})
  .appendTo('body')

In case you were wondering, this creates a brand new button, causes it to display a “Clicked!” message box when it’s clicked, and appends it to the document you are viewing.

Click on the button that just appeared : the message box appears. Not very surprising.

Now, run the following code on the same page :

$('body').html('');
button.appendTo('body')

As expected, everything on the page, including the button, disappears. However, the button is still referenced by the button variable, so it sticks around and we can append it back to the document. And indeed, it does appear on the page.

Click on the button again. This time, no message box appears.

I honestly have no idea why.

Interest(ing) rates

The most common way of investing money is putting it in a savings account. You lend a fixed amount of money to someone, and they pay interest over that money at a predetermined rate. Let’s say you lend 1,000 € at an interest rate of 3%, paid every year: at the end of the year, you would receive 30 € as payment for your lending. You would spend these on fine wine or nice clothes and wait until the next year to get another 30 €, and so on.

Savings accounts work on the basis of simple interest: what you get paid is a linear function of both time and money. Lend for half a year? 3% ÷ 2 = 1.5%. Lend for two years? 3% × 2 = 6%.

An important thing to bear in mind is that interest is paid at fixed intervals, for instance at the beginning of January. You don’t have to spend those 30 €: you can leave them on the savings account and earn simple interest on them after a year (3% of 30 € is 0.90 €).

Using this strategy, lending for two years is done at a 6.09% rate instead of 6%, because you get interest on interest. This is known as compound interest: what you get paid is an exponential function of time. Lend for two years? (+3%)² = +6.09%. Lend for three years? (+3%)³ = +9.27%.

The mathematical justification is that, with a 3% interest, your total amount of money is multiplied by 1.03 every year:

1,000 + 30 = 1,000 + 3% of 1,000 = 1,000 + 0.03 × 1,000 = 1.03 × 1,000

So, after two years, the amount is multiplied by 1.03 two times, and so on.

1,060.90 = 1.03 × 1,030 = 1.03 × 1.03 × 1,000

In short, percentages have a multiplicative effect.

And now, pop quiz : I’ve gained +5% weight over the winter holidays. What percentage of my weight do I have to lose to be back to normal ?

If you answered -5%, you missed the point. Multiplicative effect means the total change of weight would be +5% × -5% = 1.05 × 0.95 = 0.9975 = -0.25%. I would be losing too much weight !

The correct answer was 1 ÷ 1.05 ≈ 0.9524, that is, -4.76%.

Similarly, if the number of graduates of a given school increases by +10% on year one and +25% on year two, the total increase is +37.5% and not +35%.

Duality

This is where mathematicians (and computer scientists) use an interesting little concept called duality. Percentages are numbers that are easy to understand, but hard to combine. We can transform them into something that is a little bit harder to understand, but easier to combine.

The traditional way to transform multiplication into addition is to take logarithms, that is, to work with exponents instead of values, due to an interesting property of the exponential function:

exp(a) × exp(b) = exp(a + b)

So, I wish to find a percentage operator (§) such that:

  • we conserve some values, 0§ = 0% and 100§ = 100%
  • applying A§, then B§, is equivalent to applying (A+B)§

Then this uniquely defines an operator which is called exponential percentage:

A§ = B%  ↔  A = 100 × log(1 + B ÷ 100) ÷ log(2)

Some common values:

0% = 0§          +100% = +100§     -100% = -∞§      200% = 158.4§
+1% = +1.4§      +99% = +99.2§     -1% = -1.4§      -99% = -664§
+10% = +13.7§    +90% = +92.6§     -10% = -15.2§    -90% = -332§
+25% = +32.2§    +75% = +80.7§     +50% = +58.4§    -50% = -100§


So, if I gained +5§ weight over the holidays, I can lose -5§ weight and be back to where I started, and if a number increases by 10§, then by 25§, it increases by 35§ overall.

And of course, a yearly interest rate of 4.2§ = 3% compounded over ten years is 42§ = 34%.
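For those who prefer to check the numbers, here is a minimal sketch of the conversion in both directions. The helper names toExp, fromExp and compound are made up for this illustration; only the base-2 logarithm formula comes from the definition of the § operator.

```javascript
// Hypothetical helpers: convert between ordinary percent and exponential percent (§).
function toExp(percent) {
  return 100 * Math.log(1 + percent / 100) / Math.log(2);
}
function fromExp(exp) {
  return 100 * (Math.pow(2, exp / 100) - 1);
}

// Ordinary percentages compound by multiplying growth factors.
function compound(percents) {
  var factor = 1;
  for (var i = 0; i < percents.length; i++) factor *= 1 + percents[i] / 100;
  return 100 * (factor - 1);
}

console.log(compound([10, 25]).toFixed(1));              // 37.5
console.log(fromExp(toExp(10) + toExp(25)).toFixed(1));  // 37.5, obtained by plain addition
console.log(fromExp(-toExp(5)).toFixed(2));              // -4.76, the loss that undoes a +5% gain
```

Adding in § and multiplying in % always agree, which is the whole point of the transformation.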

No Free Lunch

Normal percentage rules make compounding hard, but it’s reasonably easy to estimate a percentage based on a fraction. Exponential percentage rules make compounding easy, but evaluating a percentage based on real figures is harder.

In practice, compounding happens less often than evaluating, so humans use normal percentage rules. And computers are good at compounding through multiplication, so they don’t need exponentiation.

Duality does have some other uses, though. For instance, there’s the duality between two representations of complex numbers:

a + ib = r exp iθ

The cartesian (a,b) notation makes it easier to add numbers, but multiplication is harder:

(a + ib) + (c + id) = (a+c) + i(b+d)

The polar (r,θ) notation makes it easier to multiply numbers, but addition is harder:

r exp iθ × s exp iφ = (r × s) exp i(θ+φ)

For mathematically-oriented computer scientists, duality is a gold mine, because it lets one reduce a complex problem in one area to a simpler problem in another area (whether simpler means faster, as in the case of FFT, or easier to think about).

The Law of DSLs

There’s one common duality that is fundamental in the computer world: the correspondence between data and code. In a fit of narcissism, let me sit wisely atop a tall mountain to announce Nicollet’s Law of Domain Specific Languages:

Any sufficiently complex data processing algorithm acts as an interpreter for a small domain-specific language, and the data being processed is a program executed by the interpreter.

In some cases, this law only complicates things further. In many cases, however, the different angle it provides leads to many advantages, one of them being to transform a non-programming concept (such as an accounting file format) into a concept programmers are familiar with (a programming language).

A minimal grounding in language design is enough to grasp several interesting concepts about executing code, which can be quite handy when processing data:

1. Compile to Bytecode

Interpreters don’t execute a string of characters. They tokenize that string, turn the tokens into an abstract syntax tree representing operations, functions and variables, then turn that syntax tree into a sequence of small, executable operations. That sequence is then fed into a virtual machine (or further compiled to machine code) to perform the actual operations.

If the input data for your algorithm is very complex, you can begin on the other side: what will the algorithm do with the data? Will it be inserting the data into a database? Constructing a data object from bits and pieces? What you are looking for is a set of atomic operations you can apply to generate the result. Implement these operations, then start working on a translation algorithm to turn the input data into such operations.

There are several common and friendly representations for such atomic bytecode:

Instruction lists are executed in order. This is your classic assembler listing, without the jumps. A typical “parse file and insert into database” algorithm would generate such an instruction list, and every instruction would be an INSERT, DELETE or UPDATE. Works best when you can read the data and generate the instructions in the right order: if you cannot get the list in the right order from the start, consider another approach.

Dependency graphs work like makefiles: you have several instruction lists floating around with relationships between them, indicating that one list has to be executed before another. A topological sort of the graph results in a single classic instruction list you can execute. A multi-file import, where some files contain data needed in other files, can be the way to go.

Nested scopes are the typical extension to instruction lists: every item in a list can be either an instruction, or another list, possibly tagged with some data. This could be a conditional (if this condition is true, execute this list), a loop (though it is best to avoid these) or a context (a “polygon” scope contains “insert vertex” operations that apply to that polygon). You can even allow variables in a let-in fashion (of which the polygon example above is just a special case)! Note that nested scopes can be easily represented as XML.
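To make the dependency graph representation concrete, here is a small sketch of a multi-file import flattened into a single instruction list. The data layout (an after array and an ops array per list) and the function name flatten are assumptions made for this example, and the instructions are plain strings standing in for real operations.

```javascript
// Hypothetical input: each named instruction list declares which lists must run first.
var lists = {
  items:      { after: ['categories', 'users'], ops: ['INSERT item 1', 'INSERT item 2'] },
  categories: { after: [],                      ops: ['INSERT category A'] },
  users:      { after: ['categories'],          ops: ['INSERT user X'] }
};

// Depth-first topological sort: flatten the graph into one instruction list.
function flatten(lists) {
  var done = {}, visiting = {}, program = [];
  function visit(name) {
    if (done[name]) return;
    if (visiting[name]) throw new Error('Circular dependency on ' + name);
    if (!lists.hasOwnProperty(name)) throw new Error('Unknown list ' + name);
    visiting[name] = true;
    lists[name].after.forEach(function (dep) { visit(dep); });
    visiting[name] = false;
    done[name] = true;
    program.push.apply(program, lists[name].ops);  // dependencies are already emitted
  }
  Object.keys(lists).forEach(function (name) { visit(name); });
  return program;
}

console.log(flatten(lists));
// categories first, then users, then items
```

A circular dependency is reported while sorting, before any instruction is executed, which fits the static analysis idea of the next section.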

2. Static Analysis

A side-effect of compiling to bytecode is that you get to process the entire file before you actually perform the intended operations. This makes a rollback easier if you notice that there’s an error on the last line of the file: if you make sure that no atomic operation in your target language can fail due to bad input (such as incorrect data values), then you can check your input data for correctness without doing anything to your program state.
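A sketch of that two-phase approach, under invented assumptions: the input is a list of "name,quantity" lines, the program state is a plain object, and the function names are hypothetical. Compilation collects every error; execution only happens if there are none.

```javascript
// Phase 1: turn raw lines into instructions that cannot fail on bad input.
function compile(lines) {
  var program = [], errors = [];
  lines.forEach(function (line, index) {
    var match = /^([^,]+),(\d+)$/.exec(line);
    if (match) {
      program.push({ op: 'INSERT', name: match[1], quantity: Number(match[2]) });
    } else {
      errors.push('line ' + (index + 1) + ': cannot parse "' + line + '"');
    }
  });
  return { program: program, errors: errors };
}

// Phase 2: apply the instructions to the program state.
function execute(program, store) {
  program.forEach(function (instruction) {
    store[instruction.name] = instruction.quantity;
  });
}

// The state is only touched when the whole input compiled cleanly.
function importLines(lines, store) {
  var compiled = compile(lines);
  if (compiled.errors.length === 0) execute(compiled.program, store);
  return compiled.errors;
}

var store = {};
console.log(importLines(['apple,3', 'pear,oops'], store), store); // one error, store still empty
console.log(importLines(['apple,3', 'pear,4'], store), store);    // no error, both rows inserted
```

No rollback code is needed here, because a bad last line is found before the first instruction runs.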

Even better, if your compilation process is cheap (linearly traverse a file for parsing) and you have heuristics for predicting how much time and resources your individual instructions require, then you can try to accurately predict the needs of the entire process.

Static analysis also means you can optimize. If, for instance, you’re inserting data into a database and need to resolve names or keys frequently (such as “add this item to list #732”), you can easily construct a table of needed keys (that you can get in one query when the processing starts) using the dependency graph approach. You can also optimize resource allocation by using common register allocation techniques: sort your dependency graph to keep as few resources in memory as possible at any given time.

3. Caching

Try to perform most of the processing offline.

For instance, if you frequently “apply” one file to another, such as a nearly-constant “list of categories” file used to resolve the “category” key in a daily object import, you can benefit from compiling the nearly-constant file to an easily loaded, easily applied format.

You see a cached dictionary that maps keys to categories? I see a DSL that allows dictionary literals as part of the language, and a source file that contains a literal mapping keys to categories, with an interpreter that can apply constant propagation to dictionaries.

Another benefit is when applying changes to mission-critical software. Inserting lots of data into a web database can create a heavy load on the server and make the site unavailable to visitors. It might therefore be preferable to pre-compile the imported data into requests through a process that keeps a light load on the server, then run the requests.

Besides, with proper nested scoping, you can slice an import into several transactions. This keeps the lock count low, allows spreading the transactions over time to reduce the load, and lets you resume the import process if, for some reason, it gets interrupted.

December 2009 PDF Vulnerability

All file formats follow the same evolution.

  1. They start by grouping together some static content, with some nifty features for presenting and editing that data. Think text files, bitmaps, RTF documents… The file format is reasonably easy to understand, and the reader/writer is so simple that it would take a bad programmer to create vulnerabilities.
  2. Then, they start including plug-ins that let them handle more and more types of content. This lets you include an image inside an HTML page or an Excel spreadsheet in a Word document. This relies on many plug-ins getting things right. It sometimes happens that a given plug-in contains a security flaw that can then be exploited: for instance, Internet Explorer had an issue with images in PNG format. The user would visit a page, that page would display an image, and the computer would be contaminated.
  3. Finally, they need to become interactive, so they include a scripting language of some sort. Excel has a macro system that uses Visual Basic, HTML includes Javascript…

The PDF format followed the same process to end up where it is now. In addition to any static document data (text, vector and raster images) and extended content (Flash animations, videos, reader extension signatures), a PDF also contains short JavaScript programs that let authors create interactive documents. This means a PDF document on your desktop can:

  • Accept user input (such as checkboxes or text fields). The input can be saved to the disk if the reader supports it and allows it (Acrobat Reader, used by the vast majority of computer users, only allows saving a file if its author purchased a reader extensions license and signed the PDF file with it).
  • Change its layout at will, for instance displaying a “spouse” page only if the “spouse” checkbox was ticked.
  • Be cryptographically signed, and display information about who signed it. This kind of signature is actually accepted as valid legal proof in many countries.
  • Compute a scannable bar code from user input, so that it can be printed, then scanned on the other side with reduced error rates.
  • Send data over the internet. It can even send itself as an attachment to an email.

Needless to say, with all these features, there are inevitably going to be some exploitable security issues in the mix. Being a popular program, like Acrobat Reader, only increases the number of black hat hackers looking for vulnerabilities. One of these is the recent CVE-2009-4324 from December 2009. There are many types of vulnerabilities, their common feature being that they end up executing arbitrary operations on the computer (as opposed to the safe operations Acrobat Reader normally allows). These operations are usually to download or install trojans, so that the attacker can gain complete control over the computer.

CVE-2009-4324 is of the use-after-free kind. In short:

  • it creates a resource (which uses some memory),
  • it frees (destroys) the resource to recycle its memory,
  • it writes something to that memory,
  • it attempts to use the resource

Normally, the program should stop at step four and say “you can’t use the resource, it’s been destroyed”. A bug can cause it to believe that the resource is still there. The programmer probably assumed that the memory still contained a valid resource and did not defend against the memory containing something else… and accessing that as if it were a valid resource executes some code that the attacker wanted to execute. Bingo.

In the case of CVE-2009-4324, this happens as part of the Doc.media.newPlayer method which, for performance reasons, was not completely implemented in Javascript. A bug in some Javascript code can cause the document to misbehave, but it cannot do anything that the Javascript couldn’t do on its own. The parts that were written in a lower-level language, with access to the computer, contained the exploited bug.

The bug causes the processor to start executing code at a different memory location. In an ideal hacker world, that location would be precisely where some nasty code is present. Buffer overflows, when used to rewrite pieces of the stack, do allow such deterministic jumps. However, CVE-2009-4324 only allows a jump to an undetermined location.

The hacker solution is to use heap spray. The basic idea is that you have a short piece of code you want to execute (the payload). You create a block from that payload by adding no-ops (machine instructions that say “skip me”) before the payload. Then, you create lots of these blocks in memory, and trigger the exploit.

The exploit causes the computer to jump to an undetermined memory location. If it falls within the no-op section of any of the blocks you’ve created, you win: the computer skips over the no-ops, reaches the payload and executes it. If not, the program will crash. Too bad…

Quick Test

Here’s a very simple question:

How many times can you subtract 5 from 73, and what is left ?

What your answer means:

  • An imperative programmer answers, “you can subtract it 14 times and the remainder will be 3.”
  • A functional programmer answers, “you can subtract it as many times as you wish, and you always get 68.”
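Both readings can be spelled out in a few lines. This is only an illustration of the two mindsets, with variable names of my own choosing.

```javascript
// Imperative reading: "subtract" mutates the number in place.
var n = 73, times = 0;
while (n >= 5) { n -= 5; times++; }
console.log(times, n);  // 14 3

// Functional reading: subtraction returns a new value, and 73 never changes.
var original = 73;
var results = [original - 5, original - 5, original - 5];
console.log(results, original);  // 68 every time, and original is still 73
```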

Javascript Tips, Part 1

It is advised, when deploying a live website, to group all your Javascript files into a single one, and reduce its size by removal of all unnecessary clutter. You obviously get a few nasty surprises the first time you do this, such as dead code (or debug-only code) being included or two incompatible files being included together. The usual answer to this, like with any integration-related issue, is to perform continuous integration, and generally make sure during development that the final product works as it should.

Tip #1 — Keep it all together

I generally keep all of my javascript together in one single file, which is accessed through http://domain/all.js or something like that. Of course, that file is in fact generated from a well-tended garden of neatly arranged source files, but the intent is that every single visit on the development website will bring along all the code that could possibly conflict with it, to identify issues as soon as possible.

This means I have access to a Javascript preprocessor, because generating a single file already involves reading all the files.

Tip #2 — Conditional inclusion

Difficult bugs appear on platforms with no debuggers, which means you need to be able to track the progress of your application with logging. You don’t want to have those log lines appearing on the live website, though. In the same way, runtime assertions are useful while developing, so you get a clean “Expected string, got array” error message right after you call a function with incorrect parameters, instead of an “object has no member charAt” ten stack layers deeper. Yet, assertions use up room, and tend to die in a flashy and unprofessional way on a live website.

An overly intrusive preprocessor can prevent your code from running if not preprocessed. It also means whatever tool you use to edit Javascript source cannot recognize the syntax anymore, which is bad. The solution (which I learned from the PHP preprocessing techniques of the folks at IntellAgence) is to embed preprocessing information in comments.

Need some logging?

/*<*/ log('Hello'); /*>*/

Need to check something at runtime?

/*[*/ Assert.areEqual(x,y); /*]*/

A simple regular expression can remove these from the final code, but you can also selectively disable them by removing the second slash of the first comment :

/*[*  Assert.areEqual(x,y); /*]*/
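The removal step could look like the sketch below. The function name stripDebug is hypothetical, and the two regular expressions are one possible way to match the marker comments, not necessarily the ones used in a real build script.

```javascript
// Hypothetical build step: drop everything between the marker comments.
function stripDebug(source) {
  return source
    .replace(/\/\*<\*\/[\s\S]*?\/\*>\*\//g, '')     // /*<*/ logging /*>*/
    .replace(/\/\*\[\*\/[\s\S]*?\/\*\]\*\//g, '');  // /*[*/ assertions /*]*/
}

var input = "a(); /*<*/ log('Hello'); /*>*/ b(); /*[*/ Assert.areEqual(x,y); /*]*/ c();";
console.log(stripDebug(input));  // a();  b();  c();
```

A block disabled by hand, such as /*[*  Assert.areEqual(x,y); /*]*/, no longer matches the opening marker. It is left alone as an ordinary comment, which any minifier will discard.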

Tip #3 — Named anonymous functions

Quick reminder: Javascript allows you to define anonymous functions using a lambda-like construct:

function(x) { return x + 1 }

This acts like a function literal (pretty much like a string literal works for strings, or an array literal works for arrays). In fact, if you’re doing any kind of serious Javascript, you probably have anonymous functions all over the place, such as callbacks to asynchronous operations:

$('button').click(function(){ alert('Hello') });

Or you might be defining member functions for your classes:

className.prototype.setFoo = function(x){
  this._foo = x;
};

When you get a runtime error, the debugger displays a stack trace containing the names of the functions and the places where they were called. So, if you were using many anonymous functions, you get neck-deep into layers of “anonymous” on your stack trace. Sure, you can click on every single one of them to understand what is happening, but it is way faster to get that knowledge from looking at meaningful names.

Javascript allows you to give names to anonymous functions. In fact, this is how recursive functions can be defined as lambdas (a feature that OCaml lacks, for instance). So you can write the following and see it appear in a stack trace:

$('button').click(function _onButtonClick(){ alert('Hello') });

className.prototype.setFoo = function _className_setFoo(x){
  this._foo = x;
};

I start all names with underscores, because a simple /function _[A-Za-z0-9_]*/ in my preprocessor can identify and remove all those names when I don’t need them.
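As a sketch, that preprocessor step amounts to a single replace. The function name stripFunctionNames is invented here, and the approach assumes the underscore-prefixed names are never referenced from inside the function body (a recursive function calling itself by that name would break) and that the prefix is reserved for function expressions.

```javascript
// Replace "function _someName" with a bare "function" keyword.
function stripFunctionNames(source) {
  return source.replace(/function _[A-Za-z0-9_]*/g, 'function');
}

var named = "$('button').click(function _onButtonClick(){ alert('Hello') });";
console.log(stripFunctionNames(named));
// $('button').click(function(){ alert('Hello') });
```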

Tip #4 — Are your callbacks called?

The Javascript function model lets you write a function that forwards whatever it was doing to another function. You can even log some information along the way…

function trace(f) {
  return function () {
    log('%o.%s(%o)',this,name(f),arguments);
    return f.apply(this,arguments);
  }
}

The preprocessor instructions for conditional removal let you do this only in specific circumstances:

className.prototype.setFoo = /*[*/ trace /*]*/
(function _className_setFoo(x){
  this._foo = x;
});

It can also be quite practical to use a variant of the trace function that stores all functions it traces in an array, and removes them from the array the first time they are called. This can be useful when debugging calls to asynchronous functions that you write, because you might forget to call the callback function when you are done, and these bugs are otherwise quite hard to identify.
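Such a variant might look like the following sketch. The names expectCall, pending and uncalled are made up, and a label is passed explicitly instead of being derived from the function.

```javascript
// Hypothetical variant of trace(): remembers every wrapped function until
// it is called for the first time.
var pending = [];

function expectCall(label, f) {
  var entry = { label: label };
  pending.push(entry);
  return function () {
    var index = pending.indexOf(entry);
    if (index !== -1) pending.splice(index, 1);  // first call: no longer pending
    return f.apply(this, arguments);
  };
}

// Labels of the wrapped functions that were never called.
function uncalled() {
  return pending.map(function (entry) { return entry.label; });
}

var onSuccess = expectCall('onSuccess', function (x) { return x * 2; });
var onError = expectCall('onError', function () {});
onSuccess(21);
console.log(uncalled());  // only 'onError' is left
```

Printing uncalled() once the asynchronous operation should have finished lists the callbacks that were forgotten.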

Tip #5 — Do not rely on closures too much

Closures in Javascript mean you can write this, and have the inner function use the variable from the outer function:

function outer(a) {
  var b = a + 1;
  return function inner(c) {
    return b * c;
  }
}

Closures are extremely useful. However, Javascript has a nasty habit of creating variables at global scope whenever you try to write to a variable that doesn’t exist. For instance, the following function is not re-entrant because an accidental typo makes it use a global variable :

function person() {
  var aeg = 18;
  return {
    setAge : function(a) { age = a },
    canDrink : function() { return age > 20 }
  }
}

A good solution, I believe, is to rely on object members instead of variables. Inappropriately using an object member is harder than inappropriately using a local variable, because the former does not default to “look in the global scope”. Besides, every function beyond a certain level of inner state complexity deserves to be adapted to a more structured object style with its data as private members (if only because this is way easier to explore with a debugger).
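Here is the same example rewritten with object members, as a minimal sketch. The constructor name Person and the _age member are my own choices for the illustration.

```javascript
// State lives on the object, so every instance has its own age.
function Person() {
  this._age = 18;
}
Person.prototype.setAge = function (a) { this._age = a; };
Person.prototype.canDrink = function () { return this._age > 20; };

var alice = new Person(), bob = new Person();
alice.setAge(30);
console.log(alice.canDrink(), bob.canDrink());  // true false
```

A typo such as this._aeg = 18 would still be a bug, but it would stay confined to that one object instead of silently sharing a global between all persons.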

Gremlin : jQuery Growl

I have uploaded Gremlin, a simple jQuery-based Growl system, for elegant page-wide notification needs. Check it out, it’s free and built to be simple.

Javascript signals

Signals operate as a simple way of decoupling dependencies within a project, by allowing caller-callee relationships through an interface that makes both parties anonymous. Assuming a shared signals object is provided, the receiver registers itself on that object:

signals.output = function(text){ alert(text) };

And the sender uses the registered channel to remotely execute that function:

signals.output('Hello');

Signals are the functional equivalent of object-oriented inversion of control, a technique that allows users to configure the behavior of third party code without having to modify it. This is done by removing any explicit dependencies of the third party code on specific behavior units, such as “output a piece of text”, and injecting those dependencies back from the outside as an object or set of objects which hide the actual implementation of those behavior units. Basically, we’re replacing:

function frobnicate(a,b) {
  foo(a);
  bar(b);
  alert('Success');
}

frobnicate(1,2); // Can't prevent the alert box from appearing!

With the slightly longer but easily configured:

function frobnicate(a,b,output) {
  foo(a);
  bar(b);
  output('Success');
}

frobnicate(1,2,function(t){alert(t)}); // Original behavior
frobnicate(1,2,function(){}); // Muted function
frobnicate(1,2,function(t){console.debug(t)}); // To firebug console

Since a given piece of code might depend on several distinct behavior units, I use a record to transmit all that behavior as a single argument. This results in the classic “configure my library with your options object” that can be found, among other places, in jQuery.
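With a record, the example might become the sketch below. The names options, output and log are illustrative only, and foo and bar are stand-ins for whatever work the function really does.

```javascript
// Hypothetical stand-ins for the real work done by the third party code.
var calls = [];
function foo(a) { calls.push('foo ' + a); }
function bar(b) { calls.push('bar ' + b); }

// All behavior units travel together in one record (here called options).
function frobnicate(a, b, options) {
  foo(a);
  bar(b);
  options.log('frobnicated ' + a + ' and ' + b);
  options.output('Success');
}

var shown = [];
frobnicate(1, 2, {
  output: function (t) { shown.push(t); },  // collect instead of alert()
  log: function () {}                       // muted
});
console.log(shown);  // the single 'Success' message
```

Adding a third behavior unit later only means adding a member to the record, without changing the signature of frobnicate.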

This simple approach causes a small number of difficulties:

  • If I want to use a slightly different version of a signals object for another part of the program, I have to manually create a copy of the object and change the copy (basically the equivalent of a pure functional object mutation).
  • In some situations, I might want to handle several callbacks for a single signal. The current approach only lets me define a single function for a given signal.
  • Some functions of the signal set (such as sending a form through AJAX) might rely on other functions of the signal set (display an error message) to handle their own behavior unit dependencies, and I would like those functions to automatically have access to the signal set they belong to, dynamically.

This leads me to a subtly different implementation of signals:

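var signals = (function(){
  // Root constructor; _c remembers which constructor built the object.
  var s = function() { this._c = s; };
  // Adds a channel named c: a function that forwards its arguments
  // to every handler bound to it.
  s.prototype.channel = function(c) {
    var h = [],
        s = function() { for (var k in h) if (h[k]) h[k].apply(this,arguments); };
    s.bind = function(f) { h.push(f); return h.length-1; };
    // k is the index returned by bind
    s.unbind = function(k) { h[k] = null; };
    return this.set(c,s);
  };
  // Returns a new object that inherits every member set so far,
  // with member n set to v. The current object is left unchanged.
  s.prototype.set = function(n,v){
    var i = function(){ this._c = i; };
    i.prototype = new this._c();
    i.prototype[n] = v;
    return new i();
  };
  return s;
})();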

This small class encapsulates pure functional mutation semantics by means of its set function:

var signals = new signals();
var initial = signals.set('xxx',100);
var final = initial.set('xxx',200);
console.log(initial.xxx + ' ' + final.xxx); // Outputs '100 200'

This small piece of behavior is in itself quite helpful, but it gets better: if a function is added to the object, it remains there but is always executed within the context of the current object and therefore has access to its actual values.

var signals = (new signals()).set('show',function(){console.log(this.xxx)});
var initial = signals.set('xxx',100);
var final = initial.set('xxx',200);
initial.show(); // Displays 100
final.show(); // Displays 200

Last but not least, it’s possible to create a full communication channel that can be connected to several receivers and forwards its arguments to all receivers. All receivers are called with the signals object as their context, which lets them access it and behave accordingly.

var unreadMessages = 0;
var signals = (new signals()).channel('setUnread');

// Update the number of unread messages, notify user if they have
// new messages.
signals.setUnread.bind(function(unread){
  if(unreadMessages < unread) this.notice('You have new messages!');
  unreadMessages = unread;
});

// Update all places that display the number of unread messages
signals.setUnread.bind(function(unread){
  $('.unread').html('Messages'+(unread > 0 ? ' ('+unread+')' : ''));
});

// When at page scope, notices are printed by growling
var global = signals.set('notice',growl);
global.setUnread(10);

// When inside a smaller scope, such as a component, display notices in
// a dedicated location
var local = signals.set('notice',function(arg){$display.html(arg)});
local.setUnread(15);

Last Minute Skin

Right now, we render our page layout on the server, thus wasting precious bandwidth sending the same header, footer and menus all over again every single time. AJAX techniques have evolved to reload only the inner part of every page, but they require clever URL manipulation or ‘back’, ‘refresh’ and bookmarks won’t work, and they impose strong constraints on page layout and on the way the server responds to requests.

Why not do it the other way around? Have every page include the same layout-generating JavaScript file (kept in the browser cache for optimum performance)! This is the idea behind the last-minute-skin pattern.