Monthly Archive for December, 2009

Javascript Tips, Part 1

It is advised, when deploying a live website, to group all your Javascript files into a single one, and reduce its size by removal of all unnecessary clutter. You obviously get a few nasty surprises the first time you do this, such as dead code (or debug-only code) being included or two incompatible files being included together. The usual answer to this, like with any integration-related issue, is to perform continuous integration, and generally make sure during development that the final product works as it should.

Tip #1 — Keep it all together

I generally keep all of my Javascript together in one single file, which is accessed through http://domain/all.js or something like that. Of course, that file is in fact generated from a well-tended garden of neatly arranged source files, but the intent is that every single visit to the development website will bring along all the code that could possibly conflict, to identify issues as soon as possible.

This means I have access to a Javascript preprocessor, because generating a single file already involves reading all the files.
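
As a rough illustration of that generation step, here is a minimal sketch. The `bundle` function, its `{ name, text }` input format and the optional `preprocess` hook are all hypothetical; reading the source files from disk is left to whatever tooling you already use.

```javascript
// Hypothetical build step: `sources` is an array of { name, text } objects,
// already read from disk by some other tool.
function bundle(sources, preprocess) {
  return sources.map(function (file) {
    // Optionally run each file through a preprocessor before concatenating.
    var text = preprocess ? preprocess(file.text) : file.text;
    return '// ---- ' + file.name + ' ----\n' + text;
  }).join('\n;\n'); // the lone ";" guards against files missing a final semicolon
}

var out = bundle([
  { name: 'a.js', text: 'var a = 1' },
  { name: 'b.js', text: 'var b = a + 1;' }
]);
```

Serving the result of `bundle` as all.js during development means the preprocessing hook runs on every page load, not just on release day.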

Tip #2 — Conditional inclusion

Difficult bugs appear on platforms with no debuggers, which means you need to be able to track the progress of your application with logging. You don’t want to have those log lines appearing on the live website, though. In the same way, runtime assertions are useful while developing, so you get a clean “Expected string, got array” error message right after you call a function with incorrect parameters, instead of an “object has no member charAt” ten stack layers deeper. Yet, assertions use up room, and tend to die in a flashy and unprofessional way on a live website.

An overly intrusive preprocessor can prevent your code from running if not preprocessed. It also means whatever tool you use to edit Javascript source cannot recognize the syntax anymore, which is bad. The solution (which I learned from the PHP preprocessing techniques of the folks at IntellAgence) is to embed preprocessing information in comments.

Need some logging?

/*<*/ log('Hello'); /*>*/

Need to check something at runtime?

/*[*/ Assert.areEqual(x,y); /*]*/

A simple regular expression can remove these from the final code, but you can also selectively disable them by removing the second slash of the first comment :

/*[*  Assert.areEqual(x,y); /*]*/
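
The "simple regular expression" could look something like the sketch below. The function name `stripDebugBlocks` is hypothetical, and the patterns assume that marked blocks are never nested and that the markers never appear inside string literals.

```javascript
// Hypothetical helper: removes everything between /*<*/ ... /*>*/ (logging)
// and between /*[*/ ... /*]*/ (assertions), markers included.
function stripDebugBlocks(source) {
  return source
    .replace(/\/\*<\*\/[\s\S]*?\/\*>\*\//g, '')    // logging blocks
    .replace(/\/\*\[\*\/[\s\S]*?\/\*\]\*\//g, ''); // assertion blocks
}

var input = "var x = 1; /*<*/ log('Hello'); /*>*/ /*[*/ Assert.areEqual(x, 1); /*]*/ run(x);";
var output = stripDebugBlocks(input);
// output keeps "var x = 1;" and "run(x);" but no log or Assert call
```

A block disabled by hand (the `/*[*  ... /*]*/` form) does not match either pattern, which is fine: it is already a plain comment, so any comment-stripping step will drop it.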

Tip #3 — Named anonymous functions

Quick reminder: Javascript allows you to define anonymous functions using a lambda-like construct:

function(x) { return x + 1 }

This acts like a function literal (pretty much like a string literal works for strings, or an array literal works for arrays). In fact, if you’re doing any kind of serious Javascript, you probably have anonymous functions all over the place, such as callbacks to asynchronous operations:

$('button').click(function(){ alert('Hello') });

Or you might be defining member functions for your classes:

className.prototype.setFoo = function(x){
  this._foo = x;
};

When you get a runtime error, the debugger displays a stack trace containing the names of the functions and the places where they were called. So, if you were using many anonymous functions, you get neck-deep into layers of “anonymous” on your stack trace. Sure, you can click on every single one of them to understand what is happening, but it is way faster to get that knowledge from looking at meaningful names.

Javascript allows you to give names to anonymous functions. In fact, this is how recursive functions can be defined as lambdas (a feature that OCaml lacks, for instance). So you can write the following and see it appear in a stack trace:

$('button').click(function _onButtonClick(){ alert('Hello') });

className.prototype.setFoo = function _className_setFoo(x){
  this._foo = x;
};

I start all names with underscores, because a simple /function _[A-Za-z0-9_]*/ in my preprocessor can identify and remove all those names when I don’t need them.
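
A minimal sketch of that removal step, using the regular expression quoted above; the wrapper function `stripDebugNames` is a hypothetical name.

```javascript
// Hypothetical helper: turns "function _someName" back into "function".
function stripDebugNames(source) {
  return source.replace(/function _[A-Za-z0-9_]*/g, 'function');
}

var named = "$('button').click(function _onButtonClick(){ alert('Hello') });";
var stripped = stripDebugNames(named);
// stripped === "$('button').click(function(){ alert('Hello') });"
```

Two caveats with this approach: a function that calls itself through its underscore name would break once the name is gone, and a function declaration (as opposed to an expression) named with a leading underscore would become a syntax error. The convention only works if underscore-prefixed names are reserved for debugging labels.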

Tip #4 — Are your callbacks called?

The Javascript function model lets you write a function that forwards whatever it was doing to another function. You can even log some information along the way…

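// log() and name() are helpers assumed to be defined elsewhere:
// a printf-style logger and a function-name lookup.
function trace(f) {
  return function () {
    log('%o.%s(%o)',this,name(f),arguments);
    return f.apply(this,arguments);
  };
}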

The preprocessor instructions for conditional removal let you do this only in specific circumstances:

className.prototype.setFoo = /*[*/ trace /*]*/
(function _className_setFoo(x){
  this._foo = x;
});

It can also be quite practical to use a variant of the trace function that stores all functions it traces in an array, and removes them from the array the first time they are called. This can be useful when debugging calls to asynchronous functions that you write, because you might forget to call the callback function when you are done, and these bugs are otherwise quite hard to identify.
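
Here is one way such a variant might look. This is a sketch under assumptions: `traceOnce`, `pending` and `pendingLabels` are hypothetical names, and the array stores a labelled entry per callback instead of the function itself, so that the leftovers are easy to read.

```javascript
// Every wrapped callback is registered here until its first call.
var pending = [];

// Wraps f so that it removes itself from `pending` the first time it runs.
function traceOnce(label, f) {
  var entry = { label: label };
  pending.push(entry);
  return function () {
    var i = pending.indexOf(entry);
    if (i !== -1) pending.splice(i, 1); // first call: no longer pending
    return f.apply(this, arguments);
  };
}

// Labels of the callbacks that have never been called so far.
function pendingLabels() {
  return pending.map(function (e) { return e.label; });
}

var onLoaded = traceOnce('onLoaded', function (data) { return data.length; });
var onError = traceOnce('onError', function () {});
var result = onLoaded('abc');
// pendingLabels() now lists only 'onError'
```

Dumping `pendingLabels()` when a page seems stuck shows which callbacks the asynchronous code forgot to invoke.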

Tip #5 — Do not rely on closures too much

Closures in Javascript mean you can write this, and have the inner function use the variable from the outer function :

function outer(a) {
  var b = a + 1;
  return function inner(c) {
    return b * c;
  }
}

Closures are extremely useful. However, Javascript has a nasty habit of creating variables at global scope whenever you try to write to a variable that doesn’t exist. For instance, the following function is not re-entrant because an accidental typo makes it use a global variable :

function person() {
  var aeg = 18;
  return {
    setAge : function(a) { age = a },
    canDrink : function() { return age > 20 }
  }
}

A good solution, I believe, is to rely on object members instead of variables. Inappropriately using an object member is harder than inappropriately using a local variable, because the former does not default to “look in the global scope”. Besides, every function beyond a certain level of inner state complexity deserves to be adapted to a more structured object style with its data as private members (if only because this is way easier to explore with a debugger).
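
For comparison, a sketch of the same example written with object members. The `Person` constructor and the underscore-prefixed `_age` member are illustrative choices, not a prescribed style.

```javascript
// State lives on the object, not in closure variables.
function Person() {
  this._age = 18;
}

Person.prototype.setAge = function _Person_setAge(a) {
  this._age = a;
};

Person.prototype.canDrink = function _Person_canDrink() {
  return this._age > 20;
};

var alice = new Person();
var bob = new Person();
alice.setAge(30);
// alice.canDrink() is true, bob.canDrink() is false
```

A typo such as `this._aeg = 18` still produces a bug, but the stray value stays on that one instance where a debugger can show it, and instances no longer share state through an accidental global.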

Google Me!

What does a google search for my name tell people about me?

  • Obviously, I own this website.
  • I have LinkedIn and Facebook profiles.
  • I don’t really participate on Stack Overflow or Wikipedia a lot.
  • I wrote some articles on GameDev.net, which were repeated around on the web, translated to other languages, helped people, and were published in books.
  • Also, I have been participating on the GameDev.net forums for quite a while.
  • I wrote a Master’s Thesis [pdf] in finance in 2007 (namely, market microstructure and game theory).
  • I wrote another Master’s Thesis [pdf, french] in 2007, this time in computer science (GPGPU using CUDA, to be precise). I ended up presenting that work at Calyon (an investment bank) in early 2008.
  • Back in 2006, I wrote a course evaluation system for one of my schools (the Paris School of Economics). It worked, but the code is certainly not something I’m proud of.
  • Back in 2005, I worked on a grid computing framework in C#, NGrid.
  • Also in 2005, I was an intern [pdf, french] at Exalead on bayesian classification of web sites.
  • I play role-playing games and was the main contact for the RPG society at the Ecole Normale Supérieure. I also ran for treasurer [pdf, french] of the entire student association in 2006 (I was not elected).
  • I dabble in Objective Caml randomness and have some thoughts about programming language design.
  • I use Magento.
  • I work for Tangane (and wrote some stuff on their blog).
  • People find it funny when I mention condoms.
  • I used to be a TA in computer science, some of my old papers are still floating around and being linked to.
  • I once played Diplomacy. I’m not a very good player. :(
  • I regularly play Magic : the Gathering. I’m not a very good player either. :(
  • I was present at the Mobility Party 2004, back when I was a developer with int13 (working on Darklaga). I remember playing Half Life 2 for the first time at that event.
  • I signed a petition against Verisign. You sign one petition, and it remains online forever—I have been much more careful about what I do online since then.
  • I wrote small novellas in an even smaller literary society. Actually, I had started writing in college in a (parody) student newspaper [pdf, french].
  • When in school, I worked at the student help desk for Windows users in my spare time.

All these are actual links you can find in a google search for “Victor Nicollet”.

If you look deeper (that is, using your brain instead of google), you could also find:

  • Extatica (page is down, though it’s still present on this list). A C++ 3D game engine I wrote back in 2003. Took the website down after I received a Cease & Desist from Extatica (my google ranking exceeded theirs). Ironically, on the latter page, there’s a link to a website where the DLL for my old engine can be downloaded.
  • N’improtequoi, an improvisational theater group. I am their webmaster, as well as a team member.

I don’t really know which is scarier—that so much information about me is available online, or that nobody seems to care about it? :)

Do It Yourself

Unless you’re working in an esoteric field on the bleeding edge of technology, the vast majority of programming problems you face have already been solved many times by many other people, and several of these solutions are readily available on the web or in legacy code libraries you might have access to.

To solve a problem, you can

  • reinvent a particular wheel : the non-factored approach, since you create your own instance of that wheel, or
  • reuse one of its existing implementations : the factored approach, where several projects benefit from the same piece, including your own.

Both alternatives have costs and benefits that the experienced software engineer is aware of, and these will depend on your exact problem somewhere along the lines of :

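[Figure: time spent versus problem size for the two approaches, drawn as a red curve and a blue curve that intersect]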

The time spent solving a problem steadily increases with the size or difficulty of that problem, and is further subject to two important rules.

Non-factored is cheaper for small problems

A factored solution carries some overhead because it is used by several projects with different scopes. The “one click, 200 words” bias happens when non-technical managers hear “leverage an existing solution”, and see a picture of a one-click installer and a 200-word tutorial telling them their particular problem can be solved with two lines of C# code.

HolyGrail grail = new HolyGrail();
grail.doWhatIMean(/* No options here! ^_^ */);

Yeah. Riiiight.

Every one of us has spent days reading up on third party libraries just to decide if they are worth the effort, slaying compatibility dragons to make them talk with the rest of the project, filling hundreds of configuration options that have no relevance whatsoever to the tiny problem at hand, teaching co-workers about the nooks and crannies of that code, and painstakingly wading through less-than-civilized error reporting to solve the obtuse problems that come up on the day before you release.

Even writing your own reusable code is orders of magnitude harder than just jotting down a quick one-shot solution to whatever problem you have. An excessive tendency to build generic code from the very beginning makes your development process look like Dragon Ball Z : you have to power up for fifteen episodes before you can show a splash screen.

This rule is the reason why the red curve stays above the blue curve for small problems.

Factored scales better for large problems

Solving a larger problem involves a larger solution. In a do-it-yourself situation, you have to make the solution larger yourself. When using a factored approach, you already injected an existing large solution into your project, and it only feels small because you’re using a small part of it. With the programming equivalent of flipping a few switches, you get to use a larger part.

The solution that involves the most code (the non-factored one, in case you wondered) also involves the most maintenance, documentation and development work. Whether this comes from a thousand-line reinvented wheel or obscene copy-pasting, having a large code base is something you will have to pay for in the long run. You don’t buy code, you rent it.

This rule is the reason why the red curve ends up below the blue curve for sufficiently large problems.

Keeping these two rules in mind, the key to making the right decision is determining where the red and blue curves intersect, and where your project stands. Easier said than done. For instance, what does “problem size” mean, precisely?

Problem size can be, literally, the size of the problem for an obvious metric. A content distribution network like Amazon S3 is a bad choice for 1000 downloads per week, but an obvious solution for 1000 downloads per second.

It could be the number of things in the application that are similar to the one you’re implementing. Sending usage statistics back to your server is a small problem solved with a vanilla HTTP request. If you communicate with the server a lot, you might want to keep the URL and error handling logic together in one place.

Or it could be the number of features. Displaying data in table format takes two nested loops and some HTML. Sorting, filtering, asynchronous sending or editing involve some rather smart Javascript development, or integrating a tool like jqGrid or ExtJS.

Once, Twice, Refactor

The special case of writing your own reusable code has been “solved” by Agile folks who suggest writing a non-reusable version of the code on the first try, and refactoring it to a reusable version the second time it’s needed. This is your third choice : go with the non-factored solution if you are unsure whether the problem is large enough to warrant the factored solution, and change your mind as soon as you gather enough data.

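[Figure: cost of the refactor-on-second-use approach versus problem size, compared with the factored and non-factored curves]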

This is a solution that costs less than the factored approach if the problem is small, and costs less than the non-factored solution if the problem is large, while keeping an acceptable overhead when the problem is somewhere in-between.

Of course, writing your own reusable code means that the cost of switching from the non-factored to the factored version is significantly lower than writing the factored version from scratch, because you refactor the original solution into a reusable one.

The advantages are not so obvious when moving from one approach to the other involves throwing away all code and installing a third party application. You do get some benefits (at the very least, you know more about the problem than you did at first, and perhaps your first approach served as a useful prototype to further refine your needs), but doing this can hurt a lot.

So, you end up getting hurt if you don’t know what you’re doing. What a surprise.