Empty Lists

We have all written this code before:

<ul>
  <?php foreach ($list as $element):?>
    <li><?=htmlspecialchars($element)?></li>
  <?php endforeach; ?>
</ul>

What happens when the list is empty? The output is an empty UL element:

<ul></ul>

This would be perfectly fine, if it weren’t completely wrong. Quoth the XHTML DTDs (any of them):

<!ELEMENT ul (li)+>

There must always be at least one list item in a list. (What kind of insanity led to forbidding empty lists is beyond me, although I’m certain there must have been a good reason.) A document will therefore not validate if it contains the aforementioned empty UL element. This is also the case for HTML 4, though HTML 5 does currently allow empty lists.

So, to circumvent the empty list case, the code becomes:

<?php if (count($list) > 0): ?>
  <ul>
    <?php foreach ($list as $element): ?>
      <li><?=htmlspecialchars($element)?></li>
    <?php endforeach; ?>
  </ul>
<?php endif; ?>

While it might be possible to abstract these details away behind a function that prints a list of elements, the ultimate point of such an abstraction would be to free the developer’s mind of the issue of empty lists not being allowed in XHTML. And such a thing would be ill-advised: since the correct behavior is to remove the empty list from the document, the developer should be aware that no UL element will be generated for an empty list, especially since this has implications on the CSS side (which has to accommodate the absence of the list) and the JavaScript side (which has to create the element if it doesn’t exist before adding elements to it).
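For comparison, the same guard can be sketched in JavaScript as a function that builds the markup as a string. The names renderList and escapeHtml are invented for this illustration, and the escaping is only a rough stand-in for htmlspecialchars. The helper returns an empty string for an empty list, so the caller still has to know that no UL element may exist.

```javascript
// Hypothetical helper: rough equivalent of PHP's htmlspecialchars.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Hypothetical helper: returns '' instead of an invalid empty <ul></ul>.
function renderList(list) {
  if (list.length === 0) return '';
  var items = list.map(function (element) {
    return '<li>' + escapeHtml(element) + '</li>';
  });
  return '<ul>' + items.join('') + '</ul>';
}
```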

An important quality of any developer is their ability to identify and handle any corner cases of their domain. An important quality of any domain is to have as few corner cases as possible.

Semantic & Symbolic

Our brain interprets variables in two different fashions: semantic and symbolic. Semantic understanding means understanding the meaning of the variable’s name, where long and detailed names like “newCustomers” convey information in plain English and shorter names like “i” and “x” convey information through conventions. Symbolic understanding relies on recognizing the shape of the variable in several locations in code, and deducing its actual meaning from there. The name itself is not relevant: your brain just goes “hey, that’s the same variable”. Of course, there is some amount of semantic recognition to symbolic variables, usually because we understand the variable as being a symbol.

Mathematics makes much use of symbolic recognition. After all, mathematicians do not write f(number) = 10 × number, they write f(x) = 10x. One-letter variables do have some amount of semantics associated with them (i, j, k, m, n are integers, x, y, z are reals, p is a prime number, q and r are rationals, f, g, h are functions, t is often seen as time, d is a divisor, P is a predicate or a polynomial), but this minimal amount of information is ridiculous when compared to the huge amounts of purely symbolic information one gathers from the use of the letter.

Symbolic recognition works better when the expression is small. Here is the same function three times, in descending order of readability (assuming you’re familiar with the language):

Mathematical notation:

ƒ : A → ∃n∈A. 2|n

Objective Caml:

let f(a) = List.exists (fun n -> n mod 2 = 0) a

C++:

bool f(const std::vector<int> &a) {
  for(std::vector<int>::const_iterator it = a.begin(); it != a.end(); ++it)
    if (*it % 2 == 0) return true;
  return false;
}

I guess there are two things one can learn from this.

First, in a terse language, symbolic recognition works better, which in turn means the programs can be even more terse while retaining their understandability.

Second, don’t bother with long variable names in a two-line function if all the information present in the name can be readily and easily deduced from the two lines in the function.
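As a small illustration of the second point, here is the same function sketched in JavaScript, once with short names and once with long ones (the long function name is made up for this comparison). Nothing in the longer version tells the reader anything the two lines did not already say.

```javascript
// Short names: the shape of the code carries the meaning.
function f(a) {
  return a.some(function (n) { return n % 2 === 0; });
}

// Long names: same behavior, more to read.
function containsAtLeastOneEvenNumber(listOfNumbers) {
  return listOfNumbers.some(function (currentNumber) {
    return currentNumber % 2 === 0;
  });
}
```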

JavaScript Component Tutorial

Earlier this year, I ranted about how the graceful degradation model of jQuery made it hard to create complex components. Also, while working with a team on JavaScript components, I had to review all my previous takes on JavaScript architecture in order to build conventions that an entire team can follow.

Namespacing

To avoid collisions with other libraries, I create an object that uses a name I own. Any naming strategy is possible here, from the Java-like netNicolletCheese (if you have a common project name) to just cheese (if you have a project name that’s fairly unique). Then, any code I write goes into that namespace. I may further add sub-namespaces if I have a lot of code. Either way, the namespace has to exist before anything is added to it, so I add this at the top of every file:

if(!('netNicolletCheese' in this)) this.netNicolletCheese = {};

The basic idea is that since I don’t know what order my files will be loaded in, I have to define the namespace in every single one, while avoiding redefinition. This way, I can include files on an if-needed basis or stick them all together and remove any occurrences of the namespace line except the first.

Then, everything is defined as members of that object. Executing any kind of code in library files is forbidden: only function and object definitions are allowed.
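A minimal sketch of how two files can share the namespace follows. It uses globalThis (the standard name for the global object in current engines) rather than this, so that it also runs outside a browser; the slice and melt members are invented for the example.

```javascript
// --- file one ---
if (!('netNicolletCheese' in globalThis)) globalThis.netNicolletCheese = {};

netNicolletCheese.slice = function (name) {
  return 'slice of ' + name;
};

// --- file two (may be loaded before or after file one) ---
// The guard does nothing here, so file one's members are preserved.
if (!('netNicolletCheese' in globalThis)) globalThis.netNicolletCheese = {};

netNicolletCheese.melt = function (name) {
  return 'melted ' + name;
};
```

Swapping the two halves gives the same result, which is the whole point of repeating the guard.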

Components

A component is a class that contains data and renders itself somewhere on the page. This is different from the jQuery model of graceful degradation that assumes the rendered data is already present on the page and merely changes its layout. Use with caution, since this loses many benefits of graceful degradation like accessibility or search engine friendliness.

A component is always created as follows:

instance = new namespace.component(selector, data, options);
  • It’s always assigned to a variable. It’s a global variable if it’s defined at global scope (obviously, this may only happen in the code on a page, not in library code), and a public member variable of another object if it’s defined within an object. There are no free-floating components: every single one must be accessible from global scope, as this makes command-line debugging way easier and keeps the structure easier to see.
  • It has a first argument, which is a selector (in the jQuery sense). It will be fed to $(…) in order to get the target elements of the component (usually a single one). The typical behavior of a component is to generate some HTML from its internal state and call $(selector).html(…) to display the HTML. The selector is evaluated when the constructor is called, which means you may have to wrap the object initialization in a $(document).ready(…) to wait for the DOM to be instantiated. It also means adding any elements matching the selector later on won’t have any effect on the component.
  • It has a second argument, which is the data used to initialize the component. For instance, if the component is intended to display a list of elements, the data argument would be that list in JSON notation. This makes it easy to generate that data on the server side using one of the many JSON generators, while also making the component easy to instantiate programmatically on the client side.
  • It has an optional third argument, which represents the options that one may provide the component with (such as width, height, speed, effects, and so on). If it’s not part of the main data argument, it’s part of the options. The options are a classic JS record.
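One possible way to handle the optional third argument is to merge it over a record of defaults inside the constructor. The sketch below assumes a helper named mergeOptions and option names width and speed; none of these come from an existing library.

```javascript
// Hypothetical helper: copy the defaults, then overwrite them with
// whatever the caller supplied. Neither argument is modified.
function mergeOptions(defaults, options) {
  var result = {}, key;
  for (key in defaults) {
    if (Object.prototype.hasOwnProperty.call(defaults, key)) result[key] = defaults[key];
  }
  if (options) {
    for (key in options) {
      if (Object.prototype.hasOwnProperty.call(options, key)) result[key] = options[key];
    }
  }
  return result;
}

var defaults = { width: 200, speed: 'fast' };
var merged = mergeOptions(defaults, { width: 320 }); // caller overrides width only
var untouched = mergeOptions(defaults);              // options argument omitted
```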

Component Initialization

The component is instantiated either when the document is ready, by placing the initialization code in the appropriate event handler, such as:

var page = {};
$(function(){ page.instance = new namespace.component(selector, data, options) });

Or it can be instantiated inside another component at an appropriate time.

The constructor itself consists of two distinct operations:

  • Set up any member variables representing the internal object state, using the data argument and options argument.
  • Render the object so that it appears on the page, using the rendering function, and passing the selector to it:
    this.render(selector);

    Note that a component may be created without a target selector, simply by using an empty array as the selector. It will remain unrendered until its render function is manually called with a valid selector as its argument.

Component Rendering

The render function is called during initialization. It’s also called whenever the entire component needs to be redrawn. Some components are small, and are redrawn every time, while other components may choose to only redraw parts of their contents and may therefore use other rendering functions for those parts. The rendering function reliably performs up to six operations:

  • It initializes the target, if it was provided. This lets the calling code change the rendering target dynamically.
    if (typeof(selector) != "undefined") this.$target = $(selector);

    This is generally useful when a component contains other components : a full rendering of the container means the target DOM elements of the inner components have been destroyed and created anew, and the container must therefore notify the inner components that they have a new target to render to.

    Note that the name of the target is always the same: for any component, component.$target is the current target of the component.

  • It optionally determines whether there is a target to begin with, to avoid unnecessary work. This usually takes the form:
    if (this.$target.get().length == 0) return;

    In the case where a component is inside a container, the container will create the component before rendering itself (to make things simpler, rendering assumes all sub-components already exist), and therefore provide an empty array as the selector.

  • It generates the full HTML for the component as a string.
  • It inserts the HTML into the DOM, replacing anything that previously existed. This usually happens as:
    this.$target.html(theGeneratedHtml);
  • It changes the rendering target of any sub-components and tells them to render themselves, usually written by extracting the correct targets from its own target and reverting it to an array of DOM elements:
    this.subComponent.render(this.$target.find('.subComponent').get());
  • It sets up any relevant events on the generated DOM. For instance, if the generated HTML contains a button, the button’s click event may be set to an event handler:
    this.$target.find('button').click(this.onButtonClick)

Component Event Handlers

It would be easy to define the “on button click” event simply as follows:

namespace.component.prototype.onButtonClick = function()
{ this.data.frobnicate(); }

But that wouldn’t work with jQuery, since jQuery re-binds ‘this’ on the event handler before calling it: ‘this’ would be, in this case, the button DOM element instead of our component. This is bad.

The solution is to create an anonymous function that forwards the call to the appropriate member function:

this.$target.find('button').click(function(){this.onButtonClick()})

Whoops. ‘this’ doesn’t follow lexical scoping, which means this code still has the same problem. However, this can be solved quite easily:

var self = this;
this.$target.find('button').click(function(){self.onButtonClick()})
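The difference can be observed without a browser. In the sketch below, FakeButton is an invented stand-in that, like jQuery, calls each handler with ‘this’ set to the button; Counter and bindThroughClosure are equally hypothetical names.

```javascript
// A component with some state and a member function used as a handler.
function Counter() { this.value = 0; }
Counter.prototype.increment = function () { this.value++; };

// Hypothetical stand-in for a button: handlers are called with `this`
// set to the button itself, which is what jQuery does with DOM elements.
function FakeButton() { this.handlers = []; }
FakeButton.prototype.click = function (handler) { this.handlers.push(handler); };
FakeButton.prototype.press = function () {
  var button = this;
  this.handlers.forEach(function (handler) { handler.call(button); });
};

// The working pattern: forward the call through a closure over `self`.
function bindThroughClosure(component, button) {
  var self = component;
  button.click(function () { self.increment(); });
}

var counter = new Counter();

// Passing the member function directly: `this` becomes the button.
var brokenButton = new FakeButton();
brokenButton.click(counter.increment);
brokenButton.press();
var valueAfterBroken = counter.value;   // the counter was never touched

// Forwarding through the closure: the component itself is updated.
var workingButton = new FakeButton();
bindThroughClosure(counter, workingButton);
workingButton.press();
var valueAfterWorking = counter.value;
```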

A short example

We can write a short incrementer: a button with a number that increases every time the button is pressed.

// Create the namespace if it doesn't exist
if (!('netNicollet' in this)) this.netNicollet = {};

// The constructor for our component
netNicollet.counter = function(selector, initial)
{
  // Set up data members (only one)
  this.value = initial;

  // Render the component
  this.render(selector);
}

// The rendering function
netNicollet.counter.prototype.render = function(selector)
{
  // Change the target (if applicable)
  if (typeof selector != "undefined")
    this.$target = $(selector);

  // Early-out if no target
  if (this.$target.get().length == 0)
    return;

  // Generate the HTML
  var html = '<div>' +  this.value + '</div>'
    + '<button type="button">Increment</button>';

  // Insert the HTML into the DOM
  this.$target.html(html);

  // Set up the events
  var self = this;
  this.$target.find('button')
    .click(function(){self.increment()});
}

// The increment operation
netNicollet.counter.prototype.increment = function()
{
  // Change the state
  this.value++;

  // Update the graphics
  this.render();
}

// Call this once the document is ready.
var counter = new netNicollet.counter('body', 1337);

Scanners!

Every piece of (useful) snail mail I receive is scanned and stored both on my computer and on a remote backup server. The scanner itself cost me around 50€ (it’s a Canon LIDE 50 with which I am quite happy, especially since it is perfectly compatible with the latest SANE libraries on my Ubuntu). To cut down the time I waste on the process, I have written a short script that interacts with SANE (for scanning) and ncftp (for uploading to the backup server) and lets me enter elementary information on the command line.

Here is the code:

let prompt s =
  print_string s ;
  flush stdout

let make_directory dirname =
  (* Quote the name so that spaces in it do not break the shell command. *)
  let command = "mkdir " ^ Filename.quote dirname in
    0 = Sys.command command

let base_path = "/home/arkadir/docs/"

let scan_command =
  "scanimage -l 0 -t 0 -x 215 -y 297 --brightness -22 --contrast 22 "
  ^ "--resolution 300 --progress --mode Gray --format=tiff 2> /dev/null"

let scan_to_file filename =
  (* The scanner output is piped to ImageMagick, which writes the target file. *)
  let command = scan_command ^ " | convert tiff:- " ^ Filename.quote filename in
    0 = Sys.command command

let scan_files base =
  let rec aux i =
    let filename = base ^ "/page" ^ string_of_int i ^ ".png" in
      if scan_to_file filename then
        begin
          ignore (Sys.command ("display " ^ Filename.quote filename));
          prompt "Scan successful. Enter any string to continue, nothing to stop. " ;
          let line = read_line () in
            if line <> "" then aux (i+1)
        end
      else
        begin
          prompt "Scan FAILED. Enter any string to retry. " ;
          let line = read_line () in
            if line <> "" then aux i
        end
  in aux 1

let rec upload_files dirname =
  print_endline "Uploading files..." ;
  let command =
    "ncftpput -R -f " ^ Filename.quote (base_path ^ "ftp.cfg")
    ^ " scans " ^ Filename.quote dirname
  in
    if 0 <> Sys.command command then begin
      prompt "Upload FAILED! Enter any string to retry. " ;
      let line = read_line () in
        if line <> "" then upload_files dirname
    end

let () =
  prompt "Document name: " ;
  let line = read_line () in
    if line = "" then print_endline "No filename entered, aborting."
    else
      let dirname = base_path ^ line in
        if not (make_directory dirname) then
          print_endline ("Could not create directory " ^ dirname ^ ", aborting.")
        else begin
          scan_files dirname;
          upload_files dirname
        end

So far, I’m keeping the data as high-resolution PNG files, which means about 8 MiB for every file. I will be moving to the DjVu compression format as soon as possible, and will update my script accordingly.

Bored CSS

I had some free time on my lunch hour today, so I decided to answer a plea for help on the GameDev.net forums.

I am absolutely horrible with CSS. I need something to launch my site with (server is not available ATM)

If you can make this page look presentable i’ll be very happy. Please build on each other work so instead of msging me paste a link in the thread so everyone can see what you done and hopefully make it better.

It took me a few minutes to review the page structure, think of a classic left-right page structure, and write the corresponding CSS. I didn’t have a lot of time, so I couldn’t make the stylesheet fully portable (for instance, the rounded corners only work in Firefox, and some CSS selectors seem to confuse Internet Explorer) thus illustrating the classic conundrum that designing the stylesheet is 99% of the work, and making the stylesheet work across all browsers is the remaining 99% of the work.

And then there’s the 99% of adding jQuery to the page ;)

You can check out the re-designed page, or look at this screenshot to see what it looks like in my Firefox:

PHP Autoloading

Like C, PHP initially started out as an “every file defines functions and variables and classes” language where using an entity assumed that it had already been defined (which, in practice, meant that the file it was defined in had already been included).

This led to several issues:

  • It was hard to find out what file contained what function. It was certainly possible to namespace functions based on the file name, but it required more effort than the amateur team workforce was capable of, and it made function names so much longer.
  • It was easy to mess things up when doing dynamic loading, because one could mistakenly load a dangerous or private file.
  • When serializing classes, one would have to determine where the class was defined when reloading the serialized data, so that the class definition could be loaded again.
  • Every time a class or function was used, the developer would have to check that the corresponding definition file was loaded as well. This led to loading many files that were not necessary just in case they would be used. Since PHP is not compiled, this meant parsing the files and populating the global scope with unnecessary entities.

Which is why autoloading was introduced.

The mechanism behind autoloading is simple: if at any point during the execution of a program the script uses a class that is not defined, the __autoload function is called with the name of that class as an argument. That function is then allowed to load a file or evaluate a script string in order to define that class.

The function obviously determines, using the class name, what source file defines that class, and loads it just in time for the class to be used. This solves all of the above issues in one strike:

  • There’s usually a clean convention for mapping class names to files. For instance, the Zend convention is that class Foo_Bar_Qux is defined in Foo/Bar/Qux.php within the include path. And if you don’t follow the convention, the code doesn’t work (of course, there’s still the issue of writing the code on Windows and then running into Linux case sensitivity).
  • Using Zend_Loader (or writing your own sane __autoload function) you can restrict dynamic loading to a single directory.
  • __autoload also triggers while deserializing.
  • Developers don’t need to include anything : every used class is included, and no class is included unless it’s used.

There is of course a slight performance penalty as the loader has to process the class name to find out what file to load, but bytecode caches work around this issue quite well when performance is important.
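The name-to-file mapping described above is small enough to sketch. The version below is written in JavaScript purely as an illustration of the convention, not as the code Zend_Loader actually runs; classToPath is an invented name, and the validation rule (letters and digits separated by single underscores) is an assumption about what a sane loader would accept.

```javascript
// Hypothetical mapping following the Foo_Bar_Qux -> Foo/Bar/Qux.php convention.
// Returns null for any name that could point outside the class directory.
function classToPath(className) {
  if (!/^[A-Za-z0-9]+(?:_[A-Za-z0-9]+)*$/.test(className)) return null;
  return className.split('_').join('/') + '.php';
}
```

Rejecting anything that is not a plain class name is what keeps dynamic loading confined to a single directory: a name such as ../etc/passwd never reaches the file system.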

Server-Side JavaScript

The more you work on the web these days, the more JavaScript you have to write. Making pages dynamic with one toolkit or another has become a staple of the web developer ecosystem, and it leads to two classic issues:

  • It’s annoying to have two languages communicate across the server/client barrier. It would certainly be easier to use the same language on both, especially if it allows you to do nifty things like passing closures from the server to the client or the other way around. And you can reuse the code.
  • While quite lacking in terms of server-side libraries, ECMAScript is a very consistent, concise and powerful language that supports an elegant function model and an elegant object model. If you already have JavaScript programmers on your team it would be a waste not to use the language for the server side as well.

It looks like there are other people around who think like me. And, to avoid the PHP fiasco where a single team hardcoded some very nasty habits into the language, they started a discussion group to make those decisions ahead of time. I will definitely be joining—let the functional programming invasion continue :)

The Truth of JavaScript

nFriedly posted an interesting refresher on JavaScript boolean evaluation of non-boolean values: Logical Operators and Truthy / Falsy.

It’s worth mentioning that JavaScript evaluates empty arrays as “true”, whereas PHP evaluates them as “false”.
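A short check of the JavaScript side follows; the variable names and the isEmpty helper are made up for the example. Code ported from PHP that relies on an empty array being false needs an explicit length test instead.

```javascript
// In JavaScript every object is truthy, and arrays are objects.
var emptyArrayIsTruthy = Boolean([]);   // true
var emptyStringIsTruthy = Boolean('');  // false
var zeroIsTruthy = Boolean(0);          // false

// To test for "no elements", check the length explicitly.
function isEmpty(list) {
  return list.length === 0;
}
```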

Engrish

It’s virtually impossible to visit every page of a website (well, except for very small websites). And until you try to visit a page or use a feature, you don’t know whether that feature works or not—that’s why testing software in general is so hard. You can’t really know if you’re visiting a cardboard town until you’ve visited everything.

For instance, there’s an MP3 download website called gomar-krakow.com (I’m not making the link clickable) that looks quite professional. That is, until you read the privacy policy:

The Way, what We Use This Information:

We use return email addresses to answer email, what we become. Email Addresses and other given did not collect the wide-spread third party. The Mask of the methods, using to guarantee that your email address is not displayed in clear text within pages of our site.

The School Music Forte is used to totalize, anonymous given for prepare to efficiency website and Music Forte Schools, marketed measures. The School Music Forte does not use such anonymous information for any other integer.

We never use or spread personally identifiable information provided us online in not having relations fetter on this described above.

Our Obligation In Safe Data:

To prevent the unauthorized access, support the accuracy data and guarantee correct use in information, he has reaching procedures to protect and provide information, what we collect online.

As that Address on Us:

If You have other questions or enxiety about this policy of secrecy, please send email to.

The Notice of Change:

The School Music Forte can modify this Politician Secrecy without notice anytime.

The philosophical implications of their poetry are quite moving. And the terms of use page is even worse:

Notice: Trying to get property of non-object in /usr/home/gomar/domains/domain/public_html/templates/mp3-archive/static.tpl on line 3

Notice: Trying to get property of non-object in /usr/home/gomar/domains/domain/public_html/application/modules/content/controllers/IndexController.php on line 46

Notice: Trying to get property of non-object in /usr/home/gomar/domains/domain/public_html/application/modules/content/controllers/IndexController.php on line 47

Notice: Trying to get property of non-object in /usr/home/gomar/domains/domain/public_html/application/modules/content/controllers/IndexController.php on line 47

It seems pretty obvious that those nice shiny “Terms of use” and “Privacy policy” links on every single page are there to provide the illusion of a professional website, even if clicking on them dispels that illusion quite fast.

The same happens when you try to download a product, as a captcha comes up and never registers your input. The whole deal looks suspiciously like a way to extract captcha recognition from clueless humans without anything in return.

As a web designer, recognize that humans are intrinsically shallow, and make sure that you provide the impression of a complete and professional website—just packing functionality together isn’t enough, you have to make it look complete.

As a user, realize that most people who have something to sell already know that cognitive flaw of yours and exploit it to the fullest, and make sure to explore everything just a little bit further to look for inconsistencies or missing pieces before you commit money to a product.

Jamin-Puech

Jamin-Puech is a French company that designs and sells handbags and jewelry worldwide. Still, people couldn’t buy these handbags online, because the brand had no e-commerce website.

This is where I come in: I was the technical lead of the development team brought together by my employer, Tangane, to build a new e-commerce website for Jamin-Puech. It consists mostly of a custom-skinned Magento website with some additional development in various functionality areas.

The result has been online for a short while now. If you’re interested in handbags:

http://www.jamin-puech.com/eboutique/