Tag Archive for 'Objective Caml'

OCaml Submodule Pattern

My code is quite large for an OCaml project. The main RunOrg repository alone contains 46212 lines of OCaml code (plus an additional 5631 lines of OCaml mli files) — and then, there’s the web framework code and the independent plugins code.

It’s is Better™ to have many short files than a few long ones. One reason is incremental compiling with ocamlbuild : that the smaller your files are, the smaller the percentage of code to be compiled when you make a small change. Another reason is that files provide a natural delineation of code that makes it slightly easier to reason about.

The very process of splitting a large file into smaller files is also an excellent way to clean up the code. Every split is an opportunity to move some code to a more generic location — why have a CMember_importParser module when all of its functionality could fit into an OzCsv plugin module ? Even when no such generic solution exists, cutting through the jungle that a 2000-line module contains helps clean up dependencies, identify shared functionality and imagine better ways to design code.

Still, when cutting up code this way, the problem of encapsulation remains. If code that relates to pictures (an upload module, a transform module, a download module, an access rights module) is split across several files, it is desirable to let each file access functions and values from other values that would not otherwise be shown to modules not related to picture processing. For instance, a get_download_link function should be available throughout all picture-related modules, but the rest of the application should use the get_download_link_for_user function that checks whether the user is allowed to download the file.

In order to achieve several nested levels of encapsulation required to work with modules this way, I have come up with a convention :

  • A module name (and thus, a file name) is composed of segments written in camelCase and separated by underscores. For instance, CEntity_view_grid is a module name containing segments CEntity, view and grid.
  • Modules with only one segment are public. Any other module may include, open or otherwise reference them with no limitations beyond what the module signature says. So, CEntity may access MGroup freely.
  • Modules with N > 1 segments are private. They may only be accessed by modules which share the first N-1 segments. So, CEntity_view is available to modules CEntity and CEntity_edit but not CPicture.
  • A module with N segments may export any module with N+1 segments it can access, possibly under a more restrictive signature. For instance, CEntity_view is available to all other modules as CEntity.View.

To make these rules easier to respect, private module dependencies are made explicit by adding a list of module aliases at the top of each file. The top of my cEntity_view.ml file starts with :

module Sidebar     = CEntity_sidebar
module Unavailable = CEntity_unavailable
module Edit        = CEntity_edit
module Info        = CEntity_view_info
module Directory   = CEntity_view_directory
module Grid        = CEntity_view_grid
module Wall        = CEntity_view_wall

It is forbidden to use a private module without going through such an alias, and it is forbidden to define such an alias anywhere except at the top of the file. This makes it extremely easy to determine whether private access rules are respected.

The rule of thumb for splitting files (in my particular coding style) is :

  • Code for separate layers (model, view, controller…) go into separate public modules.
  • For complex code (such as complex rules in model or controller code), consider splitting files larger than 200 lines.
  • For simple code (such as HTML template or JSON serialization definitions), there is no splitting limit except for factoring out common behavior.

Two-way bindings

Quick : how would you design a function that opens a file handle that auto-closes whenever execution leaves a certain scope, even if an exception happens ?

The C++ solution is quite straightforward : have a destructor that closes the file handle, and create the handle as an auto variable:

std::ofstream out = std::ofstream("out.txt");
out << whatever();
// file handle is always closed by the language

On the other hand, OCaml does not have destructors, but can simulate the RAII spirit using a closure: you provide a function that will be called with the file handle as its argument, and the file handle will be destroyed when the function returns or raises an exception.

BatFile.with_file_out "out.txt" begin fun out ->
  BatIO.nwrite out (whatever ())
  (* the file handle is always closed by the language *)
end

This is a special case of a more general principle.

Two-way binding

The standard let keyword performs a one-way binding: bind value to variable then evaluate expression. Two-way binding adds a post-processing step : when you’re done with the expression, do something else. Such a behavior has important consequences for writing concise and readable code.

In my OCaml code, two-way binding is performed with keyword let! that is preprocessed as follows :

let! pattern = value in expression
(* Is translated to *)
value (fun pattern -> expression)

For instance, the above file manipulation script would be written as:

let! out = BatFile.with_file_out "out.txt" in
BatIo.nwrite out (whatever ())

This syntax expresses the actual intent of the code better than the anonymous callback syntax did: bind the file handle to this variable, but don’t forget the post-processing steps.

Here are a few more examples of situations that may be improved by this syntax :

Events and reactive programming

Reactive programs can be constructed either using the typical “register this function to be called whenever this value changes or this event happens” semantics, or  by using binding semantics instead:

let () =
  let! user = User.on_change (#last_login) in
  if user # notify_login then
    Mail.send (user # email)
      ("Someone has logged in to your account at " ^ datetime (user # last_login))

The underlying signature of User.on_change (which registers a listener callback and returns unit) remains the same.

Retry semantics

CouchDB implements transactions with retry semantics: you read a document, compute some changes and try saving them back, and if the document was changed by someone else in the mean time, you will have to try again. It makes sense for the code inside the transaction to be 1° idempotent and 2° wrapped away in a function that 3° takes the latest version of the document as an argument :

let set_title article_id new_title =
  let! article = Database.transaction article_id in
  Database.write article_id { article with title = new_title }

In such a design, the write function would throw a specific exception if a collision occurs, and the transaction function would intercept that exception and try again until the transaction succeeded or a maximum number of retries happened.

Monads

Value binding in monads benefits from having a syntax that actually looks like binding.With the option monad, one can turn this :

match Files.get file_id with None -> None | Some file ->
  match file # owner with None -> None | Some user_id ->
    match Users.get user_id with None -> None | Some user ->
      Some (user # name)

Into a more straightforward version :

let  open BatOption.Monad in
let! file    = bind $ Files.get file_id in
let! user_id = bind $ file # owner in
let! user    = bind $ Users.get user_id in
return (user # name)

Also, one can deal with Lwt threads almost as well as the Lwt-specific syntax extension:

open Lwt
open Lwt_io

let process_lines channel process =
  let loop () =
    let! line_opt = bind $ read_line_opt channel in
    match line_opt with
      | None -> return ()
      | Some line -> loop () <&> process line
  in
  loop ()

Being Silly

let fold init list f = List.fold_left (fun acc x -> f (acc,x)) init list
let map list f       = List.map f list

let probabilities odds =
  let sum =
    let! accumulator, odd = fold 0. odds in
    accumulator +. float_of_int odd
  in
  let! odd = map odds in
  float_of_int odd /. sum

The Syntax Extension

In case you don’t know how to create it, this is the preprocessor file for this syntax extension :

open Camlp4.PreCast
open Syntax

EXTEND Gram
 GLOBAL: expr;

 expr: LEVEL "top"
 [
   [ "let"; "!"; p = patt ; "=" ; e = expr ; "in" ; e' = expr ->
     <:expr< (($e$) (fun $p$ -> $e'$)) >> ]
 ] ;

END;

By the way, I find that this extension has a significant advantage over the Lwt extension – it is readily compatible with syntax highlighting in most editors.

Basic Patterns for Everyday Programming

Lakshen Perera provides a list of basic patterns for everyday programming, illustrated in Javascript and Ruby. I thought it would be interesting to provide an OCaml illustration as well, and perhaps a handful of additional patterns as well.

Verify object’s availability before calling its methods or properties

In many languages, there is a possibility for objects to be missing — whether this is represented as NULL, null, nil or None. Regardless of the language, it is important to keep in mind at all times whether a given value is optional or not. If it is not optional, then you may assert so using a type declaration if your language supports it :

method name  : string (* not optional *)
method email : string option (* optional *)

In other languages, a runtime assert will do, for instance in PHP :

assert (isset($this->name))

OCaml will not require you to explicitly declare all values as optional or mandatory. Instead, it will deduce that information from the way you use each value. For instance, since json_of_string expects a non-optional string argument, you can simply write :

let parsed_content = json_of_string json (* json is not optional *)

If json is an optional string, this will create a compilation error unless you explicitly define what should happen when the string is missing. The most common approach is to decide that the result should be missing too :

let parsed_content = match json with
  | None   -> None
  | Some s -> Some (json_of_string s) (* by definition, s is not optional *)

If using the Batteries library (which, by the way, you should), you can express this more easily :

let parsed_content = BatOption.map json_of_string json

Set a default value with assignments

This advice is almost identical to the previous one : if you need to construct a non-optional value, but only have access to an optional value, you will have to provide a default value. Using standard OCaml :

let role = match person.role with
  | None      -> `Guest
  | Some role -> role

Using Batteries, there is an almost equivalent version :

let role = BatOption.default `Guest person.role

This version is almost equivalent, because it will evaluate the default value even if it is not required. Let’s assume that the default value is itself provided by a complex computation, such as a database access :

let role = match person.role with
  | None      -> readfrom database (* only executed if person.role is missing *)
  | Some role -> role 

let role = BatOption.default
  (readfrom database) (* Always executed, even if unnecessary *)
  role

The Batteries library provides an alternate function for that :

let role = BatOption.map_default
  readfrom database (* only executed if person.role is missing *)
  role

Checking whether a variable equals to any of the given values

If the values have a legitimate reason to be strings, integers or other types with unlimited numbers of values, then « is in list » predicates are the preferred choice :

if List.mem current_day ["Monday";"Wednesday";"Friday"] then
  (* do something *) 

Days are a quite bad example, because these are better represented as a variant (which then type-checks whether you have written Mornday instead of Monday). If you have already defined the type of weekdays, then :

if List.mem (current_day : weekday) [`Monday;`Wednesday;`Friday] then
  (* do something *)

If this is a one-shot check, or if you would rather not define a weekday type yet, you should instead go for an exhaustive pattern-matching :

 match current_day with
  | `Monday | `Wednesday | `Friday ->
    (* do something *)
  | `Tuesday | `Thursday | `Saturday | `Sunday -> ()

Extract complex or repeated logic into functions

This is a fairly fundamental concept — but it consists in two distinct parts. There is a separation and naming part (you pull out a piece of code and give it a name, which helps understand what it does and how it relates to the rest of the program) and there is an extraction and reuse part (the piece of code is pulled out into a more globally accessible location and parametrized, so that it may be used in other places).

With the above example, simple separation-and-naming would be :

let is_discount_day = match current_day with
 | `Monday | `Wednesday | `Friday -> current_date > 20
 | `Tuesday | `Thursday | `Saturday | `Sunday -> false
in

if is_discount_day then 
  (* do something *)

The variable is defined in the same scope it is use in, and it assumes that current_day and current_date values have been defined previously in that scope. Extraction-and-reuse would go further :

type weekday =
  [ `Monday | `Tuesday | `Wednesday
  | `Thursday | `Friday | `Saturday | `Sunday ]

let is_discount_day (day:weekday) date =
  List.mem day [`Monday;`Wednesday;`Friday] && date > 20

...

  if is_discount_day current_day current_date then 
    (* do something *)

Now, is-discount-day is a global function available from everywhere in the code, and it uses the provided parameters to determine whether this is indeed a discount day.

Memoize the results of repeated function calls

OCaml has several ways to perform memoization. One of them is lazy evaluation :

val discount_day = lazy (is_discount_day current_day current_date)
method discount_day = Lazy.force discount_day

The lazy expression will only be evaluated the first time the Lazy.force function is called on it.

Note that if the current day or current date can change, then the memoization actually breaks things !

Memoization is also helpful when dealing with a function that requires arguments, in which case a different result will be provided for each argument set. A common solution is to use a hash table to store these :

let fibonacci =
  let memo = Hashtbl.create 100 in
  let rec fib n =
    try Hashtbl.find memo n with Not_found ->
      let result = fib (n-1) + fib (n-2) in
      Hashtbl.add memo n result ; result
  in fib

This works fine for short-lived functions — don’t do this for global functions that might stick around for a long time, because the memoization hash table will grow and its contents will never be garbage-collected. If you really have to, use a weak hash table, such as Batteries’ BatInnerWeaktbl, so that the garbage collector may reclaim the memoized values when it runs out of memory.

Also don’ t overdo memoization — it only works when arguments are reliably passed more than once and the time to compute the value is significantly larger than the time to retrieve and store it and it is worth the memory usage and the function has no side-effects.

Use the seven list manipulation primitives

Almost any processing on collections of items can be expressed in terms of seven fundamental patterns. Recognizing those patterns can help improve the clarity of both the code and the underlying algorithm.

1. Map transforms a list into another, item by item, in linear time. Use a map operation when all you need is a one-to-one transformation. The line below extracts three recipes from the database using their identifier.

let recipes = List.map from_database [ "omelet" ; "cheeseburger" ; "risotto" ]

2. Reduce transforms a list of values into a single value by repeatedly applying a function that combines together two values into one. The typical example is a fold, which uses a function to combine each list element, in turn, with an accumulator. It can be used to extract the sum of values in a list, for example :

let total = List.fold_left (+) 0 [ 5 ; 6 ; 3 ; 8 ; 9 ; 0 ; 7 ; 6 ]

This transform allows a preliminary map step which transform the values inside the list into values that can be combined. For instance, to find the age of the oldest person in a list of people :

let oldest = List.fold_left (fun acc person -> max age person.age) 0 people

3. Extract works like a map, but the transformation function returns zero, one or more results for each call. All the results are included in the final list. The most elementary implementation is literally to have a map (that transforms a list into a list of lists) followed by a concatenation (that transforms a list of lists into a list). For instance, to get all the ingredients involved in a list of recipes :

let ingredients =
  List.concat (List.map (fun recipe -> recipe.ingredients) recipes)

4. Filter is a subset of Extract where the transform may not return more than one result — but it may still return none, so its result is simply an optional type. For instance, to extract the list of all recipes that have a wine recommandation along with their recommended wine :

let wines = BatList.filter_map
  (fun recipe -> BatOption.map (fun wine -> recipe,wine) recipe.wine) recipes

The OCaml language also provides a standard List.filter function which keeps values for which a property is true. For instance, to get the list of recipes that have a wine recommendation :

let have_wines = List.filter (fun recipe -> recipe.wine <> None) recipes

This approach is weaker — the wines in the resulting list are still treated as optional by the type system, so you will need bogus pattern matching for a case that never happens (no wine) to extract the actual wine values. A filter_map lets you encode the filter property in the type of the result, which makes using the filtered list easier.

5. Sort unsurprisingly sorts the list. The canonical sort — using the canonical order relationship — is a theoretical curiosity, and in practice most sorts use a projection function p such that A < B iff p(A) < p(B). This is best illustrated in SQL, in the form of the ORDER BY <projection> statement. Two useful helper functions :

let project compare p a b = compare (p a) (p b)
let descending compare a b = compare b a

For instance, to sort the list of recipes based on how long each of them takes :

let by_duration =
  List.sort (project compare (fun recipe -> recipe.duration)) recipes

6. Group works like Sort, but further regroups « equal » items together by returning a list of lists. It works using a comparison function that is usually based on a projection function. For instance, to get three lists containing one-star, two-star and three-star recipes :

let by_stars =
  BatList.group (project compare (fun recipe -> recipe.stars)) recipes

There is a special case when the projection function returns booleans, which is known as a partition. For instance, to extract recipes that are desserts and recipes that are not :

let desserts, non_desserts =
  BatList.partition (fun recipe -> recipe.is_dessert) recipes

7. Search extracts one element from the list (if possible) based on a certain condition or property. Elementary searches are « first element » and « last element. » More complex searches : finding a value by key in a list of key-value pairs using List.assoc or using a predicate using List.find. The heavy-duty search tool is BatList.find_map, used below to find a recipe that is recommended by at least one person, and the recommending person :

let recipe, recommender = BatList.find_map
  (fun recipe -> BatOption.map (fun person -> recipe, person) recipe.recommended)
  recipes

These seven patterns can be used in conjunction to perform almost any algorithm on collections, sequences or lists. For a more complex example, assume we need the ten ingredients that appear the most often in desserts. We would filter recipes by desserts, extract their ingredients, group them by name, sort the sub-lists by list length and take the name of the first element of the first ten sub-lists :

open BatPervasives (* for operator |> *)
open BatList

let ten_best_ingredients recipes = recipes
  |> filter (fun r -> r.dessert)
  |> map (fun r -> r.ingredients) |> concat
  |> group (project compare (fun i -> i.name))
  |> sort (descending (project compare length))
  |> take 10
  |> map hd
  |> map (fun i -> i.name)

Article © brewbooks — Flickr

Ozone Templating

There have been recurring requests about an in-depth explanation of how Ozone — our in-house OCaml web framework — handles HTML templates. So, here it is.

A template is usually understood by everyone to be « HTML with holes » that is filled using values from the application itself. It is, in a sense, a DSL that is restricted to describing how HTML should be built.

Here is an example of template Ozone could use, stored in file users.htm :

<h1>{t:users.title}</h1>
<ul class="userlist">
  {{list:
    <li id="{v:id}">
      <img src="{v:img}"/>
      <a href="{v:url}">{v:name}</a>
    </li>
  }}
</ul>

<script type="text/coffeescript" params="save_url">
list = @$.find 'ul'

save = =>
  ids = [];
  list.children('li').each ->
    ids.push $(@).attr 'id'
  @ajax save_url, ids

list.children('li').sortable
  change: save
</script>

<style type="less">
.userlist {
  list-style-type: none;
  li img {
    float: left;
    margin-right: 5px;
    width: 50px;
    height: 50px;
  }
}
</style>  

The template for a sortable list of users contains three things:

  • A piece of HTML, which is the actual « HTML with holes » to be filled later. The holes are marked in orange.
  • A piece of CoffeeScript, which will be extracted from the template file, compiled to javascript and appended to a site-wide javascript file. It will be replaced, in the template, by a hole that will call the extracted javascript with additional parameters provided by the application (in orange).
  • A piece of LESS CSS, which is compiled to CSS and appended to a site-wide CSS file.

These are not sections — they can appear in any order as long as the elements and attributes are respected so the pre-build tool can identify and extract the CoffeeScript and CSS bits.

Let’s examine each of them in order.

The HTML Template

This is the meat of the template. In order to improve application performance, loading the templates is a multi-step operation that involves intermediary storage formats.

The first step consists in reading in all the necessary templates, parsing them to determine that no variables are undefined, and storing them as a JSON blog in the underlying CouchDB database. This is a manually triggered operation that happens whenever we modify the templates (it’s part of our deployment procedure). This step may also involve a bit of cleanup, such as removing semantically irrelevant spaces from the HTML (this cannot be done earlier, because some templates are plaintext instead of HTML, and only the application knows which is which).

The second step happens whenever a new instance of our application begins — maybe it died and needed to restart, maybe Apache decided it needed another worker process to handle a surplus of request, or maybe we added a new server to our web farm. The startup process of our application server does not read anything from the disk — instead, it will read in all the template data from the database, along with all the other bits of configuration: internationalization strings, third party API keys, feature branch triggers, and so on. Then, it will compile every template down to optimized closure-based opcodes for a hole-filling virtual machine.

The third step happens whenever a bit of HTML needs to be rendered. The application provides the hole-filling virtual machine with a data object and a « writing stream » which is either the HTTP request stream or a JSON serializer stream, depending on whether the request is normal HTTP or AJAX. This is an extremely fast operation where no parsing or checks are performed.

On the application side, loading a template involves three things:

  • Declaring the type of the data object expected by the template.
  • Declaring the source file from the template (as a function of the language).
  • Declaring the hole-to-value mapping  to be used.

Here’s that loading code for the above template file:

module User = Loader.Html(struct
  type t = <
    id   : Id.t ;
    url  : string ;
    img  : string option ;
    name : string
  > ;;
  let source  _ = "users/list"
  let mapping _ = [
    "id",   Mk.esc (fun x -> Id.to_string (x # id)) ;
    "url",  Mk.esc (fun x -> x # url) ;
    "img",  Mk.esc (fun x -> BatOption.default img404 (x # img)) ;
    "name", Mk.esc (fun x -> x # name)
  ]
end)

module UserList = Loader.Html(struct
  type t = <
    users : User.t list
  > ;;
  let source  _ = "users"
  let mapping lang = [
    "list", Mk.list (fun x -> x # users) (User.template lang)
  ]
end)

One view is defined and loaded for every independent piece of HTML in the template. Here, there is an User view which represents the list item for a single user, repeated zero, one or more times ; and there is the UserList view representing the wrapper in which those list items will be placed.

The {v:foobar} syntax defines a variable hole. The corresponding view MUST define a mapping for that variable, or an error will occur at deployment time.

The {{foobar: }} syntax is a variant: in addition to declaring a variable hole, it also defines such a sub-view, which can be loaded using template/foobar as the path.

The {t:foobar} syntax defines a translation hole. The template engine will automatically load the corresponding term from the internationalization dictionary used to render the template.

The Mk.esc and Mk.list are binding instructions which are used to compile the template to a virtual machine. The common binding instructions are:

  • Mk.esc f applies f to the data object, which returns a string. That string is then HTML-escaped and output.
  • Mk.str f is the same as above, but the string is not HTML-escaped.
  • Mk.i18n f is the same as above, but the string is translated as an internationalization term.
  • Mk.list f t applies f to the data object, which returns a list of data objects compatible with template t. That template is then used to render those data objects in order.
  • Mk.list_or f t e is the same as above, but if the returned list is empty, it instead uses template e to draw a « list is empty » message.
  • Mk.sub f t applies f to the data object, which returns a single object compatible with template t. That template is then used to render the object.
  • Mk.sub_or f t e is the same as above, but f returns an optional type. If it is missing, then template e is used to render an « object is missing » message.
  • Mk.text f provides f with the current writing stream and internationalization object, so that it may directly write HTML to the output. This is how most rendering helpers such as « render a currency amount » are used.
  • Mk.box f is the same as above, but the writing stream supports the addition of arbitrary javascript code to be executed by the client as part of rendering the template. This is how javascript-dependent rendering helpers such as « render a datepicker » are used.

The data type is defined in the view itself, either explicitly (as I did above for the sake of clarity) or by using an existing type from your application — if the application already had an user module with the appropriate data type, I could have used that type instead.

By specifying views in this way, the data required to render a template is made available to the compiler for type-checking, and missing bindings are detected during deployment (usually to a local test server). This has made template-related errors exceedingly rare — once the HTML is done, it becomes extremely hard to use it wrong.

Although this feature is not currently in use, the virtual machine semantics also allow compiling it down to JavaScript. This would allow us to send the rendering code to the client as a one-time cost, and send a much smaller data package through AJAX whenever something new needs to be rendered.

The CoffeeScript Layer

We use CoffeeScript because it’s more elegant, shorter, and includes a compiling-to-javascript step that lets us detect syntax errors at deployment time. Yes, compile- and deployment-time are my favorite buzzwords, because I enjoy the feeling of safety that they bring.

As mentioned above, the actual CoffeeScript is removed from the template in a pre-processing step, and replaced with a hole that says « call JavaScript function #33 now » that happens to define a list of parameters matching the params attribute of the original script element.

So, starting with the script element from the example above:

<script type="text/coffeescript" params="save_url">
list = @$.find 'ul'

save = =>
  ids = [];
  list.children('li').each ->
    ids.push $(@).attr 'id'
  @ajax save_url, ids

list.children('li').sortable
  change: save
</script>

If this is the 33rd script tag encountered by the preprocessor, then it would append the following to the complete CoffeeScript file:

@j33 = (save_url) ->
  list = @$.find 'ul'

  save = =>
    ids = [];
    list.children('li').each ->
      ids.push $(@).attr 'id'
    @ajax save_url, ids

  list.children('li').sortable
    change: save

And it would be replaced in the template file with this:

{j:j33:save_url}

This syntax (which can be used manually, although it should be avoided) is a javascript hole, it runs the specified function and provides by-name values for the arguments. The parser would notice that we are declaring an HTML view instead of a JS/HTML view and complain about it, so we would have to go back and re-define it:

module UserList = Loader.JsHtml(struct
  type t = <
    users : User.t list ;
    save_url : string
  > ;;
  let source  _ = "users"
  let mapping lang = [
    "list", Mk.list (fun x -> x # users) (User.template lang)
  ]
  let script  _ = [
    "save_url", (fun x -> Json_type.String (x # save_url))
  ]
end)

I have used Loader.JsHtml instead of Loader.Html, and defined a secondary mapping that is specific to JavaScript parameters, and which uses the data object to return JSON values.

How is the JavaScript called? Well, it really depends on how your JavaScript library handles it. On non-AJAX HTTP, Ozone will try to inject all JavaScript calls in a script element at the end of the HTML body. In AJAX mode, Ozone allows you render a template to a JSON object representing both the HTML and the JavaScript together, and it is the responsibility of the code that made the AJAX request to receive that object, place the HTML wherever applicable, and then “run the JavaScript”.

By convention, the JavaScript is called using a client context as itsthis value. The client context is an object which may contain whatever the caller finds interesting to place there, along with a variable named $ which should be a jQuery selection containing the root element of the previously rendered HTML. Hence, @$.find 'ul' would select the list in the rendered HTML, instead of all the lists on the page.

The LESS CSS Layer

This is the least interesting of all three layers. The LESS CSS code is extracted, appended to a single file, and compiled to CSS (which, again, is an useful deployment-time syntax check). The point of this feature is simply to let the designer place element-specific CSS next to the element, instead of having it exist in an external file and cause trouble with asset garbage collection (can I remove this rule or is it still used anywhere?) External files still exist, though, for CSS rules that are not limited to a single template.

Bonus : the triple hash

How do I define some code that should be called when a button is clicked? Defining it directly in the onclick method is ugly, hard to read and does not let the application provide parameters, so what else can I do?

The solution is to use an intermediary global object that happens to be the same for the entire file — a pattern that stores any template-related JS in a global variable named after __FILE__ !

Yes, it is a hack, but it’s a simple and useful one.

The only difference is that __FILE__ is spelled ###.

<button type="button" onclick="###.frobnicate()">{t:frobnicate}</button>

<script type="text/coffeescript" params="message">
###.frobnicate = ->
  alert message
</script

Article Image © gdbg12 — Flickr

Functional Programming

Yesterday evening, I was present at Paris Hackers, a meetup for Hacker News enthusiasts — and a «hacker» in this context is not someone who breaks into software systems, but someone who is passionate beyond words about how software works. But the definition is vague, and that vagueness is part of my surprise.

We had a little chat on the topic of recent buzzwords — concepts that people were interested in, but not everyone understood. One of them was Functional Programming. The speaker asked who knew enough about it to explain to the others, and I naturally raised my hand: I was taught functional programming formally, I taught it myself, I implemented a functional programming language, and I am using functional programming professionally on a daily basis.

My surprise was that I was the only one. I had sort of assumed that a majority of «hackers» would know what functional programming is. I will have to revise that definition.

« Having Functions »

Functional programming itself has a definition problem: depending on whom you ask, you can get two different definitions on functional programming. One definition is what I would call pure functional programming, and the other is better summarized as « having functions »

Of course all languages have something they call «functions» but I strongly believe that there is a list of minimum requirements before you are allowed to say that.

  1. You should be able to define a FOOBAR anywhere. At global scope. Inside a class. Inside another function. Inside an expression. It can be an anonymous FOOBAR, or it can be given a name, depending on the situation.
  2. You should be able to store a FOOBAR in a variable, pass it as an argument to a function, and return it from a function. If your language is strongly typed, it should have an easily described type as well.

Replace FOOBAR with «integer» or «string» and you will notice that without these requirements one can hardly say a language «supports integers» or «supports strings» — what if you could not store integers in variables, or write string literals as part of expressions?

These requirements are called «having FOOBARs as first-class citizens» by the way. In C for instance, functions and arrays are second-class citizens, and integers are first-class citizens. In JavaScript, functions are a first-class citizen.

Allowing functions helps make a language a lot more expressive. Let’s think of a quick example: performing an HTTP request and displaying the result in a window. In pseudocode that looks shamefully like JavaScript code, it would look like this:

function display(url) {
  var page = http(url);
  return new Window(page);
}

This function has an obvious limitation: if it takes five seconds to load the data through HTTP, then the program will freeze for five seconds and only then display the window. A better solution would be to make the request asynchronous: a window in a «loading» state is returned immediately, and a background process takes care of the HTTP requests and fills the window with the contents once they have been received.

A clean way of encapsulating this would be to let the http function handle the asynchronous behavior. The calling code would only need to provide a description of what needs to be done once the contents have been received. There are two ways of doing this in an object-oriented world.

Object-oriented solution #1: inherit from an http request class.

class GetPageRequest extends HttpRequest {

  // We need to know what window to put the contents in
  function GetPage(window,url) {
    super(url);
    this.window = window;
  } 

  // Override this function to react to the data being received
  function onSuccess(page) {
    this.window.setContents(page);
  }
}

function display(url) {
  var window = new Window(loading);
  var request = new GetPageRequest(window,url);
  request.runAsynchronous();
  return window;
}

Object-oriented solution #2: implement an «HTTP success event handler» interface.

class OnReceivePage implements IHttpRequestSuccessHandler {

  // What window do we need to fill ?
  function OnReceivePage(window) {
    this.window = window;
  }

  // Handle success (as previous example)
  function onSuccess(page) {
    this.window.setContents(page);
  }
}

function display(url) {
  var window = new Window(loading);
  http(url, new OnReceivePage(window));
  return window;
}

Both solutions are quite long, because object-oriented solutions involve a certain amount of syntactic overhead. In both cases, the only code that is actually relevant to our work is the contents of the onSuccess function. What if the language allowed us to define that function directly, and passing it to the HTTP request without any syntactic fuss?

Functional solution #1: local function definition.

function display(url) {
  var window = new Window(loading);
  function onSuccess(page) { window.setContents(page); }
  http(url, onSuccess);
  return window;
}

In fact, why bother with giving the function a name? Its contents are obvious enough, so it would be shorter and cleaner to just passing it to the HTTP request without any name.

Functional solution #2: anonymous function definition, or «lambda»

function display(url) {
  var window = new Window(loading);
  http(url,function(page) { window.setContents(page); });
  return window;
}

The latter example is syntactically correct JavaScript code, by the way. As you can see, the code is both shorter and more readable than the class-based version. In general, using inheritance or interface implementation to define a single method is always significantly more wasteful that using an anonymous function.

There’s a third requiremend that I left unvoiced. Look back at the previous examples again. Notice how the class-based examples have to store the window object explicitly, but the functional versions use the variable directly? This is known as «closures» : when defining a function inside a function, the defined function actually carries with it all the variables that were present when it was defined. Functional languages always have closures, because any non-trivial operation you might wish to place in an anonymous function will involve accessing data that was present near the definition.

In summary: functional programming involves the ability to manipulate functions as you would manipulate any other data type (define anywhere, store, pass, return…) and allows writing more concise programs whenever you need to pass, store or return some behavior to be executed later.

Examples of functional languages:

  • JavaScript
  • Lisp (including Common Lisp, Scheme …)
  • ML (including SML, OCaml, F# …)
  • Haskell
  • Ruby
  • C#

Examples of near-functional languages:

  • PHP only supports lambdas and closures since 5.3, and closure syntax is suboptimal.
  • Python has 99% support, the only issue is that only one-line lambdas are allowed.
  • XSLT 2.0 has no anonymous functions, but is otherwise functional. I kid you not.

Examples of non-functional languages:

  • Java
  • C
  • C++

Pure Functional Programming

This is the other definition of «Functional Programming» that you can hear, and the confusion is understandable: languages that match definition #2 have long been the only examples of definition #1 as well.

The relevant concept is purity: can the impact of a function on the rest of the program be described only in terms of its return type, and can the impact of the rest of the program on the function be described only in terms of its arguments?

Consider the signature of an interface method:

public int frobnicate(Foo f);

There are many ways in which that function can affect the rest of the program:

  • It could call a global function to perform changes in the global state, such as altering a singleton, setting a global variable, printing data to the screen, sending data over a socket…
  • It could change member variables of the object it is called on.
  • It could call methods on its argument.
  • It could return a new value that is then used by the calling code.

There are also many ways in which the rest of the program could affect the behavior of the function:

  • It could depend on global variables to decide what it should do.
  • It could read global state (user input, files, sockets…)
  • It could access the data of the object it is called on.
  • It could access the data of its argument.

The entries marked in bold can be deduced from the method signature: having a return type means you want to return something, being non-static means you wish to read the state of the object you are called on, having an argument means you want to read data from that argument. The other entries are possible, but not readily apparent in the signature.

Pure functional programming is the principle of least surprise: sure, those entries that are not in bold are possible, but it would be surprising if they happened, so they are in fact not allowed. A pure function can only read data from its arguments and can only write data by returning it.

Also, pure functions imply that data is immutable — if functions are not allowed to change anything, then they are not allowed to change data structures, so the data structures are by definition immutable.

By the way, pure functional programming is all about what you cannot do — so you could in theory write functional programming in any language just by following coding conventions to that end. In practice, immutability is quite hard to respect if there is no syntax for making it easier — you spend a lot of time creating manual copies of objects in order to change a single field.

Working with pure functions and immutable data structures involves both benefits and trade-offs when compared to non-pure programing. Think of it as a slider which goes between «0% Pure» and «100% Pure» and you pick one value for your entire program. Many algorithms and concepts get easier to understand, change and manipulate as the slider moves towards the 100% end, but there are some concepts that are intrinsically non-pure and actually get harder and harder to think about as your slider moves away from the 0% end. For instance, working with a persistent data store is an intrinsically mutable endeavor, and it’s hard (but possible) to think about it in pure terms.

I find that a slider value of 95% in a language that makes such a value possible is the optimal place to be — the 5% includes writing code that is truly non-pure (such as database manipulation, responding to HTTP requests…) as well as code that is non-pure on the inside but is actually pure from the outside (such as memoization and caching).

Article Image © Tim Olson — Flickr

Using OCaml from CouchDB views

What follows is directly taken from my latest GitHub project, which provides an adapter for transforming OCaml applications into CouchDB view servers. The programmer writes an OCaml application that exports one or more map and reduce functions using the API found in module CouchAdapter, and creates a CouchDB design document that specifies the application path and the name of the exported functions. The adapter server then receives evaluation requests from CouchDB and passes them to the application, and returns the result back to CouchDB.

The objective of this project is not to support writing OCaml code directly into views! The OCaml code should follow the standard build procedure, the only exception being that the CouchAdapter API is used to export that code and make it available to the adapter server.

Requirements and setup

This adapter uses json-wheel for representing JSON values, and the build process requires OCamlBuild. There are no other direct dependencies. Building the adapter server runServer is fairly straightforward: make byte or make native generates runServer.byte or runServer.native respectively. Move the resulting application to an appropriate location on your system and allow CouchDB to execute it. My suggestion is:

cp runServer.native /usr/bin/couch-ml-adapter
chmod a+x /usr/bin/couch-ml-adapter

I will be assuming this convention for the rest of this manual. Once the server is built and installed, you need to configure CouchDB to actually use that adapter to execute OCaml views. Edit the local.ini configuration file of your CouchDB server (usually found in /etc/couchdb/local.ini) and add the following lines:

[query_servers]
ocaml=/usr/bin/couch-ml-adapter

Depending on your configuration, there might already be a [query_servers] section. If that is the case, add the second line to that section. If you have trouble configuring your query servers, read the CouchDB documentation.

Errors that happen while executing the adapter will appear in the CouchDB logs (usually found in /var/log/couchdb/couch.log).

Architecture

Query Servers

The CouchDB server usually evaluates map and reduce functions only when a design document containing those functions is queried by a client, by following this process:

  • If the query server configured in local.ini is not already running, start it.
  • Send various instructions on the query server’s STDIN, such as “apply the map function F to document D”
  • Read the results on the query server’s STDOUT.

The Adapter Server

The adapter server provided by this project is one such query server. When it must apply a function to a document, it does the following:

  • Determine which application provides the function.
  • If the application is not already running, start it.
  • Send the request to the application’s STDIN, read the answer on its STDOUT.
  • If the application responds with results, send these back to CouchDB.

In short, the overall architecture looks like this:

+---------+         +------------------------+
|         | <-----> |  Haskell Query Server  |
|         |         +------------------------+
|         |
|         |         +------------------------+
| CouchDB | <-----> | Brainfuck Query Server |
|         |         +------------------------+
|         |
|         |         +------------------------+
|         | <-----> |                        | <-----> [ Application /home/nicollet/test ]
+---------+         |  OCaml Adapter Server  |
                    |                        | <-----> [ Application /usr/bin/foo ]
                    +------------------------+

The programmer should therefore write an application which reads the adapter requests on STDIN, runs the requested functions on the provided documents, and sends the results back on STDOUT. All the boilerplate involved is handled by the CouchAdapter module, so that the actual development process you will be following is:

  • Include any modules you might need to use in your view.
  • Define the map or reduce function as an OCaml function.
  • Register that function as being exported with CouchAdapter.export_map and CouchAdapter.export_reduce.
  • Call CouchAdapter.export()

Importing From CouchDB

CouchDB references map and reduce functions in design documents, using the following syntax:

{ "_id" : "_design/..."  ,
  "language" : "...",
  "views" : {
    "foobar" : { "map" : ... }
    "quxbaz" : { "map" : ... , "reduce" : ... }
  }
}

In order to use the OCaml adapter, one must first set the language property to "ocaml". Then, to reference the function "extract_foo" defined in application /usr/bin/foo, one would write:

"views" : {
  "foobar" : { "map" : ["/usr/bin/foo", 1, "extract_foo"] }
}

The same syntax applies for reduce functions as well. The three components of the definition are 1- the absolute path to the application that exports the function (this is how the adapter server knows what application to run), 2- a version number discussed in the next section and 3- the name under which the function is exported from that application.

Function versions

For performance reasons, once an application or query server has been started, it is never shut down. This only causes problems when there’s a new version of the code that needs to be deployed. The adapter server provides a versioning system which automatically detects that a function.

A CouchDB design document requests a function that is at least a certain version. For instance, ["/usr/bin/foo", 42, "extract_foo"] indicates that the adapter server should find version 42 or greater of the function "extract_foo" exported by application usr/bin/foo. If that application is currently running and the function is either missing or older than version 42 then the application is shut down and started anew in a completely transparent fashion.

Note that if rebooting the application still fails to provide an appropriate version of the function, the adapter server will report an error, which CouchDB will propagate to the client. This makes all the views inside the design document unavailable until an appropriate version of the application is deployed.

Failing to manage function versions both in CouchDB and in the application can lead to data inconsistencies, as different documents are processed by different versions of the same function. Only a global version change which prompts a full refresh of the view and reloads the application can ensure data consistency in the face of code changes.

Creating a map function

A map function must follow the signature json -> (json * json) list: the argument is the entire document being processed, and the output is a list of key, value pairs being output by the map function.

For example, suppose you already have an User module in your application, which is used among other things for reading and writing users to the CouchDB database:

type t = {
  active  : bool ;
  name    : string ;
  email   : string ;
  picture : string
}

let of_json = (* ... *)
let to_json = (* ... *)

Then you can rely on that module to define a map function with the above signature, and export it using the CouchAdapter module:

open Json_type

let user_by_email json =
  try let user = User.of_json json in
      [ String user.User.email , Null ]
  with _ -> []

let () =             

  CouchAdapter.export_map
    ~name:"user_by_email"
    ~version:1
    ~body:user_by_email ;

  CouchAdapter.export ()

Should you decide to update the view code, make sure that you also increment the version number:

open Json_type

let user_by_email json =
  try let user = User.of_json json in
      if user.User.active then [ String user.User.email , Null ]
  else []
  with _ -> []

let () =             

  CouchAdapter.export_map
    ~name:"user_by_email"
    ~version:2
    ~body:user_by_email ;

  CouchAdapter.export ()

Creating a reduce function

There is no distinction made between reduce and rereduce. While this causes a slight loss in functionality it also makes writing reduce functions less arduous given the OCaml type system. The signature of reduce functions is simply json list -> json.

For example, let’s assume that an Article module is already defined in your main application:

type t = {
  title : string ;
  html  : string ;
  tags  : string list
}

let of_json = (* ... *)
let to_json = (* ... *)

We now define a map function and a reduce function that counts how many articles are published for every tag.

let by_tag_map json =
  try let article = Article.of_json json in
      List.map (fun tag -> String tag , Int 1) article.Article.tags
  with _ -> []

let by_tag_reduce json =
  Int (List.fold_left (fun acc -> function Int i -> acc + i | _ -> acc) 0 json)

let () =
  CouchAdapter.export_map "by_tag-map" 1 by_tag_map ;
  CouchAdapter.export_reduce "by_tag-reduce" 1 by_tag_reduce ;
  CouchAdapter.export ()

And the CouchDB design document is as follows:

{ "_id" : "_design/article",
  "language" : "ocaml",
  "views" : {
    "by_tag" : { "map"    : ["/path/to/app", 1, "by_tag-map"    ],
                 "reduce" : ["/path/to/app", 1, "by_tag-reduce" ] }
  }
}

Article image © Miriam Rossignoli — Flickr

Short OCaml Diffs

I’m working on some wiki functionality right now, and one of the basic properties of a wiki is that history is kept for all pages. And this should be done elegantly, without actually keeping a copy of every version, especially since many changes are actually just small fixes. This is usually done by keeping a diff — a small piece of data that stores the changes needed to get from version A to version B. Computing an arbitrary diff is a fairly easy task, but computing a diff that is as small as possible is actually challenging, because there are many ways in which a diff can be created and only a few of these are optimal.

The standard approach to computing diffs is to find the longest common subsequence — a non-contiguous sequence of characters that is found in both versions, such that one may turn version A into the sequence only by removing characters, and then turn that sequence into version B only by adding characters. The list of additions and removals is what is commonly referred to as a diff, because this is how the UNIX program diff works.

I am not happy with this solution. For one, it works based on lines, which is fairly acceptable for source code that is cleanly split into many lines, but the average wiki page is not as clean. Instead, every paragraph tends to stay on its own line and is only word-wrapped for convenience. Using a line-based diff tool on such data would cause the removal and addition of entire paragraphs, which is certainly not optimal if the change was a minor typo fix on a single word. And since the diff algorithm itself is quadratic, it cannot be coaxed to work on characters instead of lines : a one-millisecond line-based run would take ten seconds on characters !

Another shortfall of the longest common subsequence approach is that it fails to detect sentence or paragraph swaps, yet these happen quite frequently in the rewriting stages of a wiki page. Indeed, regardless of how clever the diff algorithm is, the diff format itself provides no primitives for moving data around, only “add this” and “remove that” are supported.

I have designed and implemented a custom diff algorithm. Instead of working based on “add” and “remove” primitives, it carries a data segment where all the new content is stored, and works with “blit from old” and “blit from data segment” instructions to construct the new version from the old. Its format is optimized to be stored as JSON, and the diff application algorithm is simple enough to be handled by a short piece of JavaScript if necessary. It’s on GitHub, by the way.

For instance, consider the sentences “The quick brown fox” and “Brown foxes are quick“, and how the former can be transformed into the latter.

A longest common subsequence approach (on characters, assuming this is sane) would determine that the LCS is the seven characters “e quick“, and so it would be written as “ThBrown foxes are quick brown fox” which is indeed the shortest diff you can get using the LCS approach.

My algorithm instead determines that contiguous substrings “rown fox” and “e quick” are present in both sentences, so the diff would be written as “Brown foxes are quick“. The corresponding JSON is quite short, too:

["Bes ar",1,[10,8],5,[-12,7]]

The first element is the data segment (the five new characters), single integers are blit-from-data-segment (length) instructions, and integer pairs are blit-from-old (offset,length) instructions.

The theoretical algorithm complexity is O(m+n), the actual implementation on GitHub is worst-case O(mn) because I used lists instead of hash tables at one point, but should actually be O(m+n) on any input that happens to be written in a human language because working on a human language almost guarantees that those lists will contain zero or one elements in most situations.

The algorithm itself relies on statistical properties of character pairs to quickly match up sequences from both strings, such as by finding out that the ox pair only appears once in each sentence, or that the Br pair only appears in the new sentence. On medium-sized texts (up to 50,000 characters), there are several character pairs that are unique or missing, which helps the algorithm guess correctly where each piece came from. Once the source of the pieces is identified, generating the diff is extremely simple.

Agile Code in OCaml

First, a quick bit of background: I’m working on RunOrg [fr], a start-up that provides communities with their own online private social networks à la facebook. The technology stack is Linux-Apache-CouchDB-OCaml, and this has some implications that I will discuss below.

Facebook has it easy in terms of user management: an user starts existing on their platform the instant they sign up, at which point they fill in their first name and last name, and these are displayed to anyone who is allowed to see any hint of the user’s existence. So, making the first name and last name mandatory are quite acceptable.

At RunOrg, we cannot do this for several reasons:

  • User profiles may be created by communities as part of the membership management toolbox: we have to rely on user A to provide data about user B, and user A usually relies on an email-only source (such as newsletter or mailing-list registrations) where no first name or last name is available.
  • A given user may be part of several independent communities, and may choose to manage their identity separately for each one: appear as John Doe in an innocuous community they trust and John Censored in a more critical community.
  • We also allow users to keep control over whether a community is allowed to publish their name on the internet (as part of the online directory, or as comments on public articles).

Our needs for advanced privacy controls involve a more complex management of both what data is available and how we display it. The good news is, it’s certainly possible to handle all of this elegantly in terms of implementation. The bad news — we didn’t plan everything ahead.

It was only a few days ago that our customer requirements sessions brought up the issue of email-only sources: community managers were frustrated by the fact that our mass import functionality required first names and last names. The problem is, in almost every single programming language out there, making a required field become optional is a very dangerous endeavor because the development team must audit the entire code base to identify which parts of the code assume that the field is required, and describe what should happen when the field is null.

In your average PHP project, try making the user name optional and I can assure you that sentences like «You have been invited by  to this event» will appear. Someone failed to audit the who-invited-you-to-events code. At least with Java or C# you will get a Null Reference Exception of some kind that will show up in the logs and give you the opportunity to hunt down the mistake.

The good news is that our implementation language OCaml, does not allow null values. Instead, optional values are handled using a different value type, known as 'a option, which changes everything. An optional value simply cannot be accessed in the same way as a non-optional value. Trying to do so anyway will cause a type error that is picked up by the compiler, so a programmer can rely on these errors to quickly identify all locations the code that assume the value to be present.

I’ll say it again: in OCaml, a field being optional or mandatory is an assumption that is build into the type of that field, so changing the assumption involves changing the type and breaks all code that does not match the new assumption. Applying breaking changes to an OCaml code base is usually as simple as following a trail of compiler errors.

So, that’s what we did. We already knew that the behavior we wanted was to construct the “display name” of users like this:

  • If either the firstname or the lastname are present, use them (if both are present, use firstname-whitespace-lastname).
  • If none were present, then the private display name (visible only to the user themselves on their profile page and in the e-mails they receive) should be their e-mail address and the public display name (visible to everyone else) should be the username part of their e-mail address (so john.doe@gmail.com is shown as john.doe@…)

First, we defined two functions that compute the private and public display names based on the first name, last name and e-mail of the user. Then, the compiler error trail led us to all locations where a change was required, where we quickly identified whether the public or private name was to be displayed and replaced the existing code with our new display name functions. In total, a full audit of 40kLOC was done in less than an hour and I have proof that any code that uses the user name now handles the case where the user name is not provided.

The Rules

When working on any OCaml project, and especially on RunOrg, I follow these few rules:

  • Any assumption must cause a compiler error when broken. Either the code determines on the spot that the assumption is true, or I use the type system to prove that another part of the code already did. This rule took a massive toll on my early productivity, and I attributed it to an inherent cost of making compiler-enforced assumptions, but the real reason was that I was still pretty new at it — the elementary assumption enforcement from my smaller projects was too crude for the needs of the richness of RunOrg functionality, and it took me six months to refactor my early approach into an elegant and streamlined strategy of encoding assumptions into types.
  • Don’t work around the compiler or cheat with semantics. The initial reaction to a system that complains about every little change you make is to try and work around it by using more generic types or storing information where it does not belong. For instance, an easy solution to the optional name conundrum would have been to store john.doe@… as the name, but doing so would have been semantically incorrect (that’s a placeholder, not a name) and would have polluted the database “name” field with things that are not names and that will be treated differently from names at some point in the future.
  • Don’t accept mediocre code or patterns. Sometimes, design choices in the interface of module A will lead to ugly code in modules B, C and D because an unforeseen usage pattern happens to apply 95% of the time and the interface of module A was not designed with that usage pattern in mind. No amount of cleanup or refactoring in modules B, C and D will solve the problem, the only solution is to go back to the design of module A and change the interface even if it means that two hundred client modules will break. Keeping my code clean, elegant and short is worth wading through two hundred modules.
  • Perform lazy payments on your technical debt. I can propagate new design changes through your entire code base in one coding session, but this doesn’t mean I should. Instead, I keep a mental todo-list of all the changes that need to be applied, and apply all of them at once, locally, whenever I have to rework a given piece of code for any reason. While it may seem that such a todo-list is hard to keep and I will inevitably forget parts of it, remember that those design changes came around in order to solve the problem of ugly or mediocre code — by noticing that the code is ugly, I am reminded of the strategies that I set up in order to clean it up.
  • But be eager with small payments. If it’s a matter of moving a few functions around or refactoring a small piece of code, I do it as soon as I am done writing or rewriting it. Cleaning up little odd bits in a mostly clean code base is extremely rewarding.
  • Discover code by trying changes out. If the assumptions are correctly laid out, then the easiest way to determine the implications of a change — whether it will work and how long it will take — is simply to try it out. Following the compiler error trail will quickly reveal how many things are impacted by the change, as well as any unforeseen massive consequences. If it turns out that the change is too impacting, I just roll back my edits.
  • Keep interface patterns to a minimum. The basic idea behind having few different interfaces implemented by many parts of the system is usually expected to be «code is easier to reuse» but I disagree. Yes, that is a frequent benefit, but certainly not the most essential. Having few different interfaces means that most of my code can be described using a small vocabulary of interface patterns, and that looking at some code immediately reveals the pattern being used there. It also means that any design changes can be expressed in term of pattern changes, and can be applied almost blindly to all locations where that pattern was used. Last but not least, by using a simple shared vocabulary for large sections of the applications, I make it easier to recognize patterns in the more chaotic sections based on how they interact with the cleaner code. It’s easier to determine that two sentences have the same meaning if they share some words.
  • Love your code. In the RunOrg code base, priority 3 is making sure the code is well-designed, clean and free of technical debt, priority 2 is adding new features, priority 1 is making sure there are no bugs, and the drop-everything-you-do-and-work-on-this priority zero is that I should never hate working on the software. Motivation is paramount to keeping the code clean, feature-rich and bug-free, and even to working on the start-up in the first place, so anything that might make me question my dedication to the project or cause me pain while working on it must and will be corrected as soon as possible, regardless of other priorities.

I’m pretty certain that all of the rules are important, but I do believe the last one is an absolute prerequisite.

Article image © Ergonomik — Flickr

Does this repository make me look fat?

The main RunOrg SVN repository contains:

  • 38490 lines of OCaml implementation code spread over 273 *.ml files.
  • 5134 lines of OCaml signature code spread over 156*.mli files.
  • 4776 lines of HTML template code spread over 271 *.htm files.
  • 1971 lines of CSS in 6 files.
  • 1898 lines of JSON configuration in 26 files.
  • 1172 lines of JavaScript in 5 files.

That’s a total 53441 lines in 837 files.

Here’s a plot of how the 38490 lines of OCaml code came into being over the last eight months.

That is all.

Dealing With Huge Projects

Right now, I’m the only developer working on RunOrg, which happens to be a 45k-line project written in OCaml. According to a common terseness observation, this is equivalent to managing a 135k-line project in Java. Alone.

OCaml shares a problem with many dynamic languages : it’s very expressive, but there is no general consensus on what architectural best practices should be, so there are literally dozens of different ways a given feature might be implemented that cannot be discriminated on anything but taste. This leads to a variety of unique design choices throughout the application which, despite working well with each other, cause programmers to «discover» new architectures every time.

In the end, I believe that the philosophy of using the best tool for every job can easily be taken to painful extremes if you are not careful. You encounter a new problem, pick an unusual but well-adapted solution, and it makes perfect sense to you, so you move on. Months later, you come back and the solution does not make sense anymore because you have forgotten a small detail about how it works or why it was done this way, and you have to hunt that small detail down by reading the code. I’ve pretty much solved this anti-pattern, so I’ll come back to it later.

The main point I’m making here is that for every project, there is an ideal mudball of code that happens to perfectly implement everything without bugs, all in a single gigantic file, and you cannot write this mudball. For a human, there’s no way to manage anything mudballish past a few hundred lines because you cannot wrap your mortal mind around the possibility that every line might interact with any other line in the project… so, as an architect, you slice up the mudball into more acceptable bits that you politely call «modules» in order to reduce the number of things any given line might interact with. We reduce the amount of data we need to cope by adding big «you don’t need to think about this» signs everywhere (and making sure the signs don’t lie, obviously).

And on a small scale, this works, because you only have a dozen modules and it’s enough to fit in your short-term memory. RunOrg currently has 260 high-level modules, and several times that amount in sub-modules. No UML design, no matter how comprehensive, can make all those modules fit in my mind at once. I must find some «you don’t need to think about this» signs before I can move on.

There are mostly two ways of slicing a given project into modules: horizontal and vertical.

Vertical slices happen when there are dependencies, and modules look like layers stacked on top of each other with each layer being allowed to access the layers below. The RunOrg project architecture actually starts with a clean set of vertical slices : the controller layer deals with HTTP actions by using the view layer and the model layer below it, but the view layer cannot access the controller layer, and the model layer cannot access either.

Horizontal slices happen when there are absolutely no dependencies, and modules look like books cleanly arranged next to each other on a shelf. This usually happens when those modules represent the same concept for different purposes. In the RunOrg project, the controller layer is divided into many action modules, with each of these modules handling the HTTP requests for a limited part of the application. For instance, there’s a Login module in charge of handling HTTP requests related to logging in, and a File module in charge of handling HTTP requests related to uploading files. The concept is the same (handle HTTP requests) but the purpose is different (logging in, uploading files). And there is no need for either module to know about the existence of the other.

Knowing whether slices are vertical or horizontal immediately tells the programmer about what dependencies should be considered for that slice. And it is all recursive : the Login module of the controller layer is further divided into a Login_common bottom layer for common definitions, the root Login top layer for binding everything together, and an intermediary layer of horizontal Login_form, Login_signup, Login_lost slices dedicated to the various independent aspects of logging in. The naming convention helps identify the pattern used.

In practice, the slices do not necessarily map to actual namespaces or modules because, especially at very low levels, the granularity involved to segregate the two would be too verbose. For instance, while it may appear that the controller layer is made up of modules that are all horizontal slices, this is not the case : while the actions (functions that respond to HTTP requests) are indeed independent horizontal slices, the layer also contains helpers (functions that provide common functionality to actions) that follow a vertical layering, and a given module will usually contain both actions and helpers indiscriminately.

What is relevant here is that the patterns used will let you determine easily what kind of slice you are dealing with. And a pattern is a named convention (action, action helper, view template, table) that is respected by relevant pieces of code, in terms of :

  • Location : where is it within the module and file hierarchy, and in relation to other constructs within the same module ?
  • Structure : how does the code look like ? What parts of the pattern are expected to be changed and what parts should always be the same ?
  • Type : what is the signature of the module, class or function defined by the code ?
  • Name : is there a common suffix or a way to give a name to entities following the pattern ?

These are guidelines, a pattern should usually have at least one of these, and the more the better, but you don’t have to implement all four if it is counter-productive to do so. Also, a pattern should define a dependency rule : it is generally understood that two pieces of code that follow the same pattern have a dependency dictated by that pattern, and that dependency is usually a horizontal slice.

The important thing about patterns is that they are not an external influence on your project. If you limit yourself only to those patterns that are dictated by the Gang of Four book, or by the framework you are using, then you will miss out on the many patterns that will emerge naturally within your application. Quite to the contrary, it is essential to identify as often as possible the patterns that appear in your code, clean them up by providing both a name and conventions of location/structure/type/name, and apply them wherever necessary. This will make your code more easily recognized by the programmer, because there are only a handful of fairly generic concepts to learn (the patterns) and everything else can be understood by finding out what patterns are used. Even better, familiarity with patterns places a  «you don’t need to think about this» sign on the parts of the pattern structure that stay the same, because they never change.

And now, I have cleverly returned to my previous point : the inevitable conflict between the use the best tool rule and the use the same pattern rule.

Using the best tool creates the risk of writing code that is very difficult to understand later on, because there are too many special cases. Using the same pattern everywhere causes problems when the pattern is ill suited to the problem being solved, such that it creates code that is too long, too repetitive, or too unsafe.

In my day-to-day routine, I follow the use the same pattern rule until it becomes too painful. Then, I just change the pattern to make it less painful to use, and propagate the changes to all the places where it is used, which in turn was made possible by the fact that I did use the same pattern everywhere.

Obviously, I don’t have a pattern for everything. So, whenever I encounter a problem for the first time, I go with the best tool instead. Once that kind of problem is solved several times, a pattern will emerge and some refactoring will happen.

Article image © holycalamity — Flickr



1170 feed subscribers
(readers who polled a feed this week)