Monthly Archive for November, 2011

Basic Patterns for Everyday Programming

Lakshen Perera provides a list of basic patterns for everyday programming, illustrated in Javascript and Ruby. I thought it would be interesting to provide an OCaml illustration as well, and perhaps a handful of additional patterns as well.

Verify object’s availability before calling its methods or properties

In many languages, there is a possibility for objects to be missing — whether this is represented as NULL, null, nil or None. Regardless of the language, it is important to keep in mind at all times whether a given value is optional or not. If it is not optional, then you may assert so using a type declaration if your language supports it :

method name  : string (* not optional *)
method email : string option (* optional *)

In other languages, a runtime assert will do, for instance in PHP :

assert (isset($this->name))

OCaml will not require you to explicitly declare all values as optional or mandatory. Instead, it will deduce that information from the way you use each value. For instance, since json_of_string expects a non-optional string argument, you can simply write :

let parsed_content = json_of_string json (* json is not optional *)

If json is an optional string, this will create a compilation error unless you explicitly define what should happen when the string is missing. The most common approach is to decide that the result should be missing too :

let parsed_content = match json with
  | None   -> None
  | Some s -> Some (json_of_string s) (* by definition, s is not optional *)

If using the Batteries library (which, by the way, you should), you can express this more easily :

let parsed_content = BatOption.map json_of_string json

Set a default value with assignments

This advice is almost identical to the previous one : if you need to construct a non-optional value, but only have access to an optional value, you will have to provide a default value. Using standard OCaml :

let role = match person.role with
  | None      -> `Guest
  | Some role -> role

Using Batteries, there is an almost equivalent version :

let role = BatOption.default `Guest person.role

This version is almost equivalent, because it will evaluate the default value even if it is not required. Let’s assume that the default value is itself provided by a complex computation, such as a database access :

let role = match person.role with
  | None      -> readfrom database (* only executed if person.role is missing *)
  | Some role -> role 

let role = BatOption.default
  (readfrom database) (* Always executed, even if unnecessary *)
  role

The Batteries library provides an alternate function for that :

let role = BatOption.map_default
  readfrom database (* only executed if person.role is missing *)
  role

Checking whether a variable equals to any of the given values

If the values have a legitimate reason to be strings, integers or other types with unlimited numbers of values, then « is in list » predicates are the preferred choice :

if List.mem current_day ["Monday";"Wednesday";"Friday"] then
  (* do something *) 

Days are a quite bad example, because these are better represented as a variant (which then type-checks whether you have written Mornday instead of Monday). If you have already defined the type of weekdays, then :

if List.mem (current_day : weekday) [`Monday;`Wednesday;`Friday] then
  (* do something *)

If this is a one-shot check, or if you would rather not define a weekday type yet, you should instead go for an exhaustive pattern-matching :

 match current_day with
  | `Monday | `Wednesday | `Friday ->
    (* do something *)
  | `Tuesday | `Thursday | `Saturday | `Sunday -> ()

Extract complex or repeated logic into functions

This is a fairly fundamental concept — but it consists in two distinct parts. There is a separation and naming part (you pull out a piece of code and give it a name, which helps understand what it does and how it relates to the rest of the program) and there is an extraction and reuse part (the piece of code is pulled out into a more globally accessible location and parametrized, so that it may be used in other places).

With the above example, simple separation-and-naming would be :

let is_discount_day = match current_day with
 | `Monday | `Wednesday | `Friday -> current_date > 20
 | `Tuesday | `Thursday | `Saturday | `Sunday -> false
in

if is_discount_day then 
  (* do something *)

The variable is defined in the same scope it is use in, and it assumes that current_day and current_date values have been defined previously in that scope. Extraction-and-reuse would go further :

type weekday =
  [ `Monday | `Tuesday | `Wednesday
  | `Thursday | `Friday | `Saturday | `Sunday ]

let is_discount_day (day:weekday) date =
  List.mem day [`Monday;`Wednesday;`Friday] && date > 20

...

  if is_discount_day current_day current_date then 
    (* do something *)

Now, is-discount-day is a global function available from everywhere in the code, and it uses the provided parameters to determine whether this is indeed a discount day.

Memoize the results of repeated function calls

OCaml has several ways to perform memoization. One of them is lazy evaluation :

val discount_day = lazy (is_discount_day current_day current_date)
method discount_day = Lazy.force discount_day

The lazy expression will only be evaluated the first time the Lazy.force function is called on it.

Note that if the current day or current date can change, then the memoization actually breaks things !

Memoization is also helpful when dealing with a function that requires arguments, in which case a different result will be provided for each argument set. A common solution is to use a hash table to store these :

let fibonacci =
  let memo = Hashtbl.create 100 in
  let rec fib n =
    try Hashtbl.find memo n with Not_found ->
      let result = fib (n-1) + fib (n-2) in
      Hashtbl.add memo n result ; result
  in fib

This works fine for short-lived functions — don’t do this for global functions that might stick around for a long time, because the memoization hash table will grow and its contents will never be garbage-collected. If you really have to, use a weak hash table, such as Batteries’ BatInnerWeaktbl, so that the garbage collector may reclaim the memoized values when it runs out of memory.

Also don’ t overdo memoization — it only works when arguments are reliably passed more than once and the time to compute the value is significantly larger than the time to retrieve and store it and it is worth the memory usage and the function has no side-effects.

Use the seven list manipulation primitives

Almost any processing on collections of items can be expressed in terms of seven fundamental patterns. Recognizing those patterns can help improve the clarity of both the code and the underlying algorithm.

1. Map transforms a list into another, item by item, in linear time. Use a map operation when all you need is a one-to-one transformation. The line below extracts three recipes from the database using their identifier.

let recipes = List.map from_database [ "omelet" ; "cheeseburger" ; "risotto" ]

2. Reduce transforms a list of values into a single value by repeatedly applying a function that combines together two values into one. The typical example is a fold, which uses a function to combine each list element, in turn, with an accumulator. It can be used to extract the sum of values in a list, for example :

let total = List.fold_left (+) 0 [ 5 ; 6 ; 3 ; 8 ; 9 ; 0 ; 7 ; 6 ]

This transform allows a preliminary map step which transform the values inside the list into values that can be combined. For instance, to find the age of the oldest person in a list of people :

let oldest = List.fold_left (fun acc person -> max age person.age) 0 people

3. Extract works like a map, but the transformation function returns zero, one or more results for each call. All the results are included in the final list. The most elementary implementation is literally to have a map (that transforms a list into a list of lists) followed by a concatenation (that transforms a list of lists into a list). For instance, to get all the ingredients involved in a list of recipes :

let ingredients =
  List.concat (List.map (fun recipe -> recipe.ingredients) recipes)

4. Filter is a subset of Extract where the transform may not return more than one result — but it may still return none, so its result is simply an optional type. For instance, to extract the list of all recipes that have a wine recommandation along with their recommended wine :

let wines = BatList.filter_map
  (fun recipe -> BatOption.map (fun wine -> recipe,wine) recipe.wine) recipes

The OCaml language also provides a standard List.filter function which keeps values for which a property is true. For instance, to get the list of recipes that have a wine recommendation :

let have_wines = List.filter (fun recipe -> recipe.wine <> None) recipes

This approach is weaker — the wines in the resulting list are still treated as optional by the type system, so you will need bogus pattern matching for a case that never happens (no wine) to extract the actual wine values. A filter_map lets you encode the filter property in the type of the result, which makes using the filtered list easier.

5. Sort unsurprisingly sorts the list. The canonical sort — using the canonical order relationship — is a theoretical curiosity, and in practice most sorts use a projection function p such that A < B iff p(A) < p(B). This is best illustrated in SQL, in the form of the ORDER BY <projection> statement. Two useful helper functions :

let project compare p a b = compare (p a) (p b)
let descending compare a b = compare b a

For instance, to sort the list of recipes based on how long each of them takes :

let by_duration =
  List.sort (project compare (fun recipe -> recipe.duration)) recipes

6. Group works like Sort, but further regroups « equal » items together by returning a list of lists. It works using a comparison function that is usually based on a projection function. For instance, to get three lists containing one-star, two-star and three-star recipes :

let by_stars =
  BatList.group (project compare (fun recipe -> recipe.stars)) recipes

There is a special case when the projection function returns booleans, which is known as a partition. For instance, to extract recipes that are desserts and recipes that are not :

let desserts, non_desserts =
  BatList.partition (fun recipe -> recipe.is_dessert) recipes

7. Search extracts one element from the list (if possible) based on a certain condition or property. Elementary searches are « first element » and « last element. » More complex searches : finding a value by key in a list of key-value pairs using List.assoc or using a predicate using List.find. The heavy-duty search tool is BatList.find_map, used below to find a recipe that is recommended by at least one person, and the recommending person :

let recipe, recommender = BatList.find_map
  (fun recipe -> BatOption.map (fun person -> recipe, person) recipe.recommended)
  recipes

These seven patterns can be used in conjunction to perform almost any algorithm on collections, sequences or lists. For a more complex example, assume we need the ten ingredients that appear the most often in desserts. We would filter recipes by desserts, extract their ingredients, group them by name, sort the sub-lists by list length and take the name of the first element of the first ten sub-lists :

open BatPervasives (* for operator |> *)
open BatList

let ten_best_ingredients recipes = recipes
  |> filter (fun r -> r.dessert)
  |> map (fun r -> r.ingredients) |> concat
  |> group (project compare (fun i -> i.name))
  |> sort (descending (project compare length))
  |> take 10
  |> map hd
  |> map (fun i -> i.name)

Article © brewbooks — Flickr

Comment Branches

Your development job is making changes in your software. Writing, testing and debugging those changes takes some time.

If your job is anywhere as hectic as mine, you will have to fix and deploy urgent patches, even when your application code is in a half-written, half-debugged state because of the feature of the month.

This is what branches are for. You keep two versions of the code, one of which is called the trunk and is always ready for deployment, and another which holds the changes that you are working on.

When your feature is done, you merge the two versions together. You want to keep the merge operation painless. To do so, you have several kinds of branches available.

The repository branch is built into your SourceSafe/subversion/git/whatever. It creates two independent copies, and you need to migrate changes from the trunk to every branch out there as soon as possible, or the merge will make you wish for a sweet and merciful death.

By the way, changeset-oriented tools (like git or mercurial) make this easer, while revision-oriented tools (like subversion) make it harder.

The feature branch is done using programming logic. The code you deploy to production supports the new feature, but it is turned off for everyone except yourself. This technique is great for adding features, but inefficient when changing existing ones.

A side effect of the feature branch is that you can stress-test new code by rolling it out to increasing numbers of users progressively.

The comment branch is an odd gambit. It involves ripping out an entire module and replacing it with another that has a different interface. This will involve large amounts of re-wiring all over the code base, and these will take hours or days before they can be compiled, let alone tested.

Use a comment structure such as this one:

/*[*/ old code /*|* new code *]*/

It is trivial to build a text-replacement macro that turns the above into the code below and back:

/*[* old code *|*/ new code /*]*/

Use the macro to switch between development mode (when you write new code and desperately try to get it to compile) and fix mode (when you edit the old code and deploy it). For consistency, always commit the old version to the repository.

Why use comment branches instead of repository branches ? Maybe your source control tool sucks at branches. I use Subversion. Yes, I know. Legacy, pain and unlikely hopes of a brighter future.

When a trunk change occurs in a part that has been erased or reworked in the branch, that change will cause a conflict that will require manual intervention. Even with git or mercurial. For a large number of small changes sprinkled over a large codebase that is routinely involving many small updates, repository branches turn into a merge minefield.

Does your branch involve a small number of well-defined files ?

Then you should use repository branches, because conflicts will only happen in those files, and will usually be easy to fix.

Does your branch involve many changes in many files everywhere in the project ?

Then use comment branches.

Last and possibly least, there is the TODO-branch. This involves non-breaking, purely cosmetic changes. 25% of my project uses this syntax for historical reasons:

Table.get id |-> function
   | None       -> return 0
   | Some value -> return value.count

Then, a convention change happened, and this is used instead:

let! value_opt = breathe (Table.get id) in
match value_opt with  
   | None       -> return 0
   | Some value -> return value.count

Then, another convention change happened, and this should be used instead

let! value = breathe_req_or (return 0) (Table.get id) in
return value.count

And then, there’s the current version:

let! value = breathe_req_or (return 0) $ Table.get id in
return value.count

Whenever I change coding conventions, I do not spend the time to reformat the tens of thousands of lines of code in my application. That would have been wasteful. Instead, every time a piece of code is refactored, it is refactored to the most recent style.

The same happens when using an old and a new version of a given API. My code uses two libraries for handling HTML forms, uses both Javascript and Coffeescript, and a variety of similar two-hammers-one-nail situations.

These are, for all practical purposes, branches. They are work that is being performed for long durations. The benefit of TODO-branches is that code in the middle of such changes is still compatible with the trunk. It all happens in the head of the developer, who remembers what changes should be done the next time a piece of code is rewritten.

Article Image © Dominic Alves — Flickr

The Three Components of Negotiation

We negotiate several times a day, even if we do not recognize those occurences as such — « accept or walk away » is a fairly instinctive, if somewhat inefficient, negotiation strategy, and it goes hand in hand with the modern « take it or leave it » strategy employed by most mass consumer shops.

What is negotiation anyway, and why does it happen?

It happens because when humans cooperate, they get a cake that is bigger than the sum of its parts, and they need to decide how to split it. For instance, if I grow potatoes and you breed cattle, we could decide to put together my delicious Désirée tubers and your tasty Charolais steaks, which is a big improvement over otherwise potato- or beef-only diets. And then, we would argue over how much of the final mix each of us would get:

ME: We both contributed equally, so let’s get 50% each.
YOU: Breeding cattle is harder than watching potatoes grow, so I want 65%.
ME: Suit yourself, I know a guy who only asks for 55%.
YOU: Fine, I’ll go as low as 60% because I like your blog.
ME: I just told you I know a guy who…
YOU: …and that would save you the effort of finding that guy before lunch.
ME: Right. 60% it is, then.

You were a good negotiator, so you managed to squeeze 10% out of this. What happened? Let’s break this down.

Irrational Arguments

« We both contributed equally » might be true. Maybe I spent as much time growing my potatoes than you spent working in your ranch. But that is not my argument. What I have written between the lines is that it is fair for equal efforts to receive equal rewards.

And fairness is not a rational argument. It is a subtle way to circumvent rational thinking and aim for our unconscious ethical programming and, by definition, make decisions that are not in our best interest. In the eyes of a good negotiator, this is a loophole to be exploited.

There are many other such loopholes in the human mind. My favorites are:

Reciprocity is an atavistic impulse to give back whenever we receive something, even if that something is mostly unrelated to our objectives. In the example above, going from 65% to 60% is cleverly framed as a gift instead of a simple acceptance of my terms, which makes me unwilling to further haggle over whether it should be 57.5%.

Imprinting is our tendency, when we have no idea how much something is worth, to go with the first guess, clue or hint that we find about it. By suggesting a 50-50 trade, I imprinted you on the idea that potatoes are about as valuable than meat, and this serves as a basis for your « reasonable » 35-65 counter-proposal instead an extreme trade like, say, 3-97.

Guess what the market rate for potatoes and Charolais steak is? 3-97. I would like to correct my previous sentence:

You were a good bad negotiator, so you I managed to squeeze 10% 37% out of this.

And then, there’s a slew of emotional manipulation that does not appear at all in my example above. You can feign disappointment to try and get more out of a deal — “I really hoped you, of all people, could get this done for me.” You can pretend to be offended, shocked or angry in order to cause a large shift in the terms being negotiated — when you haggle in a bazaar, quote a price that is too low and you will be told that it is insulting, and kicked out of the store.

There are many other examples. Go read this blog article and read Predictably Irrational by Dan Ariely.

Rational arguments

You might expect this section to be as rich as the previous one, but it is not. There are only two different rational arguments you can say in a negotiation: “I will  not accept this outcome because [credible reason]” and “You will accept this outcome because [credible reason]“. Let’s examine our conversation with this convention:

ME: Suggests 50-50
YOU: No, because [breeding cattle is harder than watching potatoes grow], suggests 35-65
ME: No, because [I know a guy who only asks for 55%]
YOU: Suggests 40-60
ME: No, because [I know a guy who only asks for 55%]
YOU: Yes, because [that would save you the effort of finding that guy before lunch]
ME: Accepts 40-60

This is the reason why the most fundamental principle in negotiations is be willing to walk away. If you cannot or will not walk away, and the other guy knows this, then you can no longer say « No, because » and this cripples your ability to negotiate properly.

Typical reasons why I will not accept an outcome:

  • The costs are greater than what I get out of it. Maybe I’ll ask for more money, or non-financial benefits (exclusive rights, free advertising…)
  • It’s too risky for me. Maybe I’ll ask for insurance, or add terms to our contract that provide me with acceptable protection.
  • I can get a better offer elsewhere. You can either match that offer, give me something I cannot get elsewhere, or give up.

Typical reasons why you should accept an outcome:

  • There are additional benefits you did not take into consideration, such as not spending the time to find a better offer elsewhere, or my goodwill in terms of future trades, or my reputation as your client/provider.
  • If you do not accept now but change your mind later, I will not be able the same offer to you again in the future.

These reasons do not have to be true, they only have to believable. I don’t know anyone who would accept a 45-55 split on potatoes and Charolais steak, but you don’t know that, so I can still bluff my way through our negotiation, bringing you down from 65% (which is still a 32% win for me) to 60%. If you knew that no sane person would accept a 45-55 split, you could certainly call my bluff and even bring me back up to 10-90 or so.

How you present those arguments is a matter of style. Here are four different ways of presenting the same argument (you will not get more than $30k for an APL job) :

  1. No one else in town hires APL developers like yourself, so you either accept this $30k job or get out.
  2. I am certainly interested in your APL skills, but I would be losing money if I paid you more than $30k.
  3. We are looking to fill our APL development position, but I am not at liberty to offer you more than $30k.
  4. We have interviewed several candidates with your skills, and most of them are below your wage requirements.

Some aggressive types would prefer approach 1, while manipulator types would go with 2 (which is a lie), passive types would hide behind a third party authority using 3, and sneaky types would try to get an even lower salary quote by withholding the $30k figure in approach 4.

Alternative Benefits

This is a subtle but essential part of many successful negotiations. It’s not only about the money. In the above example, I traded 5% for the ability to get my food in time for lunch — a concept that had not been mentioned at all before. You decided that instead of matching my requirements as I expressed them, you would offer me something that was unexpected and which, to you, cost less than 5% but still, to me, was worth at least 5%.

Going down alternate routes is an excellent way to get a stuck negotiation moving.

Article Image © William Neuheisel — Flickr



1342 feed subscribers
(readers who polled a feed this week)