<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Nicollet.Net &#187; Design Patterns</title>
	<atom:link href="http://www.nicollet.net/toroidal/patterns/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.nicollet.net</link>
	<description>Everyone Loves Me</description>
	<lastBuildDate>Mon, 23 Jan 2012 16:55:59 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=</generator>
		<item>
		<title>OCaml Submodule Pattern</title>
		<link>http://www.nicollet.net/2012/01/ocaml-submodule-pattern/</link>
		<comments>http://www.nicollet.net/2012/01/ocaml-submodule-pattern/#comments</comments>
		<pubDate>Mon, 23 Jan 2012 16:55:59 +0000</pubDate>
		<dc:creator>Victor Nicollet</dc:creator>
				<category><![CDATA[Functional]]></category>
		<category><![CDATA[Architecture]]></category>
		<category><![CDATA[Design Patterns]]></category>
		<category><![CDATA[Objective Caml]]></category>

		<guid isPermaLink="false">http://www.nicollet.net/?p=2660</guid>
		<description><![CDATA[My code is quite large for an OCaml project. The main RunOrg repository alone contains 46212 lines of OCaml code (plus an additional 5631 lines of OCaml mli files), and then there&#8217;s the web framework code and the independent plugins code. It&#8217;s Better™ to have many short files than a few long ones. [...]]]></description>
			<content:encoded><![CDATA[<p>My code is quite large for an OCaml project. The main RunOrg repository alone contains 46212 lines of OCaml code (plus an additional 5631 lines of OCaml mli files) — and then, there&#8217;s the web framework code and the independent plugins code.</p>
<p>It&#8217;s Better™ to have many short files than a few long ones. One reason is incremental compilation with <em>ocamlbuild</em> : the smaller your files are, the smaller the percentage of code that must be recompiled when you make a small change. Another reason is that files provide a natural delineation of code that makes it slightly easier to reason about.</p>
<p>The very process of splitting a large file into smaller files is also an excellent way to clean up the code. Every split is an opportunity to move some code to a more generic location — why have a <code>CMember_importParser</code> module when all of its functionality could fit into an <code>OzCsv</code> plugin module ? Even when no such generic solution exists, cutting through the jungle that a 2000-line module contains helps clean up dependencies, identify shared functionality and imagine better ways to design code.</p>
<p>Still, when cutting up code this way, the problem of encapsulation remains. If code that relates to pictures (an upload module, a transform module, a download module, an access rights module) is split across several files, it is desirable to let each file access functions and values from the other files that would not otherwise be shown to modules unrelated to picture processing. For instance, a <code>get_download_link</code> function should be available throughout all picture-related modules, but the rest of the application should use the <code>get_download_link_for_user</code> function that checks whether the user is allowed to download the file.</p>
<p>In order to achieve several nested levels of encapsulation required to work with modules this way, I have come up with a convention :</p>
<ul>
<li>A module name (and thus, a file name) is composed of segments written in camelCase and separated by underscores. For instance, <code>CEntity_view_grid</code> is a module name containing segments <code>CEntity</code>, <code>view</code> and <code>grid</code>.</li>
<li>Modules with only one segment are public. Any other module may include, open or otherwise reference them with no limitations beyond what the module signature says. So, <code>CEntity</code> may access <code>MGroup</code> freely.</li>
<li>Modules with N &gt; 1 segments are private. They may only be accessed by modules which share the first N-1 segments. So, <code>CEntity_view</code> is available to modules <code>CEntity</code> and <code>CEntity_edit</code> but not <code>CPicture</code>.</li>
<li>A module with N segments may export any module with N+1 segments it can access, possibly under a more restrictive signature. For instance, <code>CEntity_view</code> is available to all other modules as <code>CEntity.View</code>.</li>
</ul>
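<p>As a rough illustration of the last rule, the sketch below simulates the two files as nested modules inside a single file. The <code>heading</code> and <code>render</code> functions are invented for the example; only the module names follow the convention above.</p>
<pre style="padding-left: 30px;"><code>(* Stand-in for cEntity_view.ml, a private module with two segments. *)
module CEntity_view = struct
  let heading name = "== " ^ name ^ " =="   (* internal helper *)
  let render name = heading name ^ " (entity details)"
end

(* Stand-in for cEntity.ml, the public module with one segment. *)
module CEntity = struct
  (* Re-export the private module under a more restrictive signature:
     other modules see render, but not heading. *)
  module View : sig
    val render : string -&gt; string
  end = CEntity_view
end

let () = print_endline (CEntity.View.render "Chess Club")
</code></pre>
<p>Any reference to <code>CEntity.View.heading</code> would be rejected by the compiler, because the signature hides it.</p>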
<p>To make these rules easier to respect, private module dependencies are made explicit by adding a list of module aliases at the top of each file. My <code>cEntity_view.ml</code> file starts with :</p>
<pre style="padding-left: 30px;"><code>module Sidebar     = CEntity_sidebar
module Unavailable = CEntity_unavailable
module Edit        = CEntity_edit
module Info        = CEntity_view_info
module Directory   = CEntity_view_directory
module Grid        = CEntity_view_grid
module Wall        = CEntity_view_wall
</code></pre>
<p>It is forbidden to use a private module without going through such an alias, and it is forbidden to define such an alias anywhere except at the top of the file. This makes it extremely easy to determine whether private access rules are respected.</p>
<p>The rule of thumb for splitting files (in my particular coding style) is :</p>
<ul>
<li>Code for separate layers (model, view, controller&#8230;) goes into separate public modules.</li>
<li>For complex code (such as complex rules in model or controller code), consider splitting files larger than 200 lines.</li>
<li>For simple code (such as HTML template or JSON serialization definitions), there is no splitting limit except for factoring out common behavior.</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.nicollet.net/2012/01/ocaml-submodule-pattern/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Basic Patterns for Everyday Programming</title>
		<link>http://www.nicollet.net/2011/11/basic-patterns-for-everyday-programming/</link>
		<comments>http://www.nicollet.net/2011/11/basic-patterns-for-everyday-programming/#comments</comments>
		<pubDate>Wed, 23 Nov 2011 15:48:19 +0000</pubDate>
		<dc:creator>Victor Nicollet</dc:creator>
				<category><![CDATA[Functional]]></category>
		<category><![CDATA[Algorithms]]></category>
		<category><![CDATA[Design Patterns]]></category>
		<category><![CDATA[Objective Caml]]></category>

		<guid isPermaLink="false">http://www.nicollet.net/?p=2624</guid>
		<description><![CDATA[Lakshan Perera provides a list of basic patterns for everyday programming, illustrated in JavaScript and Ruby. I thought it would be interesting to provide an OCaml illustration as well, and perhaps a handful of additional patterns. Verify object&#8217;s availability before calling its methods or properties In many languages, there is a possibility for [...]]]></description>
			<content:encoded><![CDATA[<p><img class="aligncenter size-full wp-image-2635" title="pattern" src="http://www.nicollet.net/wp-content/uploads/2011/11/pattern.png" alt="" width="675" height="100" /></p>
<p><a href="http://laktek.com/2011/11/23/basic-patterns-for-everyday-programming/" target="_blank">Lakshan Perera</a> provides a list of basic patterns for everyday programming, illustrated in JavaScript and Ruby. I thought it would be interesting to provide an OCaml illustration as well, and perhaps a handful of additional patterns.</p>
<h4>Verify object&#8217;s availability before calling its methods or properties</h4>
<p>In many languages, there is a possibility for objects to be missing — whether this is represented as <code>NULL</code>, <code>null</code>, <code>nil</code> or <code>None</code>. Regardless of the language, it is important to keep in mind at all times whether a given value is optional or not. If it is not optional, then you may assert so using a type declaration if your language supports it :</p>
<pre style="padding-left: 30px;"><strong>method</strong> name  : string <span style="color: #008000;">(* not optional *)</span>
<strong>method</strong> email : string option <span style="color: #008000;">(* optional *)</span></pre>
<p>In other languages, a runtime assert will do, for instance in PHP :</p>
<pre style="padding-left: 30px;"><strong>assert</strong> (isset($this-&gt;name));</pre>
<p>OCaml will not require you to explicitly declare all values as optional or mandatory. Instead, it will deduce that information from the way you use each value. For instance, since <code>json_of_string</code> expects a non-optional string argument, you can simply write :</p>
<pre style="padding-left: 30px;"><strong>let</strong> parsed_content = json_of_string json <span style="color: #008000;">(* json is not optional *)</span></pre>
<p>If <code>json</code> is an optional string, this will cause a compilation error unless you explicitly define what should happen when the string is missing. The most common approach is to decide that the result should be missing too :</p>
<pre style="padding-left: 30px;"><strong>let</strong> parsed_content = <strong>match</strong> json <strong>with</strong>
  | <span style="color: #008080;">None</span>   -&gt; <span style="color: #008080;">None</span>
  | <span style="color: #008080;">Some</span> s -&gt; <span style="color: #008080;">Some</span> (json_of_string s) <span style="color: #008000;">(* by definition, s is not optional *)</span></pre>
<p>If using the Batteries library (which, by the way, you should), you can express this more easily :</p>
<pre style="padding-left: 30px;"><strong>let</strong> parsed_content = <span style="color: #008080;">BatOption</span>.map json_of_string json</pre>
<h4>Set a default value with assignments</h4>
<p>This advice is almost identical to the previous one : if you need to construct a non-optional value, but only have access to an optional value, you will have to provide a default value. Using standard OCaml :</p>
<pre style="padding-left: 30px;"><strong>let</strong> role = <strong>match</strong> person.role <strong>with</strong>
  | <span style="color: #008080;">None</span>      -&gt; <span style="color: #008080;">`Guest</span>
  | <span style="color: #008080;">Some</span> role -&gt; role</pre>
<p>Using Batteries, there is an almost equivalent version :</p>
<pre style="padding-left: 30px;"><strong>let</strong> role = <span style="color: #008080;">BatOption</span>.default <span style="color: #008080;">`Guest</span> person.role</pre>
<p>This version is almost equivalent, because it will evaluate the default value even if it is not required. Let&#8217;s assume that the default value is itself provided by a complex computation, such as a database access :</p>
<pre style="padding-left: 30px;"><strong>let</strong> role = <strong>match</strong> person.role <strong>with</strong>
  | <span style="color: #008080;">None</span>      -&gt; readfrom database <span style="color: #008000;">(* only executed if person.role is missing *)</span>
  | <span style="color: #008080;">Some</span> role -&gt; role 

<strong>let</strong> role = <span style="color: #008080;">BatOption</span>.default
  (readfrom database) <span style="color: #008000;">(* Always executed, even if unnecessary *)</span>
  person.role</pre>
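<p>Recent versions of the Batteries library provide a delayed alternative, <code>BatOption.default_delayed</code>, which takes a function that computes the default and only calls it when the value is missing :</p>
<pre style="padding-left: 30px;"><strong>let</strong> role = <span style="color: #008080;">BatOption</span>.default_delayed
  (<strong>fun</strong> () -&gt; readfrom database) <span style="color: #008000;">(* only executed if person.role is missing *)</span>
  person.role</pre>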
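<p>The difference between an eager and a delayed default can be checked with a small standalone sketch. The two helper functions below are local re-implementations written for illustration, not the Batteries ones, and <code>read_default_role</code> is a hypothetical stand-in for the database access that counts how often it runs.</p>
<pre style="padding-left: 30px;"><code>(* Counts how many times the expensive default is computed. *)
let reads = ref 0

(* Hypothetical stand-in for a database access. *)
let read_default_role () = incr reads ; `Guest

(* Eager: the default is an ordinary argument, computed by the caller. *)
let default d = function
  | None   -&gt; d
  | Some v -&gt; v

(* Delayed: the default is a function, called only on None. *)
let default_delayed f = function
  | None   -&gt; f ()
  | Some v -&gt; v

let () =
  let _ = default (read_default_role ()) (Some `Admin) in
  assert (!reads = 1) ;                  (* computed although not needed *)
  let _ = default_delayed read_default_role (Some `Admin) in
  assert (!reads = 1) ;                  (* not computed again *)
  let role = default_delayed read_default_role None in
  assert (role = `Guest &amp;&amp; !reads = 2)   (* computed only when missing *)
</code></pre>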
<h4>Checking whether a variable equals any of the given values</h4>
<p>If the values have a legitimate reason to be strings, integers or other types with unlimited numbers of values, then « is in list » predicates are the preferred choice :</p>
<pre style="padding-left: 30px;"><strong>if </strong><span style="color: #008080;">List</span>.mem current_day [<span style="color: #ff0000;">"Monday"</span>;<span style="color: #ff0000;">"Wednesday"</span>;<span style="color: #ff0000;">"Friday"</span>] <strong>then</strong>
  <span style="color: #008000;">(* do something *) </span></pre>
<p>Days are quite a bad example, because these are better represented as a variant (which then type-checks whether you have written <em>Mornday</em> instead of <em>Monday</em>). If you have already defined the type of weekdays, then :</p>
<pre style="padding-left: 30px;"><strong>if</strong> <span style="color: #008080;">List</span>.mem (current_day : weekday) [<span style="color: #008080;">`Monday</span>;<span style="color: #008080;">`Wednesday</span>;<span style="color: #008080;">`Friday</span>] <strong>then</strong>
  <span style="color: #008000;">(* do something *)</span></pre>
<p>If this is a one-shot check, or if you would rather not define a weekday type yet, you should instead go for an exhaustive pattern match :</p>
<pre style="padding-left: 30px;"> <strong>match</strong> current_day <strong>with</strong>
  | <span style="color: #008080;">`Monday</span> | <span style="color: #008080;">`Wednesday</span> | <span style="color: #008080;">`Friday</span> -&gt;
    <span style="color: #008000;">(* do something *)</span>
  | <span style="color: #008080;">`Tuesday</span> | <span style="color: #008080;">`Thursday</span> | <span style="color: #008080;">`Saturday</span> | <span style="color: #008080;">`Sunday</span> -&gt; ()</pre>
<h4>Extract complex or repeated logic into functions</h4>
<p>This is a fairly fundamental concept, but it consists of two distinct parts. There is a separation and naming part (you pull out a piece of code and give it a name, which helps understand what it does and how it relates to the rest of the program) and there is an extraction and reuse part (the piece of code is pulled out into a more globally accessible location and parametrized, so that it may be used in other places).</p>
<p>With the above example, simple separation-and-naming would be :</p>
<pre style="padding-left: 30px;"><strong>let </strong>is_discount_day = <strong>match </strong>current_day <strong>with</strong>
 | <span style="color: #008080;">`Monday</span> | <span style="color: #008080;">`Wednesday</span> | <span style="color: #008080;">`Friday</span> -&gt; current_date &gt; <span style="color: #ff0000;">20</span>
 | <span style="color: #008080;">`Tuesday</span> | <span style="color: #008080;">`Thursday</span> | <span style="color: #008080;">`Saturday</span> | <span style="color: #008080;">`Sunday</span> -&gt; <span style="color: #ff0000;">false</span>
<strong>in</strong>

<strong>if </strong>is_discount_day <strong>then </strong>
  <span style="color: #008000;">(* do something *)</span></pre>
<p>The variable is defined in the same scope it is used in, and it assumes that <code>current_day</code> and <code>current_date</code> values have been defined previously in that scope. Extraction-and-reuse would go further :</p>
<pre style="padding-left: 30px;"><strong>type </strong>weekday =
  [ <span style="color: #008080;">`Monday</span> | <span style="color: #008080;">`Tuesday</span> | <span style="color: #008080;">`Wednesday</span>
  | <span style="color: #008080;">`Thursday</span> | <span style="color: #008080;">`Friday</span> | <span style="color: #008080;">`Saturday</span> | <span style="color: #008080;">`Sunday</span> ]
<strong>
let </strong>is_discount_day (day:weekday) date =
<span style="color: #008080;">  List</span>.mem day [<span style="color: #008080;">`Monday</span>;<span style="color: #008080;">`Wednesday</span>;<span style="color: #008080;">`Friday</span>] <strong>&amp;&amp;</strong> date &gt; <span style="color: #ff0000;">20</span>

...

  <strong>if </strong>is_discount_day current_day current_date <strong>then </strong>
    <span style="color: #008000;">(* do something *)</span></pre>
<p>Now, <code>is_discount_day</code> is a global function available from everywhere in the code, and it uses the provided parameters to determine whether this is indeed a discount day.</p>
<h4>Memoize the results of repeated function calls</h4>
<p>OCaml has several ways to perform memoization. One of them is lazy evaluation :</p>
<pre style="padding-left: 30px;"><strong>val</strong> discount_day = <strong>lazy</strong> (is_discount_day current_day current_date)
<strong>method</strong> discount_day = <span style="color: #008080;">Lazy</span>.force discount_day</pre>
<p>The lazy expression will only be evaluated the first time the <code>Lazy.force</code> function is called on it.</p>
<p>Note that if the current day or current date can change, then the memoization actually <em>breaks</em> things !</p>
<p>Memoization is also helpful when dealing with a function that requires arguments, in which case a different result will be provided for each argument set. A common solution is to use a hash table to store these :</p>
<pre style="padding-left: 30px;"><strong>let</strong> fibonacci =
  <strong>let</strong> memo = <span style="color: #008080;">Hashtbl</span>.create <span style="color: #ff0000;">100</span> <strong>in</strong>
  <strong>let rec</strong> fib n =
    <strong>if</strong> n &lt; <span style="color: #ff0000;">2</span> <strong>then</strong> n <strong>else</strong> <span style="color: #008000;">(* base cases: fib 0 = 0 and fib 1 = 1 *)</span>
    <strong>try</strong> <span style="color: #008080;">Hashtbl</span>.find memo n <strong>with </strong><span style="color: #008080;">Not_found</span> -&gt;
      <strong>let</strong> result = fib (n-<span style="color: #ff0000;">1</span>) + fib (n-<span style="color: #ff0000;">2</span>) <strong>in</strong>
      <span style="color: #008080;">Hashtbl</span>.add memo n result ; result
  <strong>in</strong> fib</pre>
<p>This works fine for short-lived functions — don&#8217;t do this for global functions that might stick around for a long time, because the memoization hash table will grow and its contents will never be garbage-collected. If you really have to, use a <em>weak</em> hash table, such as Batteries&#8217; <code>BatInnerWeaktbl</code>, so that the garbage collector may reclaim the memoized values when it runs out of memory.</p>
<p>Also, don&#8217;t overdo memoization : it only works when arguments are reliably passed more than once <em>and</em> the time to compute the value is significantly larger than the time to retrieve and store it <em>and</em> it is worth the memory usage <em>and</em> the function has no side-effects.</p>
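<p>For non-recursive functions, the hash table approach can be packaged into a generic wrapper. The sketch below assumes OCaml 4.05 or later (for <code>Hashtbl.find_opt</code>); <code>memoize</code>, <code>slow_square</code> and <code>fast_square</code> are names made up for the example, and the call counter exists only to show that the second call never reaches the wrapped function.</p>
<pre style="padding-left: 30px;"><code>(* Wraps a one-argument function with a hash-table cache.
   Only suitable for functions without meaningful side-effects. *)
let memoize f =
  let memo = Hashtbl.create 16 in
  fun x -&gt;
    match Hashtbl.find_opt memo x with
    | Some y -&gt; y
    | None -&gt;
      let y = f x in
      Hashtbl.add memo x y ; y

let calls = ref 0
let slow_square n = incr calls ; n * n   (* increments a counter to trace calls *)
let fast_square = memoize slow_square

let () =
  assert (fast_square 12 = 144) ;
  assert (fast_square 12 = 144) ;
  assert (!calls = 1)   (* the second call was served from the table *)
</code></pre>
<p>The table lives as long as <code>fast_square</code> does, so the warning above about long-lived memoized functions applies here as well.</p>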
<h4>Use the seven list manipulation primitives</h4>
<p>Almost any processing on collections of items can be expressed in terms of seven fundamental patterns. Recognizing those patterns can help improve the clarity of both the code and the underlying algorithm.</p>
<p><strong>1. Map</strong> transforms a list into another, item by item, in linear time. Use a map operation when all you need is a one-to-one transformation. The line below extracts three recipes from the database using their identifier.</p>
<pre style="padding-left: 30px;"><strong>let</strong> recipes = <span style="color: #008080;">List</span>.map from_database [ <span style="color: #ff0000;">"omelet"</span> ; <span style="color: #ff0000;">"cheeseburger"</span> ; <span style="color: #ff0000;">"risotto"</span> ]</pre>
<p><strong>2. Reduce</strong> transforms a list of values into a single value by repeatedly applying a function that combines together two values into one. The typical example is a fold, which uses a function to combine each list element, in turn, with an accumulator. It can be used to extract the sum of values in a list, for example :</p>
<pre style="padding-left: 30px;"><strong>let</strong> total = <span style="color: #008080;">List</span>.fold_left (+) <span style="color: #ff0000;">0</span> [ <span style="color: #ff0000;">5</span> ; <span style="color: #ff0000;">6</span> ; <span style="color: #ff0000;">3</span> ; <span style="color: #ff0000;">8</span> ; <span style="color: #ff0000;">9</span> ; <span style="color: #ff0000;">0</span> ; <span style="color: #ff0000;">7</span> ; <span style="color: #ff0000;">6</span> ]</pre>
<p>This transform allows a preliminary map step which transforms the values inside the list into values that can be combined. For instance, to find the age of the oldest person in a list of people :</p>
<pre style="padding-left: 30px;"><strong>let</strong> oldest = <span style="color: #008080;">List</span>.fold_left (<strong>fun</strong> acc person -&gt; max acc person.age) <span style="color: #ff0000;">0</span> people</pre>
<p><strong>3. Extract </strong>works like a map, but the transformation function returns zero, one or more results for each call. All the results are included in the final list. The most elementary implementation is literally to have a map (that transforms a list into a list of lists) followed by a concatenation (that transforms a list of lists into a list). For instance, to get all the ingredients involved in a list of recipes :</p>
<pre style="padding-left: 30px;"><strong>let </strong>ingredients =
  <span style="color: #008080;">List</span>.concat (<span style="color: #008080;">List</span>.map (<strong>fun</strong> recipe -&gt; recipe.ingredients) recipes)</pre>
<p><strong>4. Filter</strong> is a subset of Extract where the transform may not return more than one result. It may still return none, so its result is simply an optional type. For instance, to extract the list of all recipes that have a wine recommendation along with their recommended wine :</p>
<pre style="padding-left: 30px;"><strong>let</strong> wines = <span style="color: #008080;">BatList</span>.filter_map
  (<strong>fun</strong> recipe -&gt; <span style="color: #008080;">BatOption</span>.map (<strong>fun</strong> wine -&gt; recipe,wine) recipe.wine) recipes</pre>
<p>The OCaml language also provides a standard <code>List.filter</code> function which keeps values for which a property is true. For instance, to get the list of recipes that have a wine recommendation :</p>
<pre style="padding-left: 30px;"><strong>let</strong> have_wines = <span style="color: #008080;">List</span>.filter (<strong>fun</strong> recipe -&gt; recipe.wine &lt;&gt; <span style="color: #008080;">None</span>) recipes</pre>
<p>This approach is weaker — the wines in the resulting list are still treated as optional by the type system, so you will need bogus pattern matching for a case that never happens (no wine) to extract the actual wine values. A <code>filter_map</code> lets you encode the filter property in the type of the result, which makes using the filtered list easier.</p>
<p><strong>5. Sort</strong> unsurprisingly sorts the list. The canonical sort — using the canonical order relationship — is a theoretical curiosity, and in practice most sorts use a <em>projection function</em> p such that A &lt; B iff p(A) &lt; p(B). This is best illustrated in SQL, in the form of the ORDER BY &lt;projection&gt; statement. Two useful helper functions :</p>
<pre style="padding-left: 30px;"><strong>let</strong> project compare p a b = compare (p a) (p b)
<strong>let</strong> descending compare a b = compare b a</pre>
<p>For instance, to sort the list of recipes based on how long each of them takes :</p>
<pre style="padding-left: 30px;"><strong>let</strong> by_duration =
  <span style="color: #008080;">List</span>.sort (project compare (<strong>fun</strong> recipe -&gt; recipe.duration)) recipes</pre>
<p><strong>6. Group</strong> works like Sort, but further regroups « equal » items together by returning a list of lists. It works using a comparison function that is usually based on a projection function. For instance, to get three lists containing one-star, two-star and three-star recipes :</p>
<pre style="padding-left: 30px;"><strong>let</strong> by_stars =
  <span style="color: #008080;">BatList</span>.group (project compare (<strong>fun</strong> recipe -&gt; recipe.stars)) recipes</pre>
<p>There is a special case when the projection function returns booleans, which is known as a <em>partition</em>. For instance, to extract recipes that are desserts and recipes that are not :</p>
<pre style="padding-left: 30px;"><strong>let</strong> desserts, non_desserts =
<span style="color: #008080;">  BatList</span>.partition (<strong>fun</strong> recipe -&gt; recipe.is_dessert) recipes</pre>
<p><strong>7. Search</strong> extracts one element from the list (if possible) based on a certain condition or property. Elementary searches are « first element » and « last element. » More complex searches : finding a value by key in a list of key-value pairs using <code>List.assoc</code> or using a predicate using <code>List.find</code>. The heavy-duty search tool is <code>BatList.find_map</code>, used below to find a recipe that is recommended by at least one person, and the recommending person :</p>
<pre style="padding-left: 30px;"><strong>let </strong>recipe, recommender = <span style="color: #008080;">BatList</span>.find_map
  (<strong>fun</strong> recipe -&gt; <span style="color: #008080;">BatOption</span>.map (<strong>fun </strong>person -&gt; recipe, person) recipe.recommended)
  recipes</pre>
<p>These seven patterns can be used in conjunction to perform almost any algorithm on collections, sequences or lists. For a more complex example, assume we need the ten ingredients that appear the most often in desserts. We would filter recipes by desserts, extract their ingredients, group them by name, sort the sub-lists by list length and take the name of the first element of the first ten sub-lists :</p>
<pre style="padding-left: 30px;"><strong>open </strong><span style="color: #008080;">BatPervasives</span> <span style="color: #008000;">(* for operator |&gt; *)</span>
<strong>open </strong><span style="color: #008080;">BatList
</span>
<strong>let</strong> ten_best_ingredients recipes = recipes
  |&gt; filter (<strong>fun</strong> r -&gt; r.is_dessert)
  |&gt; map (<strong>fun</strong> r -&gt; r.ingredients) |&gt; concat
  |&gt; group (project compare (<strong>fun</strong> i -&gt; i.name))
  |&gt; sort (descending (project compare length))
  |&gt; take <span style="color: #ff0000;">10</span>
  |&gt; map hd
  |&gt; map (<strong>fun</strong> i -&gt; i.name)</pre>
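<p>For readers without Batteries, the same pipeline can be sketched with the standard library alone (OCaml 4.01 or later for <code>|&gt;</code>). The record types, the sample data and the hand-written <code>group</code> and <code>take</code> helpers are assumptions made for this example; they only approximate what <code>BatList</code> provides.</p>
<pre style="padding-left: 30px;"><code>type ingredient = { name : string }
type recipe = { is_dessert : bool ; ingredients : ingredient list }

let project compare p a b = compare (p a) (p b)
let descending compare a b = compare b a

(* Sorts the list, then gathers equal neighbours into sub-lists. *)
let group cmp list =
  let add x = function
    | (y :: _ as g) :: rest when cmp x y = 0 -&gt; (x :: g) :: rest
    | groups -&gt; [x] :: groups
  in
  List.fold_right add (List.sort cmp list) []

(* Keeps at most the first n elements. *)
let rec take n = function
  | x :: rest when n &gt; 0 -&gt; x :: take (n - 1) rest
  | _ -&gt; []

let most_used_ingredients count recipes =
  recipes
  |&gt; List.filter (fun r -&gt; r.is_dessert)
  |&gt; List.map (fun r -&gt; r.ingredients) |&gt; List.concat
  |&gt; group (project compare (fun i -&gt; i.name))
  |&gt; List.sort (descending (project compare List.length))
  |&gt; take count
  |&gt; List.map List.hd
  |&gt; List.map (fun i -&gt; i.name)

let ing name = { name }

let recipes = [
  { is_dessert = true  ; ingredients = [ ing "sugar" ; ing "egg" ; ing "flour" ] } ;
  { is_dessert = true  ; ingredients = [ ing "sugar" ; ing "egg" ] } ;
  { is_dessert = true  ; ingredients = [ ing "sugar" ] } ;
  { is_dessert = false ; ingredients = [ ing "salt" ; ing "egg" ] } ;
]

(* sugar appears in three desserts, egg in two, flour in one *)
let () = assert (most_used_ingredients 2 recipes = [ "sugar" ; "egg" ])
</code></pre>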
<p><small>Article Image &copy; brewbooks &mdash; <a href="http://www.flickr.com/photos/brewbooks/3203211847/">Flickr</a></small></p>
]]></content:encoded>
			<wfw:commentRss>http://www.nicollet.net/2011/11/basic-patterns-for-everyday-programming/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Comment Branches</title>
		<link>http://www.nicollet.net/2011/11/comment-branches/</link>
		<comments>http://www.nicollet.net/2011/11/comment-branches/#comments</comments>
		<pubDate>Thu, 17 Nov 2011 18:42:36 +0000</pubDate>
		<dc:creator>Victor Nicollet</dc:creator>
				<category><![CDATA[Functional]]></category>
		<category><![CDATA[Architecture]]></category>
		<category><![CDATA[Design Patterns]]></category>
		<category><![CDATA[Productivity]]></category>

		<guid isPermaLink="false">http://www.nicollet.net/?p=2619</guid>
		<description><![CDATA[Your development job is making changes in your software. Writing, testing and debugging those changes takes some time. If your job is anywhere as hectic as mine, you will have to fix and deploy urgent patches, even when your application code is in a half-written, half-debugged state because of the feature of the month. This is [...]]]></description>
			<content:encoded><![CDATA[<p><img class="aligncenter size-full wp-image-2620" title="branches" src="http://www.nicollet.net/wp-content/uploads/2011/11/branches.png" alt="" width="675" height="100" />Your development job is making changes in your software. Writing, testing and debugging those changes takes some time.</p>
<p>If your job is anywhere as hectic as mine, you will have to fix and deploy urgent patches, even when your application code is in a half-written, half-debugged state because of <em>the feature of the month</em>.</p>
<p>This is what <em>branches</em> are for. You keep two versions of the code, one of which is called the <strong>trunk </strong>and is always ready for deployment, and another which holds the changes that you are working on.</p>
<p>When your feature is done, you <em>merge</em> the two versions together. You want to keep the merge operation painless. To do so, you have several kinds of branches available.</p>
<p>The <strong>repository branch</strong> is built into your SourceSafe/subversion/git/whatever. It creates two independent copies, and you need to migrate changes from the trunk to every branch out there as soon as possible, or the merge will make you wish for a sweet and merciful death.</p>
<p>By the way, changeset-oriented tools (like git or mercurial) make this easier, while revision-oriented tools (like subversion) make it harder.</p>
<p>The <strong>feature branch</strong> is done using programming logic. The code you deploy to production supports the new feature, but it is turned off for everyone except yourself. This technique is great for adding features, but inefficient when changing existing ones.</p>
<p>A side effect of the feature branch is that you can stress-test new code by rolling it out to increasing numbers of users progressively.</p>
<p>The <strong>comment branch</strong> is an odd gambit. It involves ripping out an entire module and replacing it with another that has a <em>different</em> interface. This will involve large amounts of re-wiring all over the code base, and these will take hours or days before they can be compiled, let alone <em>tested</em>.</p>
<p>Use a comment structure such as this one:</p>
<pre style="padding-left: 30px;"><span style="color: #008000;">/*[*/</span> old code <span style="color: #008000;">/*|* new code *]*/</span></pre>
<p>It is trivial to build a text-replacement macro that turns the above into the code below and back:</p>
<pre style="padding-left: 30px;"><span style="color: #008000;">/*[* old code *|*/</span> new code <span style="color: #008000;">/*]*/</span></pre>
<p>Use the macro to switch between development mode (when you write new code and desperately try to get it to compile) and fix mode (when you edit the old code and deploy it). For consistency, always commit the <em>old </em>version to the repository.</p>
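<p>One possible shape for such a macro, sketched in OCaml with only the standard library, is a pair of functions that rewrite the three markers in a source string. The names <code>replace_all</code>, <code>to_new</code> and <code>to_old</code> are made up for this example, and each direction assumes that the whole input is currently in the other mode.</p>
<pre style="padding-left: 30px;"><code>(* Replaces every occurrence of the non-empty string sub in s with by. *)
let replace_all ~sub ~by s =
  let n = String.length sub and len = String.length s in
  let buf = Buffer.create len in
  let rec go i =
    if i &gt; len - n then Buffer.add_string buf (String.sub s i (len - i))
    else if String.sub s i n = sub then (Buffer.add_string buf by ; go (i + n))
    else (Buffer.add_char buf s.[i] ; go (i + 1))
  in
  go 0 ; Buffer.contents buf

(* Fix mode to development mode: comment out old code, enable new code. *)
let to_new s =
  s |&gt; replace_all ~sub:"/*[*/" ~by:"/*[*"
    |&gt; replace_all ~sub:"/*|*"  ~by:"*|*/"
    |&gt; replace_all ~sub:"*]*/"  ~by:"/*]*/"

(* Development mode back to fix mode. *)
let to_old s =
  s |&gt; replace_all ~sub:"/*]*/" ~by:"*]*/"
    |&gt; replace_all ~sub:"*|*/"  ~by:"/*|*"
    |&gt; replace_all ~sub:"/*[*"  ~by:"/*[*/"

let () =
  let old_mode = "/*[*/ old code /*|* new code *]*/" in
  let new_mode = "/*[* old code *|*/ new code /*]*/" in
  assert (to_new old_mode = new_mode) ;
  assert (to_old new_mode = old_mode)
</code></pre>
<p>The functions are not idempotent: applying <code>to_new</code> to text that is already in development mode would mangle the markers, so a real tool would need to track or detect the current mode.</p>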
<p>Why use <strong>comment branches</strong> instead of <strong>repository branches</strong> ? Maybe your source control tool sucks at branches. I use Subversion. Yes, I know. Legacy, pain and unlikely hopes of a brighter future.</p>
<p>When a trunk change occurs in a part that has been erased or reworked in the branch, that change <em>will</em> cause a conflict that <em>will</em> require manual intervention. Even with git or mercurial. For a large number of small changes sprinkled over a large codebase that routinely receives many small updates, repository branches turn into a merge minefield.</p>
<p>Does your branch involve a small number of well-defined files ?</p>
<p>Then you should use <strong>repository branches</strong>, because conflicts will only happen in those files, and will usually be easy to fix.</p>
<p>Does your branch involve many changes in many files everywhere in the project ?</p>
<p>Then use <strong>comment branches</strong>.</p>
<p>Last and possibly least, there is the <strong>TODO-branch</strong>. This involves non-breaking, purely cosmetic changes. 25% of my project uses this syntax for historical reasons:</p>
<pre style="padding-left: 30px;">Table.get id |-&gt; function
   | None       -&gt; return 0
   | Some value -&gt; return value.count</pre>
<p>Then, a convention change happened, and this is used instead:</p>
<pre style="padding-left: 30px;">let! value_opt = breathe (Table.get id) in
match value_opt with  
   | None       -&gt; return 0
   | Some value -&gt; return value.count</pre>
<p>Then, another convention change happened, and this should be used instead:</p>
<pre style="padding-left: 30px;">let! value = breathe_req_or (return 0) (Table.get id) in
return value.count</pre>
<p>And then, there&#8217;s the current version:</p>
<pre style="padding-left: 30px;">let! value = breathe_req_or (return 0) $ Table.get id in
return value.count</pre>
<p>Whenever I change coding conventions, I do not spend the time to reformat the tens of thousands of lines of code in my application. That would be wasteful. Instead, every time a piece of code is refactored, it is refactored to the most recent style.</p>
<p>The same happens when using an old and a new version of a given API. My code uses two libraries for handling HTML forms, mixes JavaScript and CoffeeScript, and contains a variety of similar two-hammers-one-nail situations.</p>
<p>These are, for all practical purposes, branches. They are work that is being performed for long durations. The benefit of TODO-branches is that code in the middle of such changes is still compatible with the trunk. It all happens in the head of the developer, who remembers what changes should be done the next time a piece of code is rewritten.</p>
<p><small>Article Image &copy; Dominic Alves &mdash; <a href="http://www.flickr.com/photos/dominicspics/422131893/">Flickr</a></small></p>
]]></content:encoded>
			<wfw:commentRss>http://www.nicollet.net/2011/11/comment-branches/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Object-Oriented For the Win.</title>
		<link>http://www.nicollet.net/2011/04/object-oriented-for-the-win/</link>
		<comments>http://www.nicollet.net/2011/04/object-oriented-for-the-win/#comments</comments>
		<pubDate>Fri, 01 Apr 2011 21:26:03 +0000</pubDate>
		<dc:creator>Victor Nicollet</dc:creator>
				<category><![CDATA[Imperative]]></category>
		<category><![CDATA[Architecture]]></category>
		<category><![CDATA[Design Patterns]]></category>
		<category><![CDATA[Learning]]></category>

		<guid isPermaLink="false">http://www.nicollet.net/?p=2335</guid>
		<description><![CDATA[Readers, beware: this is an opinionated piece. Feel free to curse appropriately in the comments below, should you find the need to do so. I&#8217;m writing this because of yet another encounter with an object-oriented zombie, and one without the excuse of being fresh out of school. They all stand united behind the motto that [...]]]></description>
			<content:encoded><![CDATA[<p>Readers, beware: this is an opinionated piece. Feel free to curse appropriately in the comments below, should you find the need to do so. I&#8217;m writing this because of yet another encounter with an object-oriented zombie, and one without the excuse of being fresh out of school. They all stand united behind the motto that <strong>Object-oriented programming makes your code easy to reuse, debug, maintain and extend</strong> and I happen to disagree.</p>
<p>Something interesting happened in the Pacific Ocean after World War II: during the war, American soldiers had set up air strips on Pacific islands, complete with little control towers that had radio stations. And the natives observed that those soldiers would speak into a weird piece of metal, and a giant iron bird would land from the sky and deliver food and supplies. Then, the war ended, the soldiers went home, and the natives tried to bring the giant iron birds back by building shacks that looked like legitimate control towers, and spoke strange words into rocks that looked like legitimate microphones, and it worked!</p>
<p><a href="http://www.flickr.com/photos/frankjuarez/461208642/"><img class="alignright size-full wp-image-2337" style="margin-left: 15px;" title="flower-1" src="http://www.nicollet.net/wp-content/uploads/2011/04/flower-1.jpg" alt="" width="100" height="100" /></a>No, it didn&#8217;t. Cargo planes did not come back because it was not the shape of the microphone and antenna that brought in the planes, it was their invisible electromagnetic properties. These are the <em>cargo cults</em>: imitating the outside appearance of things that work and expecting the imitation to work as well. The same happens with object-oriented programming — the use of classes, or inheritance, or design patterns is not the reason why good code is good: they are merely tools which happened to be applied by knowledgeable and skilled programmers in order to avoid problems that they knew would happen if they did not preempt their appearance. To look at their code, find out that they used classes, and deduce that classes were responsible for their success is about as rational as expecting to steal the strength of your defeated foe when you eat his flesh. <strong>Object-oriented programming can only solve design problems you know you need to solve.</strong></p>
<p>Imagine a piece of old, spaghetti C code written in procedural style, with functions calling each other and accessing global variables all over the place. I suspect many young graduates these days never had to deal with these — and I wouldn&#8217;t have either, were it not for my unrequited love for mental anguish and programmer pain. Back to the point: in your mind, draw an arrow from A to B if function B calls function A or accesses global variable A. This is a dependency graph: if you were to change the run-time behavior of A, then the behavior of B would almost certainly change as well. By following all outgoing arrows from the point you altered, you can find every element that could possibly be affected by it.</p>
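<p>As a rough OCaml sketch of that walk along the arrows (the graph below and its node names are invented for the illustration, they do not come from any real program):</p>
<pre style="padding-left: 30px;">module S = Set.Make (String)

(* Hypothetical dependency graph: each entry lists the arrows leaving a node,
   that is, the elements that call it or read it. *)
let arrows = [
  ("player_hp",    ["deal_damage"; "draw_status"]);
  ("deal_damage",  ["monster_turn"]);
  ("monster_turn", ["game_loop"]);
  ("draw_status",  ["game_loop"]);
  ("game_loop",    []);
]

(* Every element that may be affected by a change to [start]:
   follow outgoing arrows until no new node shows up. *)
let affected start =
  let rec visit seen node =
    if S.mem node seen then seen
    else
      let seen = S.add node seen in
      let next = try List.assoc node arrows with Not_found -&gt; [] in
      List.fold_left visit seen next
  in
  S.elements (S.remove start (visit S.empty start))

(* Prints game_loop, then monster_turn. *)
let () = List.iter print_endline (affected "deal_damage")</pre>
<p>Changing <code>player_hp</code> instead would reach every other node in this toy graph, which is the whole problem with widely shared globals.</p>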
<p>No programming style or design methodology on earth is going to change this; whether object-oriented or aspect-oriented or functional, functions call other functions, and changing one part will have an impact on many other parts. That is something we accept, because propagating changes is part of the programming job, and we have even elevated refactoring (the propagation of the <em>absence</em> of changes) to the level of the best thing since sliced bread.</p>
<p><a href="http://www.flickr.com/photos/tanaka_juuyoh/2585559389/"><img class="alignleft size-full wp-image-2341" style="margin-right: 15px;" title="flower-2" src="http://www.nicollet.net/wp-content/uploads/2011/04/flower-2.jpg" alt="" width="180" height="240" /></a>And since we have to track down all the dependencies of a given entity sooner or later, our national sport is to make it as easy as possible. One way is to let the compiler or build process help — compilers detect type mismatches that are usually a hint of non-propagated major behavior changes, while automated unit tests and regression tests handle more subtle changes, and both of these are fairly independent of your programming style. Another way is to willingly reduce the number of dependencies through architectural constraints — Model-View-Controller prevents a model from being dependent on a controller, because altering a controller should not change the behavior of a model. These are fairly common ways to reduce static dependencies.</p>
<p>Then, there are the dynamic dependencies — functions which alter the behavior of other functions at <em>run-time</em>. Remember those global variables from the C program? If function A writes to the global variable and function B reads from the global variable, then calling A will probably change the behavior of B. Runtime dependencies are fairly obvious to map: every time a function writes to a global variable, draw an arrow from the function to the global variable and mark the function as having a side-effect. Then recursively do the same for every function that calls it, and so on until you run out of functions. Your dependency graph just became a lot hairier than it was before.</p>
<p>Such dependencies are the reason why working on the report-printing feature in your invoicing software somehow managed to break the database storage : even though both modules are independent and the static dependency tree for report-printing does not flow into the static dependency tree for database storage, there are a handful of global variables that allow your changes to follow dynamic dependencies and leave a flag unset when it should have been set and ultimately blow up the nuclear silo.</p>
<p>So, when you have global mutable state, then changes can jump from module to module through those pieces of global state. Procedural programming mostly dealt with this by cleanly wrapping up any global state in a module, and having modules interact with each other by means of interfaces that managed to guarantee some invariants on the global state. But still, some problems remained. I am reminded here of the code to Dungeon Crawl, an open-source game written in procedural style, where the module responsible for dealing damage to the player will make frequent calls to the module responsible for outputting messages about the damage being dealt (&#8220;The hobgoblin hits you!&#8221;). As written, you cannot use the damage-dealing code without displaying messages, so writing an AI module that tries to predict the outcome of an attack is harder than it seems, <em>because it cannot use the existing damage-dealing code to run predictions</em>.</p>
<div id="attachment_2343" class="wp-caption aligncenter" style="width: 360px"><a href="http://www.dungeoncrawl.org/"><img class="size-full wp-image-2343 " title="ss-dos-sm" src="http://www.nicollet.net/wp-content/uploads/2011/04/ss-dos-sm.png" alt="" width="350" height="250" /></a><p class="wp-caption-text">They actually have sludge elves.</p></div>
<p>Functional programming solved the global mutable state issue by eliminating global mutable state altogether. In the aforementioned example, the damage-dealing module would return the new damaged player along with a list of messages that were generated, and these messages could then be discarded silently in the AI subroutine, or forwarded to the screen in the main routine. I am a big fan of functional programming, but let&#8217;s face it: not only is it harder on the brain than good old set-variable-values-everywhere programming, but it is also poorly suited to situations which have an implicit reliance on global mutable state, such as web servers connected to a persistent database. I look forward to a database that is friendly to pure functional code, but it&#8217;s not there yet.</p>
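<p>A minimal sketch of what that could look like in OCaml. The <code>player</code> type and the function names are assumptions made up for this example, not the actual Dungeon Crawl code:</p>
<pre style="padding-left: 30px;">type player = { name : string; hp : int }

(* Pure damage-dealing: returns the damaged player and the messages that
   were generated, instead of printing them. *)
let deal_damage attacker amount player =
  let player' = { player with hp = max 0 (player.hp - amount) } in
  let messages =
    Printf.sprintf "The %s hits you!" attacker
    :: (if player'.hp = 0 then ["You die..."] else [])
  in
  (player', messages)

(* Main routine: forward the messages to the screen. *)
let play attacker amount player =
  let player', messages = deal_damage attacker amount player in
  List.iter print_endline messages;
  player'

(* AI subroutine: run the same code as a prediction, discard the messages. *)
let would_kill attacker amount player =
  let player', _ = deal_damage attacker amount player in
  player'.hp = 0</pre>
<p>Both <code>play</code> and <code>would_kill</code> share the same damage-dealing code; only the caller decides what happens to the messages.</p>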
<p>Both of these solutions rely on <em>encapsulation</em>: the implementation of a procedural or pure functional module is hidden, and only available through the functions of its interface. This creates a dependency bottleneck, so any changes performed within the implementation can only leak out through the interface in tightly controlled ways: at compile-time, those are called contracts, at run-time they are called invariants.</p>
<p>Object-oriented programming works by deprecating global mutable state. It&#8217;s not completely gone, and some programmers yearning for the good old days still try to enjoy some global goodness by using the Gang of Four Singleton pattern, but it&#8217;s mostly deprecated. The sickle and hammer of object-oriented programming are dependency injection and late binding, and it uses them to break down the structure of static dependency graph bourgeoisie.</p>
<p>Yes, object-oriented programming is all about building the dependency graph at run-time. Then again, a program that generates machine code at run-time does the same. Object-oriented programming merely provides a set of tools that are easier on the human brain than the extremes of generating machine code at run-time. As an aside, keep in mind that it is not the only toolbox available to you: any language with closures can do the same, and although closures are both more elegant and slightly more complex to handle in the general case, they have found their way into event-based programming because an event handler is just, well, a closure. This is why jQuery passes functions around everywhere instead of requiring the user to implement an IAjaxResultVisitor interface: it&#8217;s shorter, but still fairly easy to understand in that context. If you push things far enough, most objects are usually nothing more than dictionaries of closures.</p>
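<p>To make the dictionaries-of-closures remark concrete, here is a small OCaml sketch. The <code>counter</code> type is a made-up example: a record of closures stands in for the object, and the captured reference plays the part of a private field.</p>
<pre style="padding-left: 30px;">(* An object as a record of closures sharing private state. *)
type counter = {
  incr : unit -&gt; unit;
  get  : unit -&gt; int;
}

let make_counter () =
  (* Captured by both closures, unreachable from the outside. *)
  let count = ref 0 in
  { incr = (fun () -&gt; count := !count + 1);
    get  = (fun () -&gt; !count) }

let () =
  let c = make_counter () in
  c.incr (); c.incr ();
  Printf.printf "%d\n" (c.get ())   (* prints 2 *)</pre>
<p>Which closures end up in the record is decided at run-time, which is all that late binding really asks for.</p>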
<p>Back to the point: writing a proper object-oriented program with those tools is about the same as creating a statue out of stone using sculpting tools, and inheritance will not turn someone into an object expert any more than holding a chisel turns them into a mutant ninja turtle disambiguation page. How does it happen?</p>
<p>The original problem with procedural programming was that there was no easy way to unplug the damage-dealing module from the message-printing module. Dependency injection means that the damage-dealing module (which is now an object) does not know about the message-printing module (which is now also an object); instead, whoever needs to use damage-dealing code must provide it with a message-printing object that will be used to print whatever messages come up. That message-printing object might actually print those messages to the screen, or it might silently discard them, or it might keep them around and only display them to the screen if a specific event happens, or it might be a unit-testing mock, or have any number of other behaviors. In terms of dependency graphs, there is no static connection between damage-dealing and message-printing: the program decides at run-time what message-printing object should be bound to what damage-dealing object, and can create new objects on the fly should the situation require it.</p>
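<p>Sketched with OCaml objects, this could look as follows. Every class and method name here is an assumption chosen for the illustration; the point is only that <code>damage_dealer</code> never names a concrete printer.</p>
<pre style="padding-left: 30px;">(* What damage-dealing needs from the outside world. *)
class type message_printer = object
  method print : string -&gt; unit
end

(* Prints messages to the screen. *)
let screen_printer : message_printer = object
  method print msg = print_endline msg
end

(* Silently discards messages, for instance during AI predictions. *)
let silent_printer : message_printer = object
  method print (_ : string) = ()
end

(* Keeps messages around, for instance as a unit-testing mock. *)
class recording_printer = object
  val mutable log : string list = []
  method print msg = log &lt;- msg :: log
  method messages = List.rev log
end

(* The damage-dealing object is handed its printer when it is created. *)
class damage_dealer (out : message_printer) = object
  method hit attacker amount hp =
    out#print (Printf.sprintf "The %s hits you!" attacker);
    max 0 (hp - amount)
end

let () =
  let recorder = new recording_printer in
  let dealer = new damage_dealer (recorder :&gt; message_printer) in
  let hp = dealer#hit "hobgoblin" 3 10 in
  assert (hp = 7);
  assert (recorder#messages = ["The hobgoblin hits you!"])</pre>
<p>Swapping <code>recorder</code> for <code>screen_printer</code> or <code>silent_printer</code> changes where the messages go without touching <code>damage_dealer</code>.</p>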
<p>Aside from that, there are no significant architectural benefits to using objects or classes &mdash; although touted as an object-oriented achievement, <em>encapsulation</em> is commonly available in functional and procedural programming as well, and using <code>object.function()</code> instead of <code>function(object)</code> is a matter of taste, not architecture. </p>
<p>As a final word, I am fairly doubtful of the ability of schools to teach their students about good object-oriented programming. They can certainly teach them how to use the tools, but the pains that object-oriented programming is meant to solve only become obvious when you have to work with a large, multi-developer project over a long duration and with changing requirements &mdash; anything less, and your pains will be subtle feelings of awkwardness instead. Definitely not something you can learn from. <strong>If you have never written an unmaintainable piece of mud, you cannot know how object-oriented programming can keep you from writing one.</strong> </p>
<p>You may now dish out punishments in the comment box below. Have fun.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.nicollet.net/2011/04/object-oriented-for-the-win/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Web Workers Camp 2010</title>
		<link>http://www.nicollet.net/2010/10/web-workers-camp-2010/</link>
		<comments>http://www.nicollet.net/2010/10/web-workers-camp-2010/#comments</comments>
		<pubDate>Fri, 29 Oct 2010 23:39:44 +0000</pubDate>
		<dc:creator>Victor Nicollet</dc:creator>
				<category><![CDATA[Functional]]></category>
		<category><![CDATA[Architecture]]></category>
		<category><![CDATA[BarCamp]]></category>
		<category><![CDATA[Design Patterns]]></category>
		<category><![CDATA[Objective Caml]]></category>
		<category><![CDATA[Slides]]></category>

		<guid isPermaLink="false">http://www.nicollet.net/?p=2119</guid>
		<description><![CDATA[I&#8217;ll be participating in the Web Workers BarCamp in Paris today, discussing topics vaguely similar to those of my previous blog post. My slides are downloadable here [pdf], or you can watch them below: See you around!]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ll be participating in the Web Workers BarCamp in Paris today, discussing topics vaguely similar to those of my previous blog post. My slides are downloadable here [<a href="http://www.nicollet.net/files/barcamp10.pdf">pdf</a>], or you can watch them below:</p>
<div id="__ss_5613020" style="margin: auto; width: 425px;"><object id="__sse5613020" classid="clsid:d27cdb6e-ae6d-11cf-96b8-444553540000" width="425" height="355" codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=6,0,40,0"><param name="allowFullScreen" value="true" /><param name="allowScriptAccess" value="always" /><param name="src" value="http://static.slidesharecdn.com/swf/ssplayer2.swf?doc=pres-101029183233-phpapp01&amp;stripped_title=pres-5613020&amp;userName=victornicollet" /><param name="name" value="__sse5613020" /><param name="allowfullscreen" value="true" /><embed id="__sse5613020" type="application/x-shockwave-flash" width="425" height="355" src="http://static.slidesharecdn.com/swf/ssplayer2.swf?doc=pres-101029183233-phpapp01&amp;stripped_title=pres-5613020&amp;userName=victornicollet" name="__sse5613020" allowscriptaccess="always" allowfullscreen="true"></embed></object></div>
<p>See you around!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.nicollet.net/2010/10/web-workers-camp-2010/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Interest(ing) rates</title>
		<link>http://www.nicollet.net/2010/01/interesting-rates/</link>
		<comments>http://www.nicollet.net/2010/01/interesting-rates/#comments</comments>
		<pubDate>Wed, 06 Jan 2010 18:35:53 +0000</pubDate>
		<dc:creator>Victor Nicollet</dc:creator>
				<category><![CDATA[Functional]]></category>
		<category><![CDATA[Algorithms]]></category>
		<category><![CDATA[Architecture]]></category>
		<category><![CDATA[Design Patterns]]></category>
		<category><![CDATA[Psychology]]></category>
		<category><![CDATA[Useless]]></category>

		<guid isPermaLink="false">http://www.nicollet.net/?p=1244</guid>
		<description><![CDATA[The most common way of investing money is putting it in a savings account. You lend a fixed amount of money to someone, and they pay interest over that money at a predetermined rate. Let&#8217;s say you lend 1,000 € at an interest rate of 3%, paid every year: at the end of the year, [...]]]></description>
			<content:encoded><![CDATA[<p>The most common way of investing money is putting it in a savings account. You lend a fixed amount of money to someone, and they pay interest on that money at a predetermined rate. Let&#8217;s say you lend 1,000 € at an interest rate of 3%, paid every year: at the end of the year, you would receive 30 € as payment for your lending. You would spend these on fine wine or nice clothes and wait until the next year to get another 30 €, and so on.</p>
<p>Savings accounts work on the basis of <strong>simple interest</strong>: what you get paid is a linear function of both time and money. Lend for half a year? 3% ÷ 2 = 1.5%. Lend for two years? 3% × 2 = 6%.</p>
<p>An important thing to bear in mind is that interest is paid at fixed intervals, for instance at the beginning of January. You don&#8217;t have to spend those 30 €: you can leave them on the savings account and earn simple interest on them after a year (3% of 30 € is 0.90 €).</p>
<p>Using this strategy, lending for two years is done at a 6.09% rate instead of 6%, because you get interest on interest. This is known as <strong>compound interest</strong>: what you get paid is an exponential function of time. Lend for two years? (+3%)² = +6.09%. Lend for three years? (+3%)³ = +9.27%.</p>
<p>The mathematical justification is that, with a 3% interest, your total amount of money is multiplied by 1.03 every year:</p>
<p style="padding-left: 30px;">1,000 + 30 = 1,000 + 3% of 1,000 = 1,000 + 0.03 × 1,000 = 1.03 × 1,000</p>
<p>So, after two years, the amount is multiplied by 1.03 two times, and so on.</p>
<p style="padding-left: 30px;">1,060.90 = 1.03 × 1,030 = 1.03 × 1.03 × 1,000</p>
<p>In short, <strong>percentages have a multiplicative effect</strong>.</p>
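<p>The same arithmetic as a small OCaml sketch (the function names <code>simple</code> and <code>compound</code> are mine, picked for the example):</p>
<pre style="padding-left: 30px;">(* Simple interest: linear in time. *)
let simple principal rate years = principal *. (1. +. rate *. years)

(* Compound interest: the amount is multiplied by (1 + rate) every year. *)
let compound principal rate years = principal *. ((1. +. rate) ** years)

let () =
  Printf.printf "simple,   2 years: %.2f\n" (simple   1000. 0.03 2.);  (* 1060.00 *)
  Printf.printf "compound, 2 years: %.2f\n" (compound 1000. 0.03 2.);  (* 1060.90 *)
  Printf.printf "compound, 3 years: %.2f\n" (compound 1000. 0.03 3.)   (* 1092.73 *)</pre>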
<p>And now, pop quiz : I&#8217;ve gained +5% weight over the winter holidays. What percentage of my weight do I have to lose to be back to normal ?</p>
<p>If you answered -5%, you missed the point. Multiplicative effect means the total change of weight would be +5% × -5% = 1.05 × 0.95 = 0.9975 = <strong>-0.25%</strong>. I would be losing too much weight !</p>
<p>The correct answer was 1 ÷ 1.05 = 0.9524, which is <strong>-4.76%</strong>.</p>
<p>Similarly, if the number of graduates of a given school increases by +10% on year one and +25% on year two, the total increase is +37.5% and not +35%.</p>
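<p>These three results can be checked with a few lines of OCaml. This is only a sketch, and the helper names (<code>factor</code>, <code>percent</code>, <code>combine</code>, <code>undo</code>) are invented for it:</p>
<pre style="padding-left: 30px;">(* A percentage change is a multiplier: +5% is 1.05, -5% is 0.95. *)
let factor pct = 1. +. pct /. 100.
let percent f = (f -. 1.) *. 100.

(* Applying one change after another multiplies the factors. *)
let combine a b = percent (factor a *. factor b)

(* The change that undoes [pct] divides by its factor. *)
let undo pct = percent (1. /. factor pct)

let () =
  Printf.printf "%+.2f%%\n" (combine 5. (-5.));   (* -0.25% *)
  Printf.printf "%+.2f%%\n" (undo 5.);            (* -4.76% *)
  Printf.printf "%+.2f%%\n" (combine 10. 25.)     (* +37.50% *)</pre>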
<h3>Duality</h3>
<p>This is where mathematicians (and computer scientists) use an interesting little concept called duality. Percentages are numbers that are easy to understand, but hard to combine. We can transform them into something that is a little bit harder to understand, but easier to combine.</p>
<p>The traditional way to transform multiplication into addition is to take logarithms, which works because of an interesting property of the exponential function:</p>
<p style="padding-left: 30px;">exp(a) × exp(b) = exp(a + b)</p>
<p>So, I wish to find a percentage operator (§) such that:</p>
<ul>
<li>we conserve some values, 0§ = 0% and 100§ = 100%</li>
<li>applying A§, then B§, is equivalent to applying (A+B)§</li>
</ul>
<p>Then this uniquely defines an operator which is called exponential percentage:</p>
<p style="padding-left: 30px;">A§ = B%  ↔  A = 100 × log(1 + B ÷ 100) ÷ log(2)</p>
<p>Some common values:</p>
<table border="0">
<tbody>
<tr>
<td>0% = 0§</td>
<td>+100% = +100§</td>
<td>-100% = -∞§</td>
<td>200% = 158.4§</td>
</tr>
<tr>
<td>+1% = +1.4§</td>
<td>+99% = +99.2§</td>
<td>-1% = -1.4§</td>
<td>-99% = -664§</td>
</tr>
<tr>
<td>+10% = +13.7§</td>
<td>+90% = +92.6§</td>
<td>-10% = -15.2§</td>
<td>-90% = -332§</td>
</tr>
<tr>
<td>+25% = +32.2§</td>
<td>+75% = +80.7§</td>
<td>+50% = +58.4§</td>
<td>-50% = -100§</td>
</tr>
</tbody>
</table>
<p><a href="http://www.nicollet.net/wp-content/uploads/2010/01/percent.png"><img class="aligncenter size-full wp-image-1246" title="percent" src="http://www.nicollet.net/wp-content/uploads/2010/01/percent.png" alt="percent" width="393" height="287" /></a></p>
<p>So, if I gained +5§ weight over the holidays, I can lose -5§ weight and be back to where I started, and if a number increases by 10§, then by 25§, it increases by 35§ overall.</p>
<p>And of course, a yearly interest rate of 4.2§ = 3% compounded over ten years is 42§ = 34%.</p>
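<p>For the curious, the conversion is a couple of lines of OCaml. The names <code>to_exp</code> and <code>of_exp</code> are made up for this sketch; the formula is the one given above.</p>
<pre style="padding-left: 30px;">(* Convert a normal percentage B% to an exponential percentage A§ and back,
   following A = 100 × log(1 + B ÷ 100) ÷ log(2). *)
let to_exp pct = 100. *. log (1. +. pct /. 100.) /. log 2.
let of_exp e = 100. *. (2. ** (e /. 100.) -. 1.)

let () =
  (* +10% then +25%: add the exponential percentages, then convert back. *)
  let total = to_exp 10. +. to_exp 25. in
  Printf.printf "%.1f§ = %+.1f%%\n" total (of_exp total);   (* 45.9§ = +37.5% *)
  (* 3% a year for ten years: multiply the exponential percentage by ten. *)
  Printf.printf "%+.1f%%\n" (of_exp (10. *. to_exp 3.))     (* +34.4% *)</pre>
<p>Without rounding the intermediate values, 10% followed by 25% comes out at 45.9§ rather than 35§, because the table above truncates and I used the § values as if they were percentages for the sake of the example.</p>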
<h3>No Free Lunch</h3>
<p>Normal percentage rules make compounding hard, but it&#8217;s reasonably easy to estimate a percentage based on a fraction. Exponential percentage rules make compounding easy, but evaluating a percentage based on real figures is harder.</p>
<p>In practice, compounding happens less often than evaluating, so humans use normal percentage rules. And computers are good at compounding through multiplication, so they don&#8217;t need exponentiation.</p>
<p>Duality does have some other uses, though. For instance, there&#8217;s the duality between two representations of complex numbers:</p>
<p style="padding-left: 30px;">a + <strong>i</strong>b = r exp <strong>i</strong>θ</p>
<p>The cartesian (a,b) notation makes it easier to add numbers, but multiplication is harder:</p>
<p style="padding-left: 30px;">(a + <strong>i</strong>b) + (c + <strong>i</strong>d) = (a+c) + <strong>i</strong>(b+d)</p>
<p>The polar (r,θ) notation makes it easier to multiply numbers, but addition is harder:</p>
<p style="padding-left: 30px;">r exp <strong>i</strong>θ × s exp <strong>i</strong>φ = (r × s) exp <strong>i</strong>(θ+φ)</p>
<p>For mathematically-oriented computer scientists, duality is a gold mine, because it lets one reduce a complex problem in one area to a simpler problem in another area (whether simpler means faster, as in the case of <a href="http://en.wikipedia.org/wiki/Fast_Fourier_transform" target="_blank">FFT</a>, or easier to think about).</p>
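<p>The complex number example fits in a few lines of OCaml. This is a sketch that uses plain pairs of floats rather than a dedicated type, and the function names are my own:</p>
<pre style="padding-left: 30px;">(* Cartesian pairs (a, b): addition is a component-wise sum. *)
let add (a, b) (c, d) = (a +. c, b +. d)

(* Polar pairs (r, theta): multiplication is a product of moduli
   and a sum of angles. *)
let mul (r, theta) (s, phi) = (r *. s, theta +. phi)

(* Going from one representation to the other. *)
let to_polar (a, b) = (sqrt (a *. a +. b *. b), atan2 b a)
let to_cartesian (r, theta) = (r *. cos theta, r *. sin theta)

(* i times i, computed on the polar side: close to (-1, 0), up to rounding. *)
let minus_one =
  let i = to_polar (0., 1.) in
  to_cartesian (mul i i)</pre>
<p>Each operation is trivial in its own representation; the cost has moved into the two conversion functions.</p>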
<h3>The Law of DSLs</h3>
<p>There&#8217;s one common duality that is fundamental in the computer world: the correspondence between data and code. In a fit of narcissism, let me sit wisely atop a tall mountain to announce <strong>Nicollet&#8217;s Law of Domain Specific Languages</strong>:</p>
<blockquote><p>Any sufficiently complex data-processing algorithm is an interpreter for a small domain-specific language, and the data being processed is a program executed by the interpreter.</p></blockquote>
<p>In some cases, this law only complicates things further. In many cases, however, the different angle it provides leads to many advantages, one of them being to transform a non-programming concept (such as an accounting file format) into a concept programmers are familiar with (a programming language).</p>
<p>A minimal grounding in language design is enough to grasp several interesting concepts about executing code, which can be quite handy when processing data:</p>
<h4>1. Compile to Bytecode</h4>
<p>Interpreters don&#8217;t execute a string of characters. They tokenize that string, turn the tokens into an abstract syntax tree representing operations, functions and variables, then turn that syntax tree into a sequence of small, executable operations. That sequence is then fed into a virtual machine (or further compiled to machine code) to perform the actual operations.</p>
<p>If the input data for your algorithm is very complex, you can begin on the other side: what will the algorithm <strong>do</strong> with the data? Will it be inserting the data into a database? Constructing a data object from bits and pieces? What you are looking for is a set of atomic operations you can apply to generate the result. Implement these operations, then start working on a translation algorithm to turn the input data into such operations.</p>
<p>There are several common and friendly representations for such atomic bytecode:</p>
<p><strong>Instruction lists</strong> are executed in order. This is your classic assembler listing, without the jumps. A typical &#8220;parse file and insert into database&#8221; algorithm would generate such an instruction list, and every instruction would be an INSERT, DELETE or UPDATE. Works best when you can read the data and generate the instructions in the right order: if you cannot get the list in the right order from the start, consider another approach.</p>
<p><strong>Dependency graphs</strong> work like makefiles: you have several instruction lists floating around with relationships between them, indicating that one list has to be executed before another. A topological sort of the graph results in a single classic instruction list you can execute. For a multi-file import, where some files contain data needed in other files, this can be the way to go.</p>
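<p>A depth-first topological sort is short enough to sketch in OCaml. The file names below are hypothetical, and a real import would attach an instruction list to each file instead of just a name:</p>
<pre style="padding-left: 30px;">(* Hypothetical multi-file import: each file lists the files it needs. *)
let needs = [
  ("orders.csv",     ["customers.csv"; "products.csv"]);
  ("products.csv",   ["categories.csv"]);
  ("customers.csv",  []);
  ("categories.csv", []);
]

(* Depth-first topological sort: a file is emitted only after everything
   it needs. Fails on circular dependencies. *)
let import_order needs =
  let rec visit path order file =
    if List.mem file order then order
    else if List.mem file path then failwith ("circular dependency on " ^ file)
    else
      let deps = try List.assoc file needs with Not_found -&gt; [] in
      let order = List.fold_left (visit (file :: path)) order deps in
      file :: order
  in
  List.rev (List.fold_left (fun order (file, _) -&gt; visit [] order file) [] needs)

(* Prints customers.csv, categories.csv, products.csv, orders.csv. *)
let () = List.iter print_endline (import_order needs)</pre>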
<p><strong>Nested scopes</strong> are the typical extension to instruction lists: every item in a list can be either an instruction, or another list, possibly tagged with some data. This could be a conditional (if this condition is true, execute this list), a loop (though it is best to avoid these) or a context (a &#8220;polygon&#8221; scope contains &#8220;insert vertex&#8221; operations that apply to that polygon). You can even allow variables in a let-in fashion (of which the polygon example above is just a special case) ! Note that nested scopes can be easily represented as XML.</p>
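<p>In OCaml, instruction lists and nested scopes map naturally onto a variant type. The sketch below uses an invented shape-import bytecode (the constructor names are assumptions) and flattens a nested program back into a plain instruction list:</p>
<pre style="padding-left: 30px;">(* Hypothetical bytecode for a shape importer. *)
type instr =
  | Insert_vertex of float * float     (* atomic instruction *)
  | Polygon of string * instr list     (* context scope, tagged with a name *)
  | If of bool * instr list            (* conditional scope *)

(* Flatten a nested program into a plain list of (polygon, x, y) insertions.
   [context] is the name of the enclosing polygon. *)
let rec flatten context program =
  List.concat
    (List.map
       (function
         | Insert_vertex (x, y) -&gt; [ (context, x, y) ]
         | Polygon (name, body) -&gt; flatten name body
         | If (condition, body) -&gt;
             if condition then flatten context body else [])
       program)

let triangle =
  [ Polygon ("triangle",
      [ Insert_vertex (0., 0.);
        Insert_vertex (1., 0.);
        If (false, [ Insert_vertex (9., 9.) ]);
        Insert_vertex (0., 1.) ]) ]</pre>
<p>Here <code>flatten "none" triangle</code> yields three insertions, all tagged with <code>"triangle"</code>; the vertex guarded by the false condition is dropped.</p>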
<h4>2. Static Analysis</h4>
<p>A side-effect of compiling to bytecode is that you get to process the entire file before you actually perform the intended operations. This makes a rollback easier if you notice that there&#8217;s an error on the last line of the file: if you make sure that no atomic operation in your target language can fail due to bad input (such as incorrect data values), then you can check your input data for correctness without doing anything to your program state.</p>
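<p>As a sketch of that two-phase approach in OCaml: the input format (lines of the form key=count), the <code>Set_count</code> instruction and the hash table standing in for program state are all assumptions made for the example.</p>
<pre style="padding-left: 30px;">type instr = Set_count of string * int

(* Compile one line, or explain why it is invalid. Touches no program state. *)
let compile_line number line =
  match String.index_opt line '=' with
  | None -&gt; Error (Printf.sprintf "line %d: missing separator" number)
  | Some i -&gt;
      let key = String.sub line 0 i in
      let value = String.sub line (i + 1) (String.length line - i - 1) in
      (match int_of_string_opt value with
       | Some count when key &lt;&gt; "" -&gt; Ok (Set_count (key, count))
       | _ -&gt; Error (Printf.sprintf "line %d: bad key or count" number))

(* Compile the whole input first, stopping at the first error. *)
let compile lines =
  let rec loop number acc = function
    | [] -&gt; Ok (List.rev acc)
    | line :: rest -&gt;
        (match compile_line number line with
         | Ok instr -&gt; loop (number + 1) (instr :: acc) rest
         | Error msg -&gt; Error msg)
  in
  loop 1 [] lines

(* Execution only starts once every line has compiled, and the
   instructions themselves cannot fail on bad input. *)
let import table lines =
  match compile lines with
  | Error msg -&gt; Error msg
  | Ok program -&gt;
      List.iter
        (fun (Set_count (key, count)) -&gt; Hashtbl.replace table key count)
        program;
      Ok (List.length program)</pre>
<p>An error on the last line leaves <code>table</code> exactly as it was, so there is nothing to roll back.</p>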
<p>Even better, if your compilation process is cheap (linearly traverse a file for parsing) and you have heuristics for predicting how much time and resources your individual instructions require, then you can try to accurately predict the needs of the entire process.</p>
<p>Static analysis also means you can optimize. If, for instance, you&#8217;re inserting data into a database and need to resolve names or keys frequently (such as &#8220;add this item to list #732&#8221;), you can easily construct a table of needed keys (that you can get in one query when the processing starts) using the dependency graph approach. You can also optimize resource allocation by using common register allocation techniques: sort your dependency graph to keep as few resources in memory as possible at any given time.</p>
<h4>3. Caching</h4>
<p>Try to perform most of the processing offline.</p>
<p>For instance, if you frequently &#8220;apply&#8221; one file to another, such as a nearly-constant &#8220;list of categories&#8221; file used to resolve the &#8220;category&#8221; key in a daily object import, you can benefit from compiling the nearly-constant file to an easily loaded, easily applied format.</p>
<p>You see a cached dictionary that maps keys to categories? I see a DSL that allows dictionary literals as part of the language, and a source file that contains a literal mapping keys to categories, with an interpreter that can apply constant propagation to dictionaries.</p>
<p>Another benefit is when applying changes to mission-critical software. Inserting lots of data into a web database can create a heavy load on the server and make the site unavailable to visitors. It might therefore be preferable to pre-compile the imported data into requests through a process that keeps a light load on the server, then run the requests.</p>
<p>Besides, with proper nested scoping, you can slice an import into several transactions. This keeps the lock count low, allows spreading the transactions over time to reduce the load, and lets you resume the import process if, for some reason, it gets interrupted.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.nicollet.net/2010/01/interesting-rates/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Information Flow</title>
		<link>http://www.nicollet.net/2009/08/information-flow/</link>
		<comments>http://www.nicollet.net/2009/08/information-flow/#comments</comments>
		<pubDate>Mon, 17 Aug 2009 04:32:24 +0000</pubDate>
		<dc:creator>Victor Nicollet</dc:creator>
				<category><![CDATA[Strategy]]></category>
		<category><![CDATA[Design Patterns]]></category>
		<category><![CDATA[Documentation]]></category>
		<category><![CDATA[Learning]]></category>
		<category><![CDATA[Psychology]]></category>

		<guid isPermaLink="false">http://www.nicollet.net/?p=1130</guid>
		<description><![CDATA[The real world is a complex place. When writing software that has to interact with the real world, there are literally thousands of concepts you have to master and tens of thousands of details you have to be aware of, or you will paint yourself into a corner where your software clashes with reality. And [...]]]></description>
			<content:encoded><![CDATA[<p>The real world is a complex place. When writing software that has to interact with the real world, there are literally thousands of concepts you have to master and tens of thousands of details you have to be aware of, or you will paint yourself into a corner where your software clashes with reality. <em>And reality always wins</em>.</p>
<p>Understanding concepts and details is a fundamental part of a project&#8217;s time budget, whether they come from the project requirements, real-world constraints, third party code or teammates. Every time information goes around in a project, it uses up valuable time, and to keep the time budget tight it becomes necessary to decide <em>what</em> information should be allowed to go around, and <em>where</em>.</p>
<p>Working on concurrent systems is an enlightening experience, because of the many similarities between an array of computers and a team of information workers. Computer arrays have latency issues when one thread depends on another thread to be done&#8230;</p>
<blockquote><p>“When do you think your settings import module will be done? I&#8217;m stuck on the payment API until I can load those settings!”</p></blockquote>
<p>&#8230;they have bandwidth issues and manipulating some data yourself is usually faster than sending the data to another part of the cluster for treatment&#8230;</p>
<blockquote><p>“The User object? Well, it&#8217;s a bit of a weird design, but it&#8217;s rather clever. I&#8217;ll draw you a quick UML sketch on the blackboard so you can see what the five helper classes do.”</p></blockquote>
<p>&#8230;they have to avoid data loss if a computer or network is down&#8230;</p>
<blockquote><p>“I have no idea how this stored procedure works, you should ask Tim, he&#8217;s the one who wrote it. He&#8217;s in southern France right now but I think he&#8217;ll be back next month.”</p></blockquote>
<p>&#8230; and they have to handle a directory of parts and a garbage collector for data&#8230;</p>
<blockquote><p>“Wait, nobody&#8217;s written the comment moderation back-office! Who was in charge of doing it? Who wrote the comments front-end anyway?”</p></blockquote>
<p>There are algorithms, strategies and techniques for handling and optimizing those things. Many of these can be adapted to humans, with the added benefit that, humans being smart, they can understand the point of those algorithms and compensate for minor flaws if the plan isn&#8217;t perfect.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.nicollet.net/2009/08/information-flow/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Destroy After Use</title>
		<link>http://www.nicollet.net/2008/12/destroy-after-use/</link>
		<comments>http://www.nicollet.net/2008/12/destroy-after-use/#comments</comments>
		<pubDate>Mon, 22 Dec 2008 23:00:13 +0000</pubDate>
		<dc:creator>Victor Nicollet</dc:creator>
				<category><![CDATA[Functional]]></category>
		<category><![CDATA[Design Patterns]]></category>
		<category><![CDATA[Objective Caml]]></category>

		<guid isPermaLink="false">http://r17474.ovh.net/?p=117</guid>
		<description><![CDATA[Exceptions make the control flow of a program quite complex, since any call could possibly create an exception and thus leave the currently executing function. Even with a garbage collector, certain resources (such as files) have to be manually released. Some languages use destructors and RAII (for instance, C++) to handle scope-based release, others use [...]]]></description>
			<content:encoded><![CDATA[<p>Exceptions make the control flow of a program quite complex, since any call could possibly raise an exception and thus leave the currently executing function. Even with a garbage collector, certain resources (such as files) have to be manually released. Some languages use destructors and RAII (for instance, C++) to handle scope-based release, others use an explicit <span style="font-family: courier new,courier;">using(){}</span> or <span style="font-family: courier new,courier;">finally {}</span> block to attach a scope to such resources.</p>
<p>None of this exists in OCaml.</p>
<p>It can be rather easily reconstructed using existing language semantics, however:</p>
<blockquote>
<pre style="background: #ffffff none repeat scroll 0% 50%; color: #000000;"><span style="color: #000084; font-weight: bold;">let</span> scoped user resource clean =
  (* Run the user function; on failure, clean up once and re-raise. *)
  <span style="color: #000084; font-weight: bold;">let</span> result =
    <span style="color: #000084; font-weight: bold;">try</span> user resource
    <span style="color: #000084; font-weight: bold;">with</span> exn -&gt;
      clean resource ;
      <span style="color: #000084; font-weight: bold;">raise</span> exn
  <span style="color: #000084; font-weight: bold;">in</span>
  (* On success, clean up outside the handler, so that an exception
     raised by clean itself does not trigger a second cleanup. *)
  clean resource ;
  result

<span style="color: #000084; font-weight: bold;">let</span> with_input name f =
  scoped f (open_in_bin name) close_in 

<span style="color: #000084; font-weight: bold;">let</span> with_output name f =
  scoped f (open_out_bin name) close_out</pre>
</blockquote>
]]></content:encoded>
			<wfw:commentRss>http://www.nicollet.net/2008/12/destroy-after-use/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The Missing Link</title>
		<link>http://www.nicollet.net/2008/12/the-missing-link/</link>
		<comments>http://www.nicollet.net/2008/12/the-missing-link/#comments</comments>
		<pubDate>Thu, 04 Dec 2008 23:00:06 +0000</pubDate>
		<dc:creator>Victor Nicollet</dc:creator>
				<category><![CDATA[Imperative]]></category>
		<category><![CDATA[Design Patterns]]></category>

		<guid isPermaLink="false">http://r17474.ovh.net/?p=63</guid>
		<description><![CDATA[In paleontology, a missing link is a transitional fossil that fills the evolutionary gap between two recognized species, both in time and in phenotype. Quite frequently in the course of software development, certain parts of a program are deemed global at some point in time: there never ought to be a need for more than [...]]]></description>
			<content:encoded><![CDATA[<p>In paleontology, a missing link is a <a href="http://en.wikipedia.org/wiki/Transitional_fossil" target="_blank">transitional fossil</a> that fills the evolutionary gap between two recognized species, both in time and in phenotype.</p>
<p>Quite frequently in the course of software development, certain parts of a program are deemed global at some point in time: there never ought to be a need for more than one instance, nor the need for polymorphic handling of a similar yet distinct part. And sometimes, but not always, this initial impression is proven wrong, and the need for more than one instance arises. The consequences of such an error are usually quite extreme: a one-of-a-kind global entity tends to be referenced by name, requiring a refactoring effort to convert it to a parameter everywhere it is needed. And so, developers are given the choice between going with the initial impression (with the risk of having to convert it) or ignoring the initial impression (and spending time on something they might never need).</p>
<p>I propose here a missing link, a third alternative which implements a one-instance form with minimal effort, yet can be easily refactored into a several-instance form if the need arises.</p>
<p>Of course, this applies only in situations where a doubt exists. If it is obvious right away that a given entity will be a several-instance entity, then it should be written as such from the beginning. This missing link should be considered only when the current state of the design (especially in Agile methodologies) requires no more than a single instance.</p>
<p>But first, some background information.</p>
<h2>Procedural and Objective</h2>
<p>A while ago, I wrote a short <a href="http://nicollet.net/blog/cpp/32-interfaces" target="_blank">article</a> that discussed how procedural programming approached the issue of global state. The basic conclusions were that by grouping global state into modules, the number of possible interactions decreased, and those that occurred were made obvious by the structure of the code. As such, a procedural program is a set of stateful entities that interact with each other along a dependency graph.</p>
<p>So is an object-oriented program: stateful objects reference each other and communicate through message-passing. At runtime, there is therefore no major difference in this respect between an object-oriented program and a non-objective procedural program. The main difference is elsewhere. Objects, unlike modules, are rarely defined in the code: they are usually created dynamically at runtime. So, while the two dependency graphs may look the same, the object-oriented runtime dependency graph is a purely dynamic construct that can be altered at will, whereas the procedural runtime dependency graph is a static construct that is tightly mapped to the structure of the code.</p>
<p><img class="alignright" style="border: 0pt none; margin: 5px;" src="/images/the-missing-link.jpg" border="0" alt="Keep It Simple, Bridge" width="425" height="640" align="right" />The extensibility of object-oriented programming comes from the dynamic aspects of its dependency graph: if objects can be created, replaced and removed at will, even by the program itself, then the program will have a flexibility and adaptability that exceeds the static module skeleton of a traditional procedural program. On the other hand, the static module skeleton is easier to design and implement than a fully flexible object architecture where elements can be hotplugged—even with language support for object-oriented programming.</p>
<p><em>When you have to build a bridge, you don&#8217;t build a fully automated moving bridge just for the fun of it: if the simplest thing that could possibly work is a plain old suspended bridge with no moving parts, then this is what you do. </em></p>
<p>Remember that we&#8217;re considering here the case of a one-instance object being converted to a multiple-instance object even though there is no apparent need for it yet. Why is that more difficult than using the one instance directly?</p>
<ul>
<li>Polymorphism requires abstraction, and abstraction usually comes from having several instances—then you can identify what&#8217;s general and what&#8217;s specific. Trying to abstract a concept from a single instance will miss some general aspects that could have been useful, and keep some specific aspects that will limit extensibility.</li>
<li>It&#8217;s easier to program something which makes sense. If having multiple instances does not make sense yet, then the programmer cannot rely on his instinct or his intuition to determine whether what he&#8217;s writing works or not, and the single-instance assumption might fight its way stealthily into the program anyway. Testing a multi-instance system when you only have one instance is quite difficult, too.</li>
<li>When using multiple instances, one has to consider what scope they should exist in, and how they are propagated down. Of course, there&#8217;s the code required to have the multiple instances trickle down to where they are used. But the real difficulty is deciding what the correct scope is—and since there isn&#8217;t a reason yet to consider multiple non-global instances, it&#8217;s quite probable that such a decision will turn out to be wrong in the end anyway.</li>
<li>Handling a flexible design is harder, mentally, than handling a fixed design. In a static module skeleton, the dependencies are clear and, when they are not, can be extracted from the code very easily. In a dynamic dependency graph, one has to think about how that graph will be constructed at runtime before they know whether a certain piece of code can affect a certain concept (especially if that concept is still a one-instance global).</li>
</ul>
<p>As such, delaying the setup of a flexible and dynamic dependency graph as long as possible reduces the amount and difficulty of development efforts. The downside, of course, is the refactoring effort required if the initial assumption is proven wrong. A solution to this conundrum would keep the <em>conceptual </em>simplicity of a single-instance static skeleton, while having a <em>code </em>layout that can be easily changed to a multi-instance dynamic graph as soon as the reasons for such a change are discovered.</p>
<h2>Singletons?</h2>
<p>While often used to implement single-instance cases, the <a href="http://nicollet.net/blog/cpp/58-singletons" target="_blank">Singleton design pattern</a> does not solve this issue: it is a pattern for representing the single instance, and the problem here is related to how the rest of the program refers to the instance. If all your code uses a certain instance by name, it does not matter whether that instance is a global variable or the instance of a singleton class: that code cannot use a second instance until it uses that instance by reference, and that reference is a parameter of some kind.</p>
<p>So, the choice of a singleton (or any other approach) to represent the instance is completely independent of any solution that could be found to this problem.</p>
<h2>Bottom-Up Construction</h2>
<p>One approach that applies in many cases and avoids single instances is bottom-up development. Since the system is being built bottom-up, most of the time when a module would need to access a single instance, that single instance has not been written or even designed yet. In that situation, the correct approach is to create an interface representing what the module expects, and have a parameter somewhere to receive an instance implementing that interface. Then, when the single instance is finally designed, it can be made to implement that interface (or <a href="http://www.dofactory.com/Patterns/PatternAdapter.aspx" target="_blank">adapt</a> to it) and then passed as an argument to the relevant elements.</p>
<p>Bottom-up construction eliminates many of the problems related to handling single instances. That is, by refusing to assume anything about the object being used beyond its interface (including, of course, how many instances exist that implement that interface) development automatically deviates from the single-instance approach and can handle multiple instances automatically.</p>
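<p>As a minimal sketch of that idea, consider a module that needs some settings before anything providing them has been designed. All the names below (<span style="font-family: courier new,courier;">ISettings</span>, <span style="font-family: courier new,courier;">PaymentApi</span>, <span style="font-family: courier new,courier;">FixedSettings</span>) are hypothetical and exist only for illustration:</p>
<blockquote>
<pre style="background: #ffffff none repeat scroll 0% 50%; color: #000000;">// Hypothetical interface: what the module expects, and nothing more.
public interface ISettings
{
  string Get(string key);
}

// The module is written first and receives its dependency as a parameter.
public class PaymentApi
{
  private readonly ISettings settings;
  public PaymentApi(ISettings settings) { this.settings = settings; }
  public string Currency() { return this.settings.Get("currency"); }
}

// Written later: a concrete instance that implements the interface.
public class FixedSettings : ISettings
{
  public string Get(string key) { return key == "currency" ? "EUR" : ""; }
}</pre>
</blockquote>
<p>Since <span style="font-family: courier new,courier;">PaymentApi</span> only knows about the interface, nothing in it assumes how many implementations exist or how many instances are alive.</p>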
<p>This does not solve everything, however. Bottom-up construction is notoriously ill-suited to fast development cycles, because the actual functionality of the program appears only when the top-level objects have been developed, which is at the end of the cycle. By contrast, with a top-down approach, the top-level object can be developed first and filled with mocks, and the mocks replaced with the actual functionality on the fly. Besides, bottom-up construction is also prone to violating <a href="http://c2.com/cgi/wiki?YouArentGonnaNeedIt" target="_blank">YAGNI</a>, because objects are created before their users exist (and those users may end up never existing at all).</p>
<h2>The Missing Link</h2>
<p>The crux of the problem here is that directly converting a single-instance approach to a multiple-instance approach requires a lot of argument-passing: since the instances are manipulated through references, and those references are parameters, a lot of parameters are required to pass the instance from its point of creation (quite probably somewhere near the entry point) down to every part of the program that uses it. A possible solution would be to do as much of that work as possible (replacing as many of the global accesses as possible with references) without having to heavily increase the number of parameters.</p>
<p>Aside from global symbols, one other kind of value is available in functions without being passed as a parameter: <em>member variables</em>.</p>
<blockquote>
<h2>The Screen Design Pattern</h2>
<p><strong>The problem:</strong> methods of a class need to access a global variable directly. Making that variable non-global is unnecessary right now, but might be necessary in the future.</p>
<p><strong>The solution:</strong> have the class keep a <em>screen</em> member reference to the global variable. That reference is initialized in the constructor and used everywhere else. Member functions other than the constructor are never allowed to reference global variables.</p></blockquote>
<p>What would be an example of refactoring using this pattern? Consider the original C# code:</p>
<blockquote>
<pre style="background: #ffffff none repeat scroll 0% 50%; color: #000000;"><span style="color: #000084; font-weight: bold;">public</span> <span style="color: #000084; font-weight: bold;">class</span> GlobalUser
{
  <span style="color: #000084; font-weight: bold;">public</span> GlobalUser() {}
  <span style="color: #000084; font-weight: bold;">public</span> <span style="color: #000084; font-weight: bold;">void</span> MethodA() { Global.Instance.Use(); }
  <span style="color: #000084; font-weight: bold;">public</span> <span style="color: #000084; font-weight: bold;">void</span> MethodB() { Global.Instance.Use(); }
}</pre>
</blockquote>
<p>Methods in this code use a global instance. Let&#8217;s replace that with a member reference to the global instance:</p>
<blockquote>
<pre style="background: #ffffff none repeat scroll 0% 50%; color: #000000;"><span style="color: #000084; font-weight: bold;">public</span> <span style="color: #000084; font-weight: bold;">class</span> GlobalUser
{
  <span style="color: #000084; font-weight: bold;">private</span> Global screen;
  <span style="color: #000084; font-weight: bold;">public</span> GlobalUser() { <span style="color: #000084; font-weight: bold;">this</span>.screen = Global.Instance; }
  <span style="color: #000084; font-weight: bold;">public</span> <span style="color: #000084; font-weight: bold;">void</span> MethodA() { <span style="color: #000084; font-weight: bold;">this</span>.screen.Use(); }
  <span style="color: #000084; font-weight: bold;">public</span> <span style="color: #000084; font-weight: bold;">void</span> MethodB() { <span style="color: #000084; font-weight: bold;">this</span>.screen.Use(); }
}</pre>
</blockquote>
<p>Using the pattern, the refactored code can be written straight away (instead of writing the original code and then refactoring it) without requiring mental effort, which means that using the pattern bears almost no cost (it only requires an additional member variable and an additional assignment in the constructor). And if the screened variable suddenly has to become non-global, only the constructor will be affected, saving precious time that would have been lost refactoring the methods as well.</p>
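<p>To make that last point concrete, here is a sketch of what the class might look like once the screened variable stops being global. The <span style="font-family: courier new,courier;">Global</span> class shown here is a hypothetical stand-in (its <span style="font-family: courier new,courier;">Uses</span> counter exists only so that the effect can be observed); the only change to <span style="font-family: courier new,courier;">GlobalUser</span> is in the constructor:</p>
<blockquote>
<pre style="background: #ffffff none repeat scroll 0% 50%; color: #000000;">// Hypothetical stand-in for the formerly global class.
public class Global
{
  public int Uses { get; private set; }
  public void Use() { Uses++; }
}

public class GlobalUser
{
  private readonly Global screen;
  // The only change: the instance now arrives as a parameter.
  public GlobalUser(Global screen) { this.screen = screen; }
  // The methods are untouched.
  public void MethodA() { this.screen.Use(); }
  public void MethodB() { this.screen.Use(); }
}</pre>
</blockquote>
<p>Two <span style="font-family: courier new,courier;">GlobalUser</span> objects can now be handed two different <span style="font-family: courier new,courier;">Global</span> instances, and the remaining refactoring effort is limited to the code that constructs them.</p>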
<p>Note that creating a property which returns the global variable (instead of a member variable assigned in the constructor) is equivalent in terms of writing the initial refactored code (the <span style="font-family: courier new,courier;">this.screen = Global.Instance</span> is replaced with <span style="font-family: courier new,courier;">{ get { return Global.Instance; } }</span>), but does not have the same benefits when refactoring to remove the global instance because a connection will have to be made between the value returned by the property and the parameter received in the constructor, which requires more code. I therefore suggest using a member variable assigned in the constructor instead of a property.</p>
<p>Also note that the above can be seen as the preliminary step of a <a href="http://en.wikipedia.org/wiki/Dependency_injection" target="_blank">dependency injection</a> and follows the same general idea as preparing your application for <a href="http://www.springframework.net/doc-latest/reference/html/quickstarts.html" target="_blank">Spring.NET</a> integration (for example). The difference is that the pattern does so at the class scale, and emphasizes development speed (using the pattern should be no slower than using the non-pattern approach) rather than external interoperability and flexibility.</p>
<h2>Memory issues</h2>
<p>Of course, storing a screen reference in every object might not be welcome. Some very small objects, such as vectors or polygons, cannot afford to store an additional reference because of the time and space overhead this would imply.</p>
<p>My suggestion in these situations is as follows: <em>eliminate the global dependency altogether</em>. Resist the temptation to have a certain small object <em>frobnicate </em>itself using a global instance, and instead have a larger object <em>frobnicate </em>the small objects. This makes the optimization more transparent (the small objects are now just bits of data without any kind of polymorphic behavior that are externally manipulated) and allows using the Screen design pattern on the larger object.</p>
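<p>A rough sketch of this arrangement follows. Everything in it is an assumption made for illustration: the <span style="font-family: courier new,courier;">Vertex</span> and <span style="font-family: courier new,courier;">Mesh</span> names, the <span style="font-family: courier new,courier;">Scale</span> field on the global instance, and the choice of scaling as the frobnication being performed:</p>
<blockquote>
<pre style="background: #ffffff none repeat scroll 0% 50%; color: #000000;">// Hypothetical stand-in for the global instance.
public class Global
{
  public static readonly Global Instance = new Global();
  public double Scale = 2.0;
}

// A small data-only object: no screen reference, no behavior.
public struct Vertex
{
  public double X;
  public double Y;
}

// The larger object holds the one screen reference for all its vertices.
public class Mesh
{
  private readonly Global screen;
  private readonly Vertex[] vertices;

  public Mesh(Vertex[] vertices)
  {
    this.screen = Global.Instance;
    this.vertices = vertices;
  }

  // Frobnicate every vertex from the outside (here: scale it in place).
  public void Frobnicate()
  {
    for (int i = 0; i &lt; this.vertices.Length; i++)
    {
      this.vertices[i].X *= this.screen.Scale;
      this.vertices[i].Y *= this.screen.Scale;
    }
  }
}</pre>
</blockquote>
<p>The per-vertex cost of the screen reference drops to zero, and only the <span style="font-family: courier new,courier;">Mesh</span> constructor would need to change if the instance became non-global.</p>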
]]></content:encoded>
			<wfw:commentRss>http://www.nicollet.net/2008/12/the-missing-link/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The Window Tax</title>
		<link>http://www.nicollet.net/2008/11/the-window-tax/</link>
		<comments>http://www.nicollet.net/2008/11/the-window-tax/#comments</comments>
		<pubDate>Thu, 27 Nov 2008 23:00:22 +0000</pubDate>
		<dc:creator>Victor Nicollet</dc:creator>
				<category><![CDATA[Imperative]]></category>
		<category><![CDATA[Design Patterns]]></category>

		<guid isPermaLink="false">http://r17474.ovh.net/?p=61</guid>
		<description><![CDATA[In 1798, the French Directoire instituted a new tax, called &#8220;la taxe sur les portes et fenêtres&#8221; (quite literally, the tax on doors and windows). Back then, the young republic did not yet have the means to closely examine the income and possessions of all taxpayers in order to determine how much each of them [...]]]></description>
			<content:encoded><![CDATA[<p>In 1798, the French <em>Directoire </em>instituted a new tax, called &#8220;<em>la taxe sur les portes et fenêtres</em>&#8221; (quite literally, the tax on doors and windows). Back then, the young republic did not yet have the means to closely examine the income and possessions of all taxpayers in order to determine how much each of them should pay, yet it wanted those who could afford to pay more to actually pay more. Doors and windows were an obvious sign of wealth that could be easily evaluated by the tax collector without having to enter the house, and since the taxpayers often did not have access to estate ownership titles, having the tax paid by the occupant helped things significantly.</p>
<p><img class="alignright" style="border: 0pt none; margin: 5px;" title="Neon sign tax?" src="/images/the-window-tax.jpg" border="0" alt="Neon sign tax?" width="425" height="640" align="right" />Of course, soon after the tax appeared, people reduced the number of windows and doors, sometimes going as far as walling up every door but one and bricking up all the windows.</p>
<p>A similar thing happened in France in the 1990s: until then, insurance for work accidents was provided equally to all companies, and all companies paid the same flat rate based on the national accident rate. In an effort to make large companies pay for improved safety measures (because they had the funds for it), the insurance rate became deregulated for companies above a certain size, so that a large unsafe company paid more than if it was safe. Small companies still paid a national rate.</p>
<p>Contrary to expectations, most large companies spent no money on safety and instead outsourced dangerous jobs to small companies, thereby improving their own rates. And since small companies often had worse safety than large companies, the overall workplace safety slightly decreased as a consequence of this deregulation.</p>
<p>The same happens in the computer world. When your log software advertises a thousand visitors per day, it simply hides the fact that HTTP has no notion of visitor:</p>
<ul>
<li>A given person may use several computers and IP addresses to visit the same site.</li>
<li>A given person may use a proxy cache to read the website without having to connect to it.</li>
<li>The same IP address may be shared by several people (by ISP shuffling, on public computers and on LANs).</li>
</ul>
<p>Similarly, when your download client displays a download speed, it&#8217;s quite probably an average over the last few seconds, meaning that if you are suddenly disconnected from the network, the download speed will remain above zero for a short while.</p>
<p>Of course, every user of the HTTP protocol knows, deep down, that HTTP has no notion of visitor and therefore a visitor is just a set of requests from the same IP over a given duration, just like every user of a download client can understand that download speed is an approximation and isn&#8217;t really above zero when there is no network connection.</p>
<p>Yet, it&#8217;s quite easy to forget.</p>
<p>An example upon which I&#8217;ve stumbled recently was a subscription web site. Paying users were allowed access to some functionalities of the website, so there was a test performed on each of these pages to see if the user account was associated with valid payment information before displaying the page. Nothing too fancy, a mere <span style="font-family: courier new,courier;">if(user.Subscribed)</span> test that didn&#8217;t deserve factoring out. Then, one day, marketing suggested adding a temporary test account that could be used for a limited duration to allow using those functionalities.</p>
<p>The obvious problem is that the previous assumption &#8220;only paying users can read this page&#8221; that was spread all over the system became incorrect. Cheating by changing the definition of <span style="font-family: courier new,courier;">user.Subscribed</span> was impossible, because subscribed users have valid payment information but test accounts don&#8217;t. So, a new test had to be added to every page: <span style="font-family: courier new,courier;">if(user.Subscribed || user.TestAccount)</span>. Over a few months, the test became increasingly large as new properties were added for access.</p>
<p>Luckily enough, a sane developer came around and refactored the test into an <span style="font-family: courier new,courier;">if(Functionality.IsAllowed(user))</span> call, which allowed changing the access restrictions for these pages in a single place, and developers on that team learned not to scatter assumptions throughout the code, not even in extremely simple tests.</p>
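<p>Such a function might look like the following sketch. The <span style="font-family: courier new,courier;">User</span> class and its two properties are assumptions based on the tests quoted above, not the actual code of that web site:</p>
<blockquote>
<pre style="background: #ffffff none repeat scroll 0% 50%; color: #000000;">// Hypothetical user class exposing the two properties from the tests above.
public class User
{
  public bool Subscribed { get; set; }
  public bool TestAccount { get; set; }
}

public static class Functionality
{
  // The access rule lives here and nowhere else: when marketing changes
  // the rule, this is the only function that has to change.
  public static bool IsAllowed(User user)
  {
    return user.Subscribed || user.TestAccount;
  }
}</pre>
</blockquote>
<p>Each page then performs the same <span style="font-family: courier new,courier;">if(Functionality.IsAllowed(user))</span> test, whatever the rule happens to be this month.</p>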
<h2>The Window Tax Antipattern</h2>
<p>The Window Tax Antipattern can appear when:</p>
<ul>
<li>A given module provides a piece of information, <strong>Provided</strong>.</li>
<li>A rule (that is neither a consequence of the definition of that module nor an absolute truth) describes how another piece of information, <strong>Deduced</strong>, can be deduced from <strong>Provided</strong>.</li>
<li>Other modules use <strong>Deduced</strong> by manually deducing it from <strong>Provided</strong>.</li>
</ul>
<p>In such a situation, if the rule changes, the deduction becomes incorrect and every module which uses the deduced information will have to be changed as well. As an additional issue, since the deduction is not explicit, finding every module that makes that deduction is often difficult. The solution is to identify the rule, and wrap it in a function that constructs Deduced from Provided (thereby allowing easier refactoring should the rule change).</p>
<p>In our above examples:</p>
<ul>
<li>The provided information is the number of doors and windows, and the rule states that wealth can be deduced from that number.</li>
<li>The provided information is the accident rate, and the rule states that the safety of workers can be deduced from that rate.</li>
<li>The provided information is whether the user has a paying account, and the rule states that the page can be viewed only by users with paying accounts.</li>
</ul>
<p>In some cases, the antipattern does not apply. For instance, if the rule is a consequence of the definition of the original module, then the rule cannot change unless the underlying module also changes (which is an acceptable reason for refactoring). So, if the rule is &#8220;when you push an element onto a stack, the stack is not empty until you pop it&#8221;, then assuming that a stack onto which you just pushed an element is not empty is not a problem (until, of course, either you or another part of the program pops that element).</p>
<p>However, every single business rule (those that can be changed by management, marketing, clients or even laws) is potentially subject to this antipattern unless cleanly wrapped.</p>
<p>In an environment that is heavy on business rules, the Adapter pattern becomes of particular interest. Since every object usually cares about information that is not directly contained in another object, business rules crystallize as adapters around objects, providing the information that the current object requires (for instance, &#8220;can this user view this page?&#8221;) based on the information that the other object and its environment provide (for instance, &#8220;is this a test or paying account?&#8221;). View with suspicion every direct interaction between two business objects, and never hesitate to insert adapters or explicit rules whenever the used information is not a direct, obvious and inevitable consequence of the provided information.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.nicollet.net/2008/11/the-window-tax/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

