<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Nicollet.Net &#187; Imperative</title>
	<atom:link href="http://www.nicollet.net/chiasma/imperative/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.nicollet.net</link>
	<description>Everyone Loves Me</description>
	<lastBuildDate>Mon, 23 Jan 2012 16:55:59 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=</generator>
		<item>
		<title>Having a Strong Opinion</title>
		<link>http://www.nicollet.net/2011/09/having-a-strong-opinion/</link>
		<comments>http://www.nicollet.net/2011/09/having-a-strong-opinion/#comments</comments>
		<pubDate>Thu, 01 Sep 2011 12:14:21 +0000</pubDate>
		<dc:creator>Victor Nicollet</dc:creator>
				<category><![CDATA[Imperative]]></category>
		<category><![CDATA[Architecture]]></category>
		<category><![CDATA[Psychology]]></category>
		<category><![CDATA[Strategy]]></category>

		<guid isPermaLink="false">http://www.nicollet.net/?p=2534</guid>
		<description><![CDATA[Many blogs about technical hiring will at one point state something about buzzwords and programmer flexibility. One of the original trendsetters, Joel Spolsky, said: The recruiters-who-use-grep, by the way, are ridiculed here, and for good reason. I have never met anyone who can do Scheme, Haskell, and C pointers who can&#8217;t pick up Java in [...]]]></description>
			<content:encoded><![CDATA[<p><img class="aligncenter size-full wp-image-2535" title="sunset" src="http://www.nicollet.net/wp-content/uploads/2011/09/sunset.png" alt="" width="675" height="100" /></p>
<p>Many blogs about technical hiring will at one point state something about buzzwords and programmer flexibility. One of the original trendsetters, <a href="http://www.joelonsoftware.com/articles/ThePerilsofJavaSchools.html" target="_blank">Joel Spolsky</a>, said:</p>
<blockquote><p>The recruiters-who-use-grep, by the way, are ridiculed here, and for  good reason. I have never met anyone who can do Scheme, Haskell, and C  pointers who can&#8217;t pick up Java in two days, and create better Java code  than people with five years of experience in Java, but try explaining  that to the average HR drone.</p></blockquote>
<p>And this is not only a point about elite languages like Scheme-Haskell-C versus mundane languages like Java-PHP-whatever: flexibility, the ability to switch languages and to adapt to new interfaces and libraries, is almost always presented as a prerequisite to being competent. John can perform miracles with PHP but cannot easily learn Ruby? Then John is not a competent programmer; he is just a competent <em>PHP</em> programmer.</p>
<p>Maybe there is some truth to this characterization. Maybe there is indeed something about good programmers that lets them shine in a language-independent way, with languages as mere details of their day-to-day miracles. But I am vaguely uncomfortable with that notion. And not for personal reasons — my current language of choice is one of those elite functional languages that would hypothetically place me at the apex of the competence food chain.</p>
<p>I believe the critical element of programming competence is not <em>ability</em> but <em>passion</em>. What makes you a good programmer is how much you care about software development. Does John have a nine-to-five PHP programming job and hardly touch the computer outside of work, or does he do small projects on the side, or contribute to Open Source PHP software, or answer technical PHP questions on Stack Overflow, or perform any other number of PHP-related activities that do not have professional rewards as their main objective? Does he unconsciously try to <em>do the right thing</em> in his code, even though it will be harder than writing a dirty hack to make his boss happy?</p>
<p>I have seen people, many of them high-ranking academics, with the intellectual firepower to outgun me in any programming-related endeavor, but with a striking lack of passion that left their applications crippled, hideous and unreliable. And I have no doubt that, had they cared about those things, they could have done better.</p>
<p>I have seen people, many of them students, with a genuine passion for software development, who would spend their free time hacking together video games or dynamic websites or clever hacks, who would notice after a while that their abilities were stagnating and, unable to improve, would give up programming rather than live with the frustration of writing software unworthy of their expectations.</p>
<p>And when you care about programming, you tend to have strong opinions about how it should be done.</p>
<p>Some of these opinions are trivial. My hair stands on end whenever I have to read badly formatted code (I don&#8217;t care about the opening-brace-position flame wars; any convention is fine by me as long as it is consistently followed), and the authors often wonder why I would care about such a silly thing. I have a strong opinion about what code should look like, and I dislike working with people who do not share that opinion.</p>
<p>Yes, I am one of those Scheme-Haskell-C elite programmers, and I can pick up Java in a few days and outperform experienced Java-only programmers. I have done it several times in the past. And every single time I did so, I felt dirty and miserable, because Java goes against several of my opinions about what software development should be like.</p>
<p>In fact, I am not really surprised about the popular success of Python and Ruby on Rails, measured not by how many projects are written but by how outspoken the technical advocates are. This is because those two have something that appeals to people who can become passionate about them: a clean core philosophy you can agree or disagree with.</p>
<p>Python zealots flock around the <a href="http://www.python.org/dev/peps/pep-0020/" target="_blank">Zen of Python</a>:</p>
<blockquote><p>Beautiful is better than ugly.<br />
Explicit is better than implicit.<br />
Simple is better than complex.<br />
Complex is better than complicated.<br />
Flat is better than nested.<br />
Sparse is better than dense.<br />
Readability counts.<br />
Special cases aren&#8217;t special enough to break the rules.<br />
Although practicality beats purity.<br />
Errors should never pass silently.<br />
Unless explicitly silenced.<br />
In the face of ambiguity, refuse the temptation to guess.<br />
There should be one&#8211; and preferably only one &#8211;obvious way to do it.<br />
Although that way may not be obvious at first unless you&#8217;re Dutch.<br />
Now is better than never.<br />
Although never is often better than *right* now.<br />
If the implementation is hard to explain, it&#8217;s a bad idea.<br />
If the implementation is easy to explain, it may be a good idea.<br />
Namespaces are one honking great idea &#8212; let&#8217;s do more of those!</p></blockquote>
<p>Ruby on Rails fanboys have a similar set of core beliefs, the <a href="http://guides.rubyonrails.org/getting_started.html#what-is-rails" target="_blank">Rails Way</a>:</p>
<blockquote><p>DRY – “Don’t Repeat Yourself” – suggests that writing the same code over and over again is a bad thing.<br />
Convention Over Configuration – means that Rails makes assumptions about what you want to do and how you’re going to do it, rather than requiring you to specify every little thing through endless configuration files.<br />
REST is the best pattern for web applications – organizing your application around resources and standard HTTP verbs is the fastest way to go.</p></blockquote>
<p>So, if you happen to wholeheartedly agree with the Ruby on Rails way, then by using it you are certain to find both a technical environment in which you can feel happy, and a community that shares your strong opinions about software development. Is it any wonder, then, that people <em>passionate</em> about the RoR values would flock to RoR and, inevitably, start advocating its use?</p>
<p>Going a little bit further, if you are hiring for your software company, would you rather hire someone with weak opinions on most topics because they are «flexible» or someone with strong opinions that match the strong opinions of your company? Given the choice, I would certainly hire the latter.</p>
<p>I have my own «core philosophy» that I apply to the way I write my own code. These would be, in order of decreasing importance:</p>
<ol>
<li>It is better to <strong>have a correct program with few features</strong>, than a buggy program with many features.<br />
<small>If possible, take the time to design your code and your interface so that errors cannot happen. If not, explicitly detect and display all errors as they happen. If possible, have a programming language and a programming style that can eliminate by design many errors, rather than a programming language or programming style that improves productivity at the cost of having more errors.<br />
</small></li>
<li>It is better to <strong>prove the correctness of a program</strong>, than to test for the existence of bugs.<br />
<small>Tests cannot prove that the software is correct; they can only prove the existence of bugs. A proven program contains no bugs, so there is no need to worry about having enough code coverage or enough test cases. This is a special case of &#8220;fail early&#8221;: better to fail at the compilation stage than to fail during tests or at runtime.</small></li>
<li>It is better to <strong>accept that code will have to be rewritten</strong>, than to future-proof a complex design.<br />
<small>Future-proof code will likely be larger, and contain more untested pieces, than normal code. This increases the probability of bugs, without eliminating the possibility of a completely unforeseen design change that still involves a rewrite. Preparing your code for a rewrite, by splitting it up into clean independent self-documenting modules and creating automated correctness checks for these, is the best way to make it flexible.<br />
</small></li>
<li>It is better to <strong>enforce data constraints through types</strong>, than to enforce them through code.<br />
<small>Attempting to store data that violates the constraints fails earlier if the type cannot represent that data, especially in a statically typed language. Doing things this way might take longer than just keeping a flexible data type and performing the constraint checks in the code, but the odds of it being correct are higher.<br />
</small></li>
<li>It is better to <strong>have the computer do work for you</strong>, than for you to do that work yourself.<br />
<small>Why write trivial unit tests when you can harness the type system to perform those checks? Why define or configure things by hand when your framework could define or configure them for you?</small></li>
<li>It is better to <strong>rewrite your code using new concepts</strong>, than to insist on using existing but ill-adapted concepts.<br />
<small>Concepts improve productivity and readability, and by design will prevent some kinds of incorrect usage, but only as long as they match what the software is expected to be doing. Otherwise, at best they will be a useless weight and at worst will have to be tediously worked around to achieve anything. The size of the refactoring is no obstacle: if half the application needs to be adapted to the new concept, then so be it.<br />
</small></li>
<li>It is better to <strong>repeat yourself from time to time</strong>, than to introduce too many concepts.<br />
<small>Any repetition can be eliminated by adding a new abstraction through refactoring. That abstraction is usually a mere application of an existing pattern or concept, but might sometimes give flesh to a new concept. While that concept arguably already existed in the non-refactored code, it is easier to understand uncommon concepts by looking at their repeated code, than to give them a sufficiently understandable name.</small></li>
</ol>
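<p>As a minimal sketch of the fourth point, here is one way a type can make invalid data unrepresentable. The <code>non_empty</code> type and its helper functions are hypothetical names chosen for illustration, not part of any library: a list that must never be empty is stored as a head plus a tail, so the &#8220;is it empty?&#8221; check happens once, where the data enters the program, instead of at every use.</p>
<pre style="padding-left: 30px;">(* A list that cannot be empty: the constraint lives in the type itself. *)
type 'a non_empty = { head : 'a; tail : 'a list }

(* Total function: no exception and no option, because "empty" is unrepresentable. *)
let first ne = ne.head

let to_list ne = ne.head :: ne.tail

(* The check happens once, at the boundary where unconstrained data comes in. *)
let of_list = function
  | [] -&gt; None
  | head :: tail -&gt; Some { head; tail }</pre>
<p>With a plain list, every caller of a &#8220;first element&#8221; function would have to handle the empty case; here the compiler guarantees that case cannot occur.</p>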
<p>I don&#8217;t know. Maybe someone might agree with me one day.<br />
<small>Article image © Timo Newton-Syms — <a href="http://www.flickr.com/photos/timo_w2s/6021716943/in/photostream/">Flickr</a></small></p>
]]></content:encoded>
			<wfw:commentRss>http://www.nicollet.net/2011/09/having-a-strong-opinion/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Get Rid of Constraints</title>
		<link>http://www.nicollet.net/2011/06/get-rid-of-constraints/</link>
		<comments>http://www.nicollet.net/2011/06/get-rid-of-constraints/#comments</comments>
		<pubDate>Wed, 22 Jun 2011 12:34:37 +0000</pubDate>
		<dc:creator>Victor Nicollet</dc:creator>
				<category><![CDATA[Imperative]]></category>
		<category><![CDATA[Architecture]]></category>

		<guid isPermaLink="false">http://www.nicollet.net/?p=2416</guid>
		<description><![CDATA[We add constraints (UNIQUE, FOREIGN KEY&#8230;) to our databases in order to achieve a tradeoff between how easy it is to write data to the database, and how easy it is to read data back. Constraints make writes harder because the application must ensure that those constraints are verified, and react appropriately when they are [...]]]></description>
			<content:encoded><![CDATA[<p><img class="aligncenter size-full wp-image-2417" title="chains" src="http://www.nicollet.net/wp-content/uploads/2011/06/chains.png" alt="" width="675" height="100" /></p>
<p>We add constraints (UNIQUE, FOREIGN KEY&#8230;) to our databases in order to achieve a tradeoff between how easy it is to write data to the database, and how easy it is to read data back. Constraints make writes harder because the application must ensure that those constraints are verified, and react appropriately when they are not. They make reads easier because they allow the application to make assumptions about the data it receives.</p>
<p>Sometimes, the benefits to data reading are not sufficient to offset the increased difficulty of writing it. Let me provide an example.</p>
<p>In a project management tool, users on the free plan are only allowed to have one project. Premium users can have any number of projects. An obvious solution would be to constrain the database so that a given user can only have one free project at once. This would guarantee that any attempt to connect a second free project to a user would fail.</p>
<p>However, the possibility of non-technical failure of &#8220;connect a project to a user&#8221; will have to be handled in a variety of unusual locations:</p>
<ul>
<li>Deleting project A, creating project B, and cancelling the deletion of project A would cause an error.</li>
<li>Merging two free user accounts would be interrupted because both already have a free project.</li>
<li>Downgrading from premium to free would break if the premium account had more than one project.</li>
</ul>
<p>The good news is that your database will stop you from doing anything that would break the constraint, so you do not have to worry about a glitch letting free users have two projects. The bad news is that you now need to handle that failure specifically in three different locations in your interface: cancelling a deletion, merging two accounts, and an account downgrade that might happen when the user isn&#8217;t even connected. It&#8217;s possible to handle all three, but it&#8217;s quite repetitive, and the user interface for handling these is probably going to be different in each case.</p>
<h4><em>A Posteriori</em> Constraints</h4>
<p>A better solution would be to allow users to have any number of projects <em>in the database</em>, but detect free users that have more than one project (because of cancelling a deletion, merging two accounts, or not renewing their premium subscription) and force them to take corrective action through the user interface. For instance, users with inconsistent accounts are greeted with a list of their projects and the possibility to keep one of them unlocked. The other projects are then locked and can only be unlocked by deleting the currently unlocked project or paying for a premium subscription again. You can even provide some leeway by letting them keep full access to all projects for a week before forcing locks on them, or let customer support temporarily unlock projects in cases of dire need.</p>
<p>The beautiful aspect of this is that the code required to solve the inconsistency is not repeated. Regardless of how you end up with more than one free project, the feature that corrects this is always the same. This means the application is smaller, which leads to fewer bugs and a more consistent user interface.</p>
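<p>To make this concrete, here is a minimal sketch of the detect-and-correct approach. Every type and function name below is hypothetical (none of it is taken from a real code base), and the data lives in plain lists rather than in a database: one function detects free users holding more than one unlocked project, and one function applies the correction the user picked.</p>
<pre style="padding-left: 30px;">(* Hypothetical types, for illustration only. *)
type plan = Free | Premium
type project = { id : int; owner : int; locked : bool }
type user = { uid : int; plan : plan }

(* How many unlocked projects a free user may keep (assumed to be one). *)
let free_limit = 1

(* A single check, whatever caused the inconsistency:
   cancelled deletion, account merge or downgrade. *)
let needs_correction projects user =
  match user.plan with
  | Premium -&gt; false
  | Free -&gt;
    let unlocked =
      List.filter (fun p -&gt; p.owner = user.uid &amp;&amp; not p.locked) projects in
    List.length unlocked &gt; free_limit

(* Corrective action: keep one project unlocked, lock the user's other projects. *)
let keep_only ~keep user projects =
  List.map
    (fun p -&gt; if p.owner = user.uid then { p with locked = p.id &lt;&gt; keep } else p)
    projects</pre>
<p>The interface would call <code>needs_correction</code> when the user logs in, and <code>keep_only</code> once the user has chosen which project stays unlocked.</p>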
<p>Of course, if non-subscribing users already have a project and try to create another, they are told to subscribe. Lack of constraints in the database does not mean there are no constraints or clever up-sells in the interface. While RunOrg does prevent users from exceeding the storage ceiling by uploading new documents, all other ways of breaking past the ceiling (such as un-deleting documents or lowering the ceiling) remain unaffected, and our administrator team only takes corrective action when there is a clear abuse of the software — you can stay a few megabytes <em>above</em> the ceiling for as long as you wish. As for the limit on how many users can join a community, there are even fewer restrictions — it would be quite harsh to prevent the 2011-2012 members from joining only because the 2010-2011 members have not left yet — so we ask our customers to return below those ceilings as fast as possible, and only do it ourselves if there is abuse.</p>
<p>There is a conceptual shift here from <strong>No Inconsistent Data In The Database</strong> to <strong>Detect And Correct Inconsistent Data</strong>. This step can be a bit difficult to stomach, especially if you are used to letting your database keep your data clean for you, but for many constraints the shift is actually worth it.</p>
<p>In the early design phase, we moved RunOrg from a standard MySQL architecture over to CouchDB, a NoSQL solution that provides absolutely no constraints beyond having a unique primary key. There are currently no foreign key constraints in the system, nor are there any unique constraints beyond enforcing many-to-many relationships (and these are fundamentally primary keys). This seemed like a huge challenge at first, but the system has grown to support complex data structures and business rules that go beyond standard document-based CRUD. Simply accepting that pieces of data <em>can</em> be missing or duplicated, and applying an <em>a posteriori</em> solution (possibly involving user intervention), has worked for us so far.</p>
<p>Of course, what is true for document databases is not necessarily true in the SQL world. While we impose no constraints on the relationships between documents, we are pretty strict about the structure of documents themselves (especially since they are being read from OCaml, a strict, statically typed language), and there is no simple way of representing &#8220;Foo holds a list of Bars&#8221; in SQL without a foreign key (in CouchDB, it&#8217;s standard JSON&#8230;).</p>
<p>This is the important information here: those are <em>structural</em> constraints. They define a data structure in the database that must be respected by your <em>implementation</em>, as opposed to <em>non-structural</em> constraints that merely define relationships between higher-level entities and must be respected by your <em>users</em>. Identify the latter, and consider whether you can shift them to <em>a posteriori</em> constraints instead.</p>
<p><small>Article Image © Calsidyrose — <a href="http://www.flickr.com/photos/calsidyrose/5836796139/">Flickr</a></small></p>
]]></content:encoded>
			<wfw:commentRss>http://www.nicollet.net/2011/06/get-rid-of-constraints/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Rewrite Your Code</title>
		<link>http://www.nicollet.net/2011/05/rewrite-your-code/</link>
		<comments>http://www.nicollet.net/2011/05/rewrite-your-code/#comments</comments>
		<pubDate>Tue, 24 May 2011 08:32:23 +0000</pubDate>
		<dc:creator>Victor Nicollet</dc:creator>
				<category><![CDATA[Imperative]]></category>
		<category><![CDATA[Agile]]></category>
		<category><![CDATA[Architecture]]></category>
		<category><![CDATA[Productivity]]></category>
		<category><![CDATA[Strategy]]></category>

		<guid isPermaLink="false">http://www.nicollet.net/?p=2386</guid>
		<description><![CDATA[Writing code relies on four kinds of decisions: What algorithm can implement this feature? How is that algorithm best written in that specific language? What platform quirks and subtle edge cases must be accounted for? How does this code fit in with the rest of the application? Regardless of team experience or preliminary analysis, some [...]]]></description>
			<content:encoded><![CDATA[<p style="text-align: center;"><img class="size-full wp-image-2389 aligncenter" title="backhoe" src="http://www.nicollet.net/wp-content/uploads/2011/05/backhoe.png" alt="" width="650" height="100" /></p>
<p>Writing code relies on four kinds of decisions:</p>
<ul>
<li>What algorithm can implement this feature?</li>
<li>How is that algorithm best written in that specific language?</li>
<li>What platform quirks and subtle edge cases must be accounted for?</li>
<li>How does this code fit in with the rest of the application?</li>
</ul>
<p>Regardless of team experience or preliminary analysis, some of these decisions will be incorrect. Maybe the algorithm failed to take into account the unusual distribution of real-world data; maybe there was a better way to write it; maybe there&#8217;s a subtle bug that will not be discovered for weeks; maybe a possible code reuse was not identified during the design phase&#8230; or maybe the customer requirements that the feature was based on did not actually match the customer&#8217;s needs.</p>
<p>Such bad decisions get in the way of users, but they also hinder developers, who have to regularly work around existing bad decisions, which in turn causes more bad decisions to be made in recurrent &#8220;lesser of two evils&#8221; situations.</p>
<p>It is a good idea to go back on your bad decisions and make new ones instead. They will not necessarily be good, but at least they will address some of the problems with the old ones.</p>
<p><strong>Don&#8217;t try to go back on everything at once</strong>. Most of the time, the shortcomings of a decision can be identified in hindsight, but if you change too many things at once, that hindsight will be lost. In particular, throwing away non-trivial portions of code (anything beyond a single function) in order to rewrite them from scratch is quite risky, especially since it might also discard good decisions that would be hard to retrieve.</p>
<p><strong>Don&#8217;t make your code difficult to change</strong>. Going back on your decisions will involve rewriting code. Lots of it. So far, most of the code in the RunOrg project has been rewritten at least three times. Make sure your language, frameworks, libraries and unit tests all work together to make it easy to evolve specific parts of your code when you change a decision. The worst situation for a project to be in is <em>code freeze</em>: changing code is forbidden because it&#8217;s too risky and it might break something. If you suspect that your project might be heading that way, immediately drop everything you are doing and bring your project back to an acceptable state; if you are not allowed to do so, make sure you send out a warning to anyone who might need to know.</p>
<p><strong>Don&#8217;t make too many decisions</strong>. This is usually spelled out as YAGNI: You Ain&#8217;t Gonna Need It. If there is currently no need for a given feature, other than the fact that it should remain possible in the future, then don&#8217;t implement it. Implementing it will involve making many decisions about how it should happen, and lack of practical application will increase the odds that those decisions are wrong.</p>
<p><strong>Don&#8217;t be afraid to go back on huge decisions</strong>. Weeks ago, an initial decision we made on the RunOrg project turned out to have huge performance implications. I was faced with two choices: keep that decision, and manually optimize the locations where the performance suffered the most (this involved manually handling caching and batches); or go back on that decision, re-architect the entire database access system and propagate those changes throughout literally half the project, in order to allow automatic caching and batch construction in ways that manual optimization could never allow. I chose the rewrite: it took me four days, with some aftershocks being felt several days afterwards (strangely enough, changing 20k lines of code resulted in only four fairly obvious bugs).</p>
<p>What does your decision-making process or pipeline look like? What does your <em>decision postmortem and reversal process</em> look like? How often do you go back on your decisions?</p>
<p><small>Article image © Barb Crawford – <a href="http://www.flickr.com/photos/barbcrawford/2602483620/" target="_blank">Flickr</a></small></p>
]]></content:encoded>
			<wfw:commentRss>http://www.nicollet.net/2011/05/rewrite-your-code/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>On NoSQL Design Tips</title>
		<link>http://www.nicollet.net/2011/04/on-nosql-design-tips/</link>
		<comments>http://www.nicollet.net/2011/04/on-nosql-design-tips/#comments</comments>
		<pubDate>Fri, 29 Apr 2011 09:55:46 +0000</pubDate>
		<dc:creator>Victor Nicollet</dc:creator>
				<category><![CDATA[Imperative]]></category>
		<category><![CDATA[Architecture]]></category>
		<category><![CDATA[CouchDB]]></category>
		<category><![CDATA[Lokad]]></category>
		<category><![CDATA[NoSQL]]></category>
		<category><![CDATA[RunOrg]]></category>

		<guid isPermaLink="false">http://www.nicollet.net/?p=2359</guid>
		<description><![CDATA[Cloud computing expert Joannes Vermorel posted a few design tips for your NoSQL app two weeks ago. As a short anecdote, Joannes is the man who (through a rather unintended and surprising chain of events) prompted us to move from SQL to CouchDB six months ago on the RunOrg project. So, here are my very [...]]]></description>
			<content:encoded><![CDATA[<p>Cloud computing expert Joannes Vermorel posted <a href="http://vermorel.com/journal/2011/4/5/a-few-design-tips-for-your-nosql-app.html" target="_blank">a few design tips for your NoSQL app</a> two weeks ago. As a short anecdote, Joannes is the man who (through a rather unintended and surprising chain of events) prompted us to move from SQL to CouchDB six months ago on the RunOrg project. So, here are my very own thoughts overlaid on top of Joannes&#8217; five-point article:</p>
<p><strong>You need an O/C (Object-to-Cloud) mapper</strong>: In my experience, almost all read-only code in the application involves some form of object graph traversal: select a root object, extract a list of related objects, filter the list based on a predicate&#8230; Most of the traditional Object-Relational impedance mismatch is a consequence of SQL&#8217;s insistence on performing most of that traversal on the database server using joins in a conceptually parallel environment (every non-discarded row is going to be treated the same by the join), whereas object-oriented strategies dictate that each object should be able to react differently based on its type. So, SQL-and-Objects leaves you with a sprinkle of duplicate traversal logic across tiers, and some low-performance application-layer graph traversal where polymorphism requires it. And by low-performance, I mean two hundred one-item requests in a row:</p>
<pre style="padding-left: 30px;">// Each getOwner() call triggers its own one-item database request.
for (Document document : documents)
  owners.add(document.getOwner());</pre>
<p>And there&#8217;s no simple way of dealing with this, short of dictating that documents don&#8217;t have owners, they have owner identifiers that can then be used to retrieve the actual owners from the database (which may be the case for persistent documents, but is a leaky abstraction that prevents the rest of your application from creating temporary in-memory documents). With a few exceptions, in NoSQL, there are no joins. On the one hand, this reduces the expressiveness of the data storage layer; on the other hand, it has the ironic consequence of making NoSQL better suited to application-driven graph traversal. Adding insult to injury, most key-value stores implement bulk queries which, when appropriately leveraged by the mapper, transparently improve performance. <strong>Breathe</strong> (the CouchDB mapper we use) does this:</p>
<pre style="padding-left: 30px;">let owners = documents |&gt; Breathe.batch (fun doc -&gt; doc # get_owner)</pre>
<p>Since <strong>Breathe</strong> is purely functional and based on monads, any requests performed by the many <code>doc # get_owner</code> calls can be merged into a single CouchDB request using the bulk API.</p>
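<p>The internals of <strong>Breathe</strong> are not shown here, so the following is only a sketch of the underlying idea using the OCaml standard library (4.08 or later), with a hypothetical in-memory <code>fetch_bulk</code> standing in for a bulk database request. Instead of issuing one request per document, collect the keys, deduplicate them, issue a single request, and then map the results back onto the original list.</p>
<pre style="padding-left: 30px;">(* Hypothetical in-memory stand-in for the database; counts round trips. *)
let requests = ref 0
let table = [ (1, "alice"); (2, "bob") ]

(* One round trip per call, however many keys are requested. *)
let fetch_bulk ids =
  incr requests;
  List.filter (fun (id, _) -&gt; List.mem id ids) table

(* Naive traversal: one request per document. *)
let owners_one_by_one owner_ids =
  List.filter_map (fun id -&gt; List.assoc_opt id (fetch_bulk [ id ])) owner_ids

(* Batched traversal: deduplicate the keys, issue a single request,
   then look each owner up in the result. *)
let owners_batched owner_ids =
  let found = fetch_bulk (List.sort_uniq compare owner_ids) in
  List.filter_map (fun id -&gt; List.assoc_opt id found) owner_ids</pre>
<p>Both functions return the same owners in the same order; only the number of round trips differs.</p>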
<p><img class="alignright size-full wp-image-2229" style="margin-left: 10px;" title="logo" src="http://www.nicollet.net/wp-content/uploads/2011/02/logo.png" alt="" width="175" height="150" />Another reason why mappers are necessary is that unlike SQL, which works on an extremely simple &#8220;send query, get results&#8221; basis, most NoSQL solutions involve complex idioms using basic primitives. For instance, when dealing with CouchDB, you cannot press the &#8220;use transactions&#8221; button and expect it to work. CouchDB transactions involve posting a document, receiving a collision notification, downloading the new version of the document, <em>applying the transformation again</em>, posting the document again, and so on until you either run out of retries or no collision happens. And collision detection involves keeping a revision number around and passing it to the API, so your code also has to deal with whether there&#8217;s a cached in-memory copy of the object that you can use to grab the revision or if you will have to actually query the database. The public API provides nothing to relieve this pain, so it is up to the mapping system to do this:</p>
<pre style="padding-left: 30px;">let publish id =
  let publish article = { article with published = true } in
  Article.DB.transaction id (Article.DB.update publish)</pre>
<p>The mapper will grab the initial article value from the database if it was not already in the cache, try the update up to 20 times if collisions happen, and place the latest version of the article in the cache. Using raw API calls, this would take ten lines and probably contain many bugs. And did I mention that the function above can use the bulk transaction API if placed in a loop?</p>
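<p>For readers who have never written such a retry loop, here is a rough sketch of the idiom. It runs against a hypothetical in-memory store that mimics revision checking; <code>get</code>, <code>put</code> and <code>update</code> are illustrative names, not the CouchDB API or the mapper&#8217;s actual implementation, and caching is left out.</p>
<pre style="padding-left: 30px;">type error = Missing | Conflict

(* Hypothetical in-memory store: each key maps to (revision, value). *)
let store : (string, int * string) Hashtbl.t = Hashtbl.create 16

let get key = Hashtbl.find_opt store key

(* Accepts the write only if the caller holds the current revision. *)
let put key rev value =
  match Hashtbl.find_opt store key with
  | Some (current, _) when current &lt;&gt; rev -&gt; Error Conflict
  | _ -&gt; Hashtbl.replace store key (rev + 1, value); Ok (rev + 1)

(* Read, transform, write; on a collision, reload the document
   and apply the transformation again until retries run out. *)
let rec update ?(retries = 20) key f =
  match get key with
  | None -&gt; Error Missing
  | Some (rev, value) -&gt;
    match put key rev (f value) with
    | Ok rev' -&gt; Ok rev'
    | Error Conflict when retries &gt; 0 -&gt; update ~retries:(retries - 1) key f
    | Error e -&gt; Error e</pre>
<p>Note that the transformation <code>f</code> is applied to the freshly loaded value on every attempt, which is why it should be a pure function of the document.</p>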
<p><strong>Performance is obtained mostly by design</strong>: which I would rephrase as &#8220;your O/C mapper abstracts operations, not architecture&#8221;. In short, there&#8217;s no way for your application to shield itself from the architectural consequences of NoSQL storage. For instance, CouchDB just <em>doesn&#8217;t</em> do multi-object transactions, so this is something that the application will have to take into account: most operations have to be idempotent so they can be safely retried, and most cross-object invariants are expected to be <em>eventually</em> consistent.</p>
<p>And, indeed, you cannot write an application and then leave it to the database administrators to create an index wherever necessary. The good news is that most NoSQL solutions, by their simple design, make it hard to do any inefficient operations (it takes one line to do a full table scan in SQL, but it&#8217;s a lot harder to do with NoSQL, and a lot more obvious when it happens — why am I querying <code>_all_docs</code> again?)</p>
<p>The 20 updates/second limit should be taken with a grain of salt. NoSQL performance is certainly predictable, but sweeping generalizations never apply and there are many bottlenecks that you can hit in any particular order — application-database bandwidth, database-disk bandwidth, database processor usage&#8230; right now, CouchDB handles 130 updates/second (an update is a GET-PUT cycle), with the bottleneck being that the updates are synchronous on the application side, so it doesn&#8217;t send as many requests as the database could possibly handle (if you run two application instances, you can get 260 updates/second). This is a fairly healthy bottleneck to have, because it means adding more application instances solves it, and this just happens to be our scaling strategy <img src='http://www.nicollet.net/wp-includes/images/smilies/icon_wink.gif' alt=';-)' class='wp-smiley' /> </p>
<p><strong>Go for a contract-based serializer</strong> : automatic serialization is great for one-shot messages that run between instances of the application, but a pain as soon as you need to persist them and, consequently, have the storage format evolve along with the application.</p>
<p>I have no opinion on the XML-vs-JSON debate, though I use JSON myself because it fits with both the database and client layers of my project.</p>
<p>In general, I find the use of NoSQL to be no excuse for making a mess in the database. Sure, you&#8217;re not constrained by a schema anymore, but data in the database is what you are going to manipulate ! If there isn&#8217;t a clean description of what you can find in the database, that data might as well be lost, because you will not be using it. Using application-wide conventions on data storage and representation helps developer A manipulate data generated by developer B without requiring a meeting to get everyone on the same page. Storing data in an opaque blob representation is fine for short-lived situations, obscure and rare edge cases (such as when the Lokad.Cloud queue offloads large objects into blob storage because it <em>has</em> to put them somewhere), but most of your data should have a clear documented schema. Your database does not need it, but your team does.</p>
<p><strong>Entity isolation is easiest path to versioning</strong> : You Ain&#8217;t Gonna Need It (yes, I disagree with Joannes on this one). Unless something went terribly wrong with your code base, it&#8217;s easy enough to duplicate your entity class when the two versioning paths actually start to diverge (and even then, you can certainly keep some common bits factored out nicely) instead of doing so from the very beginning.</p>
<p><strong>With proper design, aka CQRS, needs for SQL drop to near-zero</strong> : Well, let&#8217;s face it: SQL provides a toolset that goes beyond anything any NoSQL solution offers in terms of general flexibility and expressiveness (including the ability to simulate key-value stores using blobs, should the need arise). NoSQL solutions remain niche solutions to niche problems because this is the way they are designed, possibly out of a fear of inadequacy and the general message &#8220;NoSQL should be used where SQL is not good enough&#8221;: most NoSQL users still use an SQL database as the primary storage, and dump data into NoSQL for a small subset of queries that need the performance boost. Companies like Lokad or RunOrg, that embrace NoSQL-only architectures, are still rare <em>and still struggling to understand how, exactly, some things should be done in NoSQL that would take one line of SQL code</em>. You don&#8217;t see <em>Design Tips for SQL</em> articles being written in 2011 <img src='http://www.nicollet.net/wp-includes/images/smilies/icon_wink.gif' alt=';-)' class='wp-smiley' /> </p>
<p>Even with CQRS, requirement changes have a much greater impact on NoSQL application architecture than they would with SQL, because SQL mappers, despite all their flaws, still manage to provide a significant amount of shielding. What used to be a matter of adding a few tables and doing the right joins now turns into an uphill battle of propagating the appropriate de-normalized data into the right places to allow for an index to be built (which, by the way, leads me to believe that Aspect-Oriented Programming might be better suited to NoSQL than standard OOP), a battle that simply isn&#8217;t possible without a nearly extremist agile culture. Sure, with CQRS and event sourcing, this isn&#8217;t a <em>losing</em> uphill battle we&#8217;re fighting anymore, but it&#8217;s still uphill. I&#8217;m pretty sure the reason why Joannes does not see things this way is that Lokad <em>has</em> an extremist agile culture.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.nicollet.net/2011/04/on-nosql-design-tips/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Object-Oriented For the Win.</title>
		<link>http://www.nicollet.net/2011/04/object-oriented-for-the-win/</link>
		<comments>http://www.nicollet.net/2011/04/object-oriented-for-the-win/#comments</comments>
		<pubDate>Fri, 01 Apr 2011 21:26:03 +0000</pubDate>
		<dc:creator>Victor Nicollet</dc:creator>
				<category><![CDATA[Imperative]]></category>
		<category><![CDATA[Architecture]]></category>
		<category><![CDATA[Design Patterns]]></category>
		<category><![CDATA[Learning]]></category>

		<guid isPermaLink="false">http://www.nicollet.net/?p=2335</guid>
		<description><![CDATA[Readers, beware: this is an opinionated piece. Feel free to curse appropriately in the comments below, should you find the need to do so. I&#8217;m writing this because of yet another encounter with an object-oriented zombie, and one without the excuse of being fresh out of school. They all stand united behind the motto that [...]]]></description>
			<content:encoded><![CDATA[<p>Readers, beware: this is an opinionated piece. Feel free to curse appropriately in the comments below, should you find the need to do so. I&#8217;m writing this because of yet another encounter with an object-oriented zombie, and one without the excuse of being fresh out of school. They all stand united behind the motto that <strong>Object-oriented programming makes your code easy to reuse, debug, maintain and extend</strong> and I happen to disagree.</p>
<p>Something interesting happened in the Pacific Ocean after World War II: during the war, American soldiers had set up air strips on Pacific islands, complete with little control towers that had radio stations. And the natives observed that those soldiers would speak into a weird piece of metal, and a giant iron bird would land from the sky and deliver food and supplies. Then, the war ended, the soldiers went home, and the natives tried to bring the giant iron birds back by building shacks that looked like legitimate control towers, and spoke strange words into rocks that looked like legitimate microphones, and it worked!</p>
<p><a href="http://www.flickr.com/photos/frankjuarez/461208642/"><img class="alignright size-full wp-image-2337" style="margin-left: 15px;" title="flower-1" src="http://www.nicollet.net/wp-content/uploads/2011/04/flower-1.jpg" alt="" width="100" height="100" /></a>No, it didn&#8217;t. Cargo planes did not come back because it was not the shape of the microphone and antenna that brought in the planes, it was their invisible electromagnetic properties. These are the <em>cargo cults</em>: imitating the outside appearance of things that work and expecting the imitation to work as well. The same happens with object-oriented programming — the use of classes, or inheritance, or design patterns is not the reason why good code is good: they are merely tools which happened to be applied by knowledgeable and skilled programmers in order to avoid problems that they knew would happen if they did not preempt their appearance. To look at their code, find out that they used classes, and deduce that classes were responsible for their success is about as rational as expecting to steal the strength of your defeated foe when you eat his flesh. <strong>Object-oriented programming can only solve design problems you know you need to solve.</strong></p>
<p>Imagine a piece of old, spaghetti C code written in procedural style, with functions calling each other and accessing global variables all over the place. I suspect many young graduates these days never had to deal with these — and I wouldn&#8217;t have either, were it not for my unrequited love for mental anguish and programmer pain. Back to the point: in your mind, draw an arrow from A to B if function B calls function A or accesses global variable A. This is a dependency graph: if you were to change the run-time behavior of A, then the behavior of B would almost certainly change as well. By following all outgoing arrows from the point you altered, you can find every element that could possibly be affected by it.</p>
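<p>As a rough sketch of that mental exercise, the graph can be written down as a list of arrows and walked mechanically. The element names below are made up for illustration, and <code>affected</code> is a hypothetical helper, not part of any tool.</p>
<pre style="padding-left: 30px;">module S = Set.Make (String)

(* Each pair (a, b) is an arrow from a to b: b calls or reads a. *)
let arrows =
  [ ("tax_rate", "compute_total");
    ("round_amount", "compute_total");
    ("compute_total", "print_invoice");
    ("print_invoice", "main_menu") ]

(* Everything reachable from [changed] by following outgoing arrows. *)
let affected arrows changed =
  let rec visit seen node =
    List.fold_left
      (fun seen (source, target) ->
         if source = node then
           (if S.mem target seen then seen
            else visit (S.add target seen) target)
         else seen)
      seen arrows
  in
  S.elements (visit S.empty changed)</pre>
<p>With these arrows, changing <code>tax_rate</code> reaches <code>compute_total</code>, <code>print_invoice</code> and <code>main_menu</code>, while changing <code>main_menu</code> reaches nothing.</p>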
<p>No programming style or design methodology on earth is going to change this: whether object-oriented or aspect-oriented or functional, functions call other functions, and changing one part will have an impact on many other parts. That is something we accept, because propagating changes is part of the programming job, and we have even elevated refactoring (the propagation of the <em>absence</em> of changes) to the level of the best thing since sliced bread.</p>
<p><a href="http://www.flickr.com/photos/tanaka_juuyoh/2585559389/"><img class="alignleft size-full wp-image-2341" style="margin-right: 15px;" title="flower-2" src="http://www.nicollet.net/wp-content/uploads/2011/04/flower-2.jpg" alt="" width="180" height="240" /></a>And since we have to track down all the dependencies of a given entity sooner or later, our national sport is to make it as easy as possible. One way is to let the compiler or build process help — compilers detect type mismatches that are usually a hint of non-propagated major behavior changes, while automated unit tests and regression tests handle more subtle changes, and both of these are fairly independent of your programming style. Another way is to willingly reduce the number of dependencies through architectural constraints — Model-View-Controller prevents a model from being dependent on a controller, because altering a controller should not change the behavior of a model. These are fairly common ways to reduce static dependencies.</p>
<p>Then, there are the dynamic dependencies — functions which alter the behavior of other functions at <em>run-time</em>. Remember those global variables from the C program? If function A writes to the global variable and function B reads from the global variable, then calling A will probably change the behavior of B. Runtime dependencies are fairly obvious to map: every time a function writes to a global variable, draw an arrow from the function to the global variable and mark the function as having a side-effect. Then recursively do the same for every function that calls it, and so on until you run out of functions. Your dependency graph just became a lot hairier than it was before.</p>
<p>Such dependencies are the reason why working on the report-printing feature in your invoicing software somehow managed to break the database storage : even though both modules are independent and the static dependency tree for report-printing does not flow into the static dependency tree for database storage, there are a handful of global variables that allow your changes to follow dynamic dependencies and leave a flag unset when it should have been set and ultimately blow up the nuclear silo.</p>
<p>So, when you have global mutable state, then changes can jump from module to module through those pieces of global state. Procedural programming mostly dealt with this by cleanly wrapping up any global state in a module, and having modules interact with each other by means of interfaces that managed to guarantee some invariants on the global state. But still, some problems remained. I am reminded here of the code to Dungeon Crawl, an open-source game written in procedural style, where the module responsible for dealing damage to the player will make frequent calls to the module responsible for outputting messages about the damage being dealt (&#8220;The hobgoblin hits you!&#8221;). As written, you cannot use the damage-dealing code without displaying messages, so writing an AI module that tries to predict the outcome of an attack is harder than it seems, <em>because it cannot use the existing damage-dealing code to run predictions</em>.</p>
<div id="attachment_2343" class="wp-caption aligncenter" style="width: 360px"><a href="http://www.dungeoncrawl.org/"><img class="size-full wp-image-2343 " title="ss-dos-sm" src="http://www.nicollet.net/wp-content/uploads/2011/04/ss-dos-sm.png" alt="" width="350" height="250" /></a><p class="wp-caption-text">They actually have sludge elves.</p></div>
<p>Functional programming solved the global mutable state issue by eliminating global mutable state altogether. In the aforementioned example, the damage-dealing module would return the new damaged player along with a list of messages that were generated, and these messages could then be discarded silently in the AI subroutine, or forwarded to the screen in the main routine. I am a big fan of functional programming, but let&#8217;s face it: not only is it harder on the brain than good old set-variable-values-everywhere programming, but it is also poorly suited to situations which have an implicit reliance on global mutable state, such as web servers connected to a persistent database. I look forward to a database that is friendly to pure functional code, but it&#8217;s not there yet.</p>
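<p>A minimal sketch of that functional version could look like the following. The <code>player</code> type and the three functions are assumptions made up for the example; they have nothing to do with the actual Dungeon Crawl code.</p>
<pre style="padding-left: 30px;">type player = { name : string; hp : int }

(* Pure: returns the damaged player and the messages, prints nothing. *)
let deal_damage attacker amount player =
  let hp = max 0 (player.hp - amount) in
  let hit = Printf.sprintf "The %s hits you!" attacker in
  let messages = if hp = 0 then [ hit; "You die..." ] else [ hit ] in
  ({ player with hp }, messages)

(* AI prediction: keep the outcome, silently discard the messages. *)
let would_kill attacker amount player =
  let after, _messages = deal_damage attacker amount player in
  after.hp = 0

(* Main routine: keep the outcome, forward the messages to the screen. *)
let attack attacker amount player =
  let after, messages = deal_damage attacker amount player in
  List.iter print_endline messages;
  after</pre>
<p>The same <code>deal_damage</code> serves both callers; only the caller decides what happens to the messages.</p>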
<p>Both of these solutions rely on <em>encapsulation</em>: the implementation of a procedural or pure functional module is hidden, and only available through the functions of its interface. This creates a dependency bottleneck, so any changes performed within the implementation can only leak out through the interface in tightly controlled ways: at compile-time, those are called contracts, at run-time they are called invariants.</p>
<p>Object-oriented programming works by deprecating global mutable state. It&#8217;s not completely gone, and some programmers yearning for the good old days still try to enjoy some global goodness by using the Gang of Four Singleton pattern, but it&#8217;s mostly deprecated. The sickle and hammer of object-oriented programming are dependency injection and late binding, and it uses them to break down the structure of the static dependency graph bourgeoisie.</p>
<p>Yes, object-oriented programming is all about building the dependency graph at run-time. Then again, a program that generates machine code at run-time does the same. Object-oriented programming merely provides a set of tools that are easier on the human brain than the extremes of generating machine code at run-time. As an aside, keep in mind that it is not the only toolbox available to you: any language with closures can do the same, and although closures are both more elegant and slightly more complex to handle in the general case, they have found their way into event-based programming because an event handler is just, well, a closure. This is why jQuery passes functions around everywhere instead of requiring the user to implement an IAjaxResultVisitor interface: it&#8217;s shorter, but still fairly easy to understand in that context. If you push things far enough, most objects are usually nothing more than dictionaries of closures.</p>
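<p>To illustrate the &#8220;dictionary of closures&#8221; remark, here is a small sketch in which a record of closures plays the part of an object. The <code>counter</code> type, <code>make_counter</code> and <code>on_each</code> are hypothetical names chosen for the example.</p>
<pre style="padding-left: 30px;">(* A "counter object": a record of closures sharing hidden state. *)
type counter = { increment : unit -> unit; value : unit -> int }

let make_counter () =
  let count = ref 0 in
  { increment = (fun () -> incr count); value = (fun () -> !count) }

(* An event source that takes its handler as a plain closure,
   instead of requiring a visitor interface to be implemented. *)
let on_each items handler = List.iter handler items</pre>
<p>Calling <code>on_each [ "a"; "b"; "c" ] (fun _ -> c.increment ())</code> on a fresh counter <code>c</code> leaves <code>c.value ()</code> at 3, and nothing outside <code>make_counter</code> can touch <code>count</code> directly.</p>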
<p>Back to the point: writing a proper object-oriented program with those tools is about the same as creating a statue out of stone using sculpting tools, and inheritance will not turn someone into an object expert any more than holding a chisel turns them into a mutant ninja turtle disambiguation page. How does it happen?</p>
<p>The original problem with procedural programming was that there was no easy way to unplug the damage-dealing module from the message-printing module. Dependency injection means that the damage-dealing module (which is now an object) does not know about the message-printing module (which is now also an object). Instead, whoever needs to use damage-dealing code must provide it with a message-printing object that will be used to print whatever messages come up. That message-printing object might actually print those messages to the screen, or it might silently discard them, or it might keep them around and only display them to the screen if a specific event happens, or it might be a unit testing mock, or exhibit any number of other behaviors. In terms of dependency graphs, there is no static connection between damage-dealing and message-printing: the program decides at run-time what message-printing object should be bound to what damage-dealing object, and can create new objects on the fly should the situation require it.</p>
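<p>Here is a minimal sketch of that injection, using a record of closures where an object-oriented language would use an interface. All the names (<code>printer</code>, <code>screen_printer</code>, <code>silent_printer</code>, <code>recording_printer</code>, <code>deal_damage</code>) are assumptions invented for the example.</p>
<pre style="padding-left: 30px;">(* The message-printing dependency, supplied by the caller. *)
type printer = { print : string -> unit }

let screen_printer = { print = print_endline }
let silent_printer = { print = (fun _ -> ()) }

(* Keeps messages around instead of displaying them: usable as a
   unit testing mock. Returns the printer and a way to read the log. *)
let recording_printer () =
  let log = ref [] in
  ({ print = (fun message -> log := message :: !log) },
   fun () -> List.rev !log)

(* Damage-dealing knows nothing about how messages are displayed. *)
let deal_damage printer attacker amount hp =
  printer.print (Printf.sprintf "The %s hits you!" attacker);
  max 0 (hp - amount)</pre>
<p>The main routine would pass <code>screen_printer</code>, an AI prediction would pass <code>silent_printer</code>, and a test would pass the result of <code>recording_printer ()</code>; the choice is made at run-time by the caller.</p>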
<p>Aside from that, there are no significant architectural benefits to using objects or classes &mdash; although touted as an object-oriented achievement, <em>encapsulation</em> is commonly available in functional and procedural programming as well, and using <code>object.function()</code> instead of <code>function(object)</code> is a matter of taste, not architecture. </p>
<p>As a final word, I am fairly doubtful of the ability of schools to teach their students about good object-oriented programming. They can certainly teach them how to use the tools, but the pains that object-oriented programming is meant to solve only become obvious when you have to work with a large, multi-developer project over a long duration and with changing requirements &mdash; anything less, and your pains will be subtle feelings of awkwardness instead. Definitely not something you can learn from. <strong>If you have never written an unmaintainable piece of mud, you cannot know how object-oriented programming can keep you from writing one.</strong> </p>
<p>You may now dish out punishments in the comment box below. Have fun.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.nicollet.net/2011/04/object-oriented-for-the-win/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Verification Bandwidth</title>
		<link>http://www.nicollet.net/2011/02/verification-bandwidth/</link>
		<comments>http://www.nicollet.net/2011/02/verification-bandwidth/#comments</comments>
		<pubDate>Thu, 17 Feb 2011 16:05:35 +0000</pubDate>
		<dc:creator>Victor Nicollet</dc:creator>
				<category><![CDATA[Imperative]]></category>
		<category><![CDATA[Productivity]]></category>

		<guid isPermaLink="false">http://www.nicollet.net/?p=2267</guid>
		<description><![CDATA[We&#8217;re all in the software business, so you already know this to be true. Software doesn&#8217;t work. Think about it: you release your current project and then it&#8217;s 1° on time, 2° full-featured and 3° bug-free. Pick any two. There are many technical solutions for delivering software without bugs. Architectures. Design methodologies. Frameworks. Programming languages. [...]]]></description>
			<content:encoded><![CDATA[<p>We&#8217;re all in the software business, so you already know this to be true. <strong>Software doesn&#8217;t work</strong>. Think about it: you release your current project and then it&#8217;s 1° on time, 2° full-featured and 3° bug-free. Pick any two.</p>
<p>There are many technical solutions for delivering software without bugs. Architectures. Design methodologies. Frameworks. Programming languages. I was fairly convinced for a while that Objective Caml was the ultimate solution to software bugs. <strong>These solutions don&#8217;t work</strong>.</p>
<p>Let&#8217;s take a simple example. A web site. Users can send private messages to each other, and each message has a web address so I can send it to the user in an e-mail: «You have received a message, click here to view it». And then, a web developer writes the code for that address: grab the message identifier from the URL, ask the database for the message contents, mash them up with some HTML and send them to the viewer. That&#8217;s a bug right there. What about messages that don&#8217;t exist?</p>
<p>If you&#8217;re on your average web platform, the server will spit out an error message along the lines of «Silly human, this value is NULL» and it will obviously happen during your demo to your investors. Bad tech start-up, no funding cookie. The good news with Objective Caml is that values cannot be NULL: <a href="http://sadekdrobi.com/2008/12/22/null-references-the-billion-dollar-mistake/" target="_blank">nullable references are a billion dollar mistake</a> that the creators of ML wisely avoided. So, instead, you will get a compiler error about using a can-be-null type when a cannot-be-null type was expected. The buggy code will never reach production, the investors will not see anything out of the ordinary and you will get your funding.</p>
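<p>As a rough sketch of what the compiler enforces here, assume a hypothetical <code>message</code> record and a <code>find_message</code> lookup over an in-memory list standing in for the database (neither is taken from a real code base). The lookup returns an <code>option</code>, so the page code has to say what happens when there is no message.</p>
<pre style="padding-left: 30px;">type message = { id : int; recipient : string; body : string }

(* Hypothetical lookup: the result type says "there might be nothing". *)
let find_message messages id =
  List.find_opt (fun message -> message.id = id) messages

(* Writing  (find_message messages id).body  is rejected at compile
   time: a [message option] is not a [message]. *)

(* The compiler only accepts code that handles both cases. *)
let render messages id =
  match find_message messages id with
  | Some message -> "Message: " ^ message.body
  | None -> "This message does not exist."</pre>
<p>Leaving out the <code>None</code> branch would not be a type error, but the compiler would still warn that the pattern matching is not exhaustive.</p>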
<p>But there&#8217;s still a bug. At no point did the code check that the viewer was indeed allowed to view that private message. That&#8217;s not a bug Objective Caml or any other automated tool on earth can detect for you. Unless you explain it to them, but if you forget to check this, you will probably also forget to teach the tool to detect that you forgot to check this. Foiled by Occam&#8217;s Razor once more. This is a human problem : if no human in the entire development process thought «<em>we really need to make sure private messages really are private</em>» then you can be certain that no automated tool will think of that for them. Until a mischievous user finds out and you get sued.</p>
<p>Our limited mental capabilities mean that every single human project since we started sharpening sticks and throwing rocks at each other follows the exact same structure: baby steps are interspersed with verification sessions that help keep the entire project on course. To fall back on a classic analogy, how do we build bridges? The architect comes up with a general plan: it&#8217;s going to be this kind of bridge, going from here to there, built using these materials. Then, all kinds of verifications happen: technical (are these materials going to hold?), functional (is it wide enough for a two-lane road?), organizational (can we fund this?). The plans are adjusted to take any new elements into account, and the cycle continues until the bridge is done. And then it turns out no one thought about oscillation frequencies in bridges and you get the Tacoma Narrows Bridge bug.</p>
<p>Any project is going to evolve under the effect of two distinct forces: implementation and verification. Both ingredients are necessary for success. Not enough implementation — not enough code, not enough plans, not enough features — is usually quite obvious: just count the features and you know you&#8217;re not done yet. Not enough verification, on the other hand, is a lot harder to detect because by definition you would need more of it to find out that you need more. As a project lead, this is an extremely important metric to manage: <strong>verification bandwidth</strong> — the amount of pre-implementation constraints and post-implementation feedback that is collected and applied to the project — will make the difference between a quality product and a dud.</p>
<p>And we already have a lot of tools for doing just that. Specifications aim to crystallize a lot of pre-implementation requirements into document form, which makes it easier to apply to the project than if they were just random comments floating around in collective memory or e-mails. Well-written annotations in a specification document can be a gold mine during implementation. And when bugs are detected after the implementation, bug tracking tools help bridge the gap between testers and implementers.</p>
<p>But there&#8217;s more to this than just good specifications and good bug tracking.</p>
<p>Agile folks recognize that while pre-implementation requirements are useful, post-implementation feedback is a much more valuable source of information. The various flavors of Agile development all have this in common: to make it as easy as possible to collect post-implementation feedback and apply it to the project. Weekly Scrum meetings, hands-on demos to stakeholders, continuous user testing with short cycles, are all ways of improving collection ; frequent refactoring, high quality code and evolving designs are all ways of improving the team&#8217;s ability to incorporate feedback into the project.</p>
<p>In fact, the shorter the feedback cycle, the better. This is the general idea behind automated testing: why let frail flesh-and-blood humans handle the testing if a computer can do it for you? Static type systems eliminate the need for a tester to painstakingly traverse all the pages of your PHP site looking for broken links and null variable errors. Automated Unit Tests and Regression Tests let you refactor your entire application without having a human tester look at even one screen. As for what the automated tests cannot find: rely on code reviews or human testers to identify issues, and then retrofit your automated test suite to detect those issues. You&#8217;re trading off a small bit of implementation for what amounts to a large verification payoff.</p>
<p>And self-feedback is just as important. Having experienced developers who can identify problems in code on their own <em>before it&#8217;s even written</em> is perhaps the single most important source of software quality. Developers who understand the problem domain and apply common sense efficiently are certainly a huge asset as well.</p>
<p>In any given project, the sources of verification information are the following:</p>
<ul>
<li>Developers &#8211; costly to acquire, but the most efficient kind there is</li>
<li>Compilers and static analysis tools</li>
<li>Automated tests</li>
<li>Stakeholder feedback</li>
<li>Dedicated testers &#8211; especially if they can communicate with developers directly</li>
<li>Written specifications</li>
<li>End user feedback</li>
</ul>
<p>Try looking at your current project in terms of verification bandwidths: what are your primary sources of feedback? What are your bottlenecks? How can you improve?</p>
]]></content:encoded>
			<wfw:commentRss>http://www.nicollet.net/2011/02/verification-bandwidth/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Open-Source</title>
		<link>http://www.nicollet.net/2010/08/open-source/</link>
		<comments>http://www.nicollet.net/2010/08/open-source/#comments</comments>
		<pubDate>Wed, 25 Aug 2010 08:59:33 +0000</pubDate>
		<dc:creator>Victor Nicollet</dc:creator>
				<category><![CDATA[Imperative]]></category>
		<category><![CDATA[Architecture]]></category>
		<category><![CDATA[Open Source]]></category>

		<guid isPermaLink="false">http://www.nicollet.net/?p=1918</guid>
		<description><![CDATA[]]></description>
			<content:encoded><![CDATA[<p><img class="aligncenter size-full wp-image-1919" title="ppt1" src="http://www.nicollet.net/wp-content/uploads/2010/08/ppt1.png" alt="ppt1" width="670" height="500" /><img class="aligncenter size-full wp-image-1921" title="ppt2" src="http://www.nicollet.net/wp-content/uploads/2010/08/ppt2.png" alt="ppt2" width="670" height="500" /><img class="aligncenter size-full wp-image-1923" title="ppt3" src="http://www.nicollet.net/wp-content/uploads/2010/08/ppt3.png" alt="ppt3" width="670" height="500" /></p>
]]></content:encoded>
			<wfw:commentRss>http://www.nicollet.net/2010/08/open-source/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
		<item>
		<title>Software Inbreeding</title>
		<link>http://www.nicollet.net/2010/06/software-inbreeding/</link>
		<comments>http://www.nicollet.net/2010/06/software-inbreeding/#comments</comments>
		<pubDate>Mon, 21 Jun 2010 05:07:13 +0000</pubDate>
		<dc:creator>Victor Nicollet</dc:creator>
				<category><![CDATA[Imperative]]></category>
		<category><![CDATA[Architecture]]></category>
		<category><![CDATA[Documentation]]></category>
		<category><![CDATA[Start-Up]]></category>
		<category><![CDATA[Strategy]]></category>

		<guid isPermaLink="false">http://www.nicollet.net/?p=1537</guid>
		<description><![CDATA[You&#8217;ve seen one of these painful, horrible business applications, uglier than hell and with no thought put into consistency or usability. No sane person would use them, but they are still used because that&#8217;s what the company paid for and that&#8217;s what the employees are going to use. The root cause of all this suffering? [...]]]></description>
			<content:encoded><![CDATA[<p>You&#8217;ve seen one of these painful, horrible business applications, uglier than hell and with no thought put into consistency or usability. No sane person would use them, but they are still used because that&#8217;s what the company paid for and that&#8217;s what the employees are going to use.</p>
<p>The root cause of all this suffering? Think about it : who is going to write a piece of accounting software?</p>
<p><strong>Choice A</strong> : the competent <strong>former accountant</strong> who happens to know something about programming. But he&#8217;s not an expert, so he uses some PHP4 he stole off the web instead of leveraging open source tools that are too hard for him to change, he writes &lt;marquee&gt; everywhere because he never heard of <a href="http://projects.zoulcreations.com/jquery/growl/" target="_blank">growl</a>, he makes weird mistakes related to unicode (which he believes is a Nazi encryption scheme from 1944) and he steals assorted icons from the 1990 Macintosh world because <a href="http://famfamfam.com/lab/icons/silk/" target="_blank">FamFamFam&#8217;s silk</a> is too esoteric for him to know about.</p>
<p><strong>Choice B</strong> : the competent <strong>programmer</strong> who happens to know something about accounting. [Insert here striking examples about how incompetent programmers can be when dealing with accounting <img src='http://www.nicollet.net/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />  ] And he might get bored before the end, because accounting is boring to engineering types (and that would be assuming he even <em>knew</em> there was a need for an accounting application in the first place).</p>
<p>With Choice B, you get a symphony of Ajax-CSS3-HTML5 beauty and pixel-tuned usability, but you can&#8217;t use <a href="http://blog.asmartbear.com/cash-or-accrual-basis-accounting.html" target="_blank">accrual-based accounting</a> because the programmer never heard of it, and you just can&#8217;t use an accounting program that doesn&#8217;t handle accrual-based accounting if you&#8217;re serious about it. <strong>So, you use Choice A, which is an ugly-as-hell, retina-maiming, CTS-inducing threat to humanity that handles accrual-based accounting</strong>.</p>
<p>People try to solve problems they are familiar with. It does not surprise me in the least when Dharmesh Shah notes <a href="http://onstartups.com/tabid/3339/bid/11978/The-10-Most-Tempting-Software-Startup-Categories.aspx" target="_blank">ten recurring themes for young software start-ups</a> to work on. To wit:</p>
<blockquote><p>1. Project Management / Time Tracking / Bug Tracking<br />
2. Community / Discussion Forums<br />
3. Personalized News Aggregation/Filtering<br />
4. Content Management (website, blog)<br />
5. Social Voting and Reviews<br />
6. Music/Events Location Application<br />
7. Dating and Match-Making<br />
8. Personal Information Management<br />
9. Social Network For ______<br />
10. Photo/video/bookmark/whatever sharing</p></blockquote>
<p>If you&#8217;re a programming genius, not only do you have a good idea of what features these applications should have, but you would actually be standing in line to use them as soon as they are available.</p>
<p>On the other hand, you don&#8217;t wake up every morning to do a little dance, thinking «<em>Woo, this order-printing application will kick so much ass!</em>» And even if you managed to get excited about the project as a technical challenge (<em>Woo, this next-gen F#-and-AJAX application, which happens to print orders, will kick them butts all right</em>!), the tedium of identifying hundreds of fields and entities and relationships and business rules, and typing them in, can&#8217;t really be considered a technical challenge. And F#-and-AJAX won&#8217;t help if your ER diagram is off, so you have to ask the accountant, who will promptly bore you to death with an in-depth explanation of <a href="http://www.internationaltaxreview.com/?Page=10&amp;PUBID=35&amp;ISS=25377&amp;SID=719836&amp;TYPE=20" target="_blank">international VAT deduction rules</a>.</p>
<p>And that&#8217;s a shame, because the non-programming <em>hoi polloi</em> are stuck with software from the 1980s that can&#8217;t be replaced until all the features are replicated by the new solution.</p>
<p>Dealing with software older than yourself is always a traumatizing experience. Think of the children.</p>
<p><small>EDIT: Seth Godin published a post around <a href="http://sethgodin.typepad.com/seths_blog/2010/06/a-sad-truth-about-most-traditional-b2b-marketing.html" target="_blank">being passionate about tax accounting</a> at the same time I published this post&#8230; my sneaky mind control schemes for owning the internet must be working.</small></p>
]]></content:encoded>
			<wfw:commentRss>http://www.nicollet.net/2010/06/software-inbreeding/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Lorem Ipsum</title>
		<link>http://www.nicollet.net/2010/03/lorem-ipsum/</link>
		<comments>http://www.nicollet.net/2010/03/lorem-ipsum/#comments</comments>
		<pubDate>Wed, 03 Mar 2010 21:06:56 +0000</pubDate>
		<dc:creator>Victor Nicollet</dc:creator>
				<category><![CDATA[Imperative]]></category>
		<category><![CDATA[Bugs]]></category>
		<category><![CDATA[Lorem Ipsum]]></category>
		<category><![CDATA[Useless]]></category>

		<guid isPermaLink="false">http://www.nicollet.net/?p=1280</guid>
		<description><![CDATA[Lorem Ipsum is a sample phrase used as a filler in typesetting, to illustrate how some text would look. Here&#8217;s a sample paragraph: Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip [...]]]></description>
			<content:encoded><![CDATA[<p>Lorem Ipsum is a sample phrase used as a filler in typesetting, to illustrate how some text would look. Here&#8217;s a sample paragraph:</p>
<blockquote><p><em>Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.</em></p></blockquote>
<p>This approach to filler text is superior in many ways to most alternatives:</p>
<ul>
<li>It&#8217;s long, certainly longer than &#8220;Test&#8221; or &#8220;AAA&#8221;, so it can fill several lines (and test whether there is an unexpected length limit when saving the data).</li>
<li>Unlike random strings of characters copy-pasted several times, it is split into words of uneven length, so spaces between words do not align horizontally or vertically.</li>
<li>It is readily recognizable as random text by any typesetter (or developer) worth their salt.</li>
</ul>
<p>A typical testing strategy, when filling forms by hand, is to copy-paste one or two Lorem Ipsum paragraphs to test such things as how the text area reacts, whether the text is saved correctly, and so on.</p>
<p>Lorem Ipsum does have some limitations:</p>
<ul>
<li>It&#8217;s written in Latin, so it fits nicely in the ASCII range of characters. As such, it does not test for Unicode support.</li>
<li>It contains no quotes of any kind, so database escaping is not tested.</li>
<li>It contains no HTML-specific characters like &lt; or &amp;, so HTML character escaping is not tested either.</li>
<li>It contains no exceedingly long words that would overflow a single line, so you cannot test for this kind of overflow.</li>
<li>It contains no links or URLs, so it does not exercise code that auto-linkifies them.</li>
<li>It contains no groups of digits, which Skype sometimes turns into &#8230; clickable numbers.</li>
</ul>
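<p>The checklist above can be turned into a small coverage check for any candidate filler text. This is only a sketch: the function name <code>fillerCoverage</code>, the 80-character threshold for a &#8220;long word&#8221; and the regular expressions are assumptions made for illustration, not part of any library.</p>

```javascript
// Hypothetical helper: reports which kinds of awkward input a filler text contains.
// The thresholds and patterns are illustrative assumptions.
function fillerCoverage(text) {
  return {
    nonAscii: /[^\x00-\x7F]/.test(text),          // exercises Unicode handling
    quotes: /['"`]/.test(text),                    // exercises database escaping
    htmlSpecial: /[<>&]/.test(text),               // exercises HTML escaping
    longWord: text.split(/\s+/).some(function (w) { return w.length > 80; }),
    url: /https?:\/\/\S+/.test(text),              // exercises auto-linkification
    digitGroups: /\d{2}(?: \d{2}){3,}/.test(text)  // phone-like groups of digits
  };
}

var classic = 'Lorem ipsum dolor sit amet, consectetur adipisicing elit.';

// A shortened sample in the spirit of a "modern" filler text.
var modern = 'Lorem <a href="#">ipsum</a> dòlor sit àmet & aliqua ' +
  '01 23 45 67 89 https://nulla.biz/pariatur ' + new Array(101).join('a');

var classicReport = fillerCoverage(classic);
var modernReport = fillerCoverage(modern);
```

<p>With these inputs, every entry of <code>classicReport</code> is <code>false</code> and every entry of <code>modernReport</code> is <code>true</code>, which is the gap the classic text leaves open.</p>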
<p>I need to test these things on my web applications, so I&#8217;ve developed my own version of a &#8220;<strong>Modern Lorem Ipsum</strong>&#8221;:</p>
<blockquote><p><em>Lorem &lt;a href=&#8221;javascript:document.write(&#8221;)&#8221;&gt;ipsum&lt;/a&gt; dòlor sit àmet, consectetur adipisicing élit, sèd do eiusmod tempor incididunt ut labore &amp; dolore magna aliqua. &lt;hr/&gt;Ut enim@minim.com veniam, quis nostrud exercitation `&#8221;ullamco laboris nisi &amp; aliquip ex æ commodo consequat. Duis aute irure dolor 01 23 45 67 89 in reprehenderit in voluptate velit esse cillum dolore `eu fugiat https://nulla.biz/pariatur. aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa </em></p>
<p><em>Excepteur [u]sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit &#8216;anim id est laborum.[u] </em><em>Lorem&#8221; ipsum dòlor sit àmet, consectetur adipisicing élit, sèd do eiusmod tempor http://incididunt.ut.com/labore &amp; dolore magna aliqua. &lt;b&gt;Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex æ commodo consequat.&lt;/b&gt; &#8220;Duis aute irure dolor in &#8216;reprehenderit in voluptate velit &#8212; esse cillum dolore eu fugiat nulla pariatur.</em></p></blockquote>
<p>Feel free to copy-paste it away. WordPress certainly does seem to have a hard time with these long lines in post bodies—I wonder if it happens in comments as well <img src='http://www.nicollet.net/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
]]></content:encoded>
			<wfw:commentRss>http://www.nicollet.net/2010/03/lorem-ipsum/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>How to build a page client-side?</title>
		<link>http://www.nicollet.net/2010/02/jquery-build-page/</link>
		<comments>http://www.nicollet.net/2010/02/jquery-build-page/#comments</comments>
		<pubDate>Mon, 15 Feb 2010 19:47:00 +0000</pubDate>
		<dc:creator>Victor Nicollet</dc:creator>
				<category><![CDATA[Imperative]]></category>
		<category><![CDATA[Javascript]]></category>
		<category><![CDATA[jQuery]]></category>
		<category><![CDATA[Performance]]></category>

		<guid isPermaLink="false">http://www.nicollet.net/?p=1264</guid>
		<description><![CDATA[The basic philosophy of jQuery is to start with some existing HTML sent over vanilla HTTP by the server. That HTML should be all you need (so that people without a JavaScript-enabled browser can still use the web site). Then, jQuery enhances that HTML by adding new behavior (usually changing the properties of existing elements, [...]]]></description>
			<content:encoded><![CDATA[<p>The basic philosophy of jQuery is to start with some existing HTML sent over vanilla HTTP by the server. That HTML should be all you need (so that people without a JavaScript-enabled browser can still use the web site). Then, jQuery enhances that HTML by adding new behavior (usually changing the properties of existing elements, sometimes adding new elements).</p>
<p>This is very useful for small pieces of behavior, but writing complete and complex components is hard for several reasons:</p>
<ul>
<li>A partial view strategy is required on the server side to insert the appropriate HTML in the appropriate location (as opposed to leaving an empty hole and having the component generate its own HTML).</li>
<li>If the behavior of your component is complex, then there will be a lot of parsing going on. A typical example would be sorting a table by a &#8220;date&#8221; column: the displayed date cannot be reliably parsed, because its format is culture-dependent and may contain &#8220;Yesterday&#8221;, &#8220;13 seconds ago&#8221; and similar shortcuts.</li>
<li>Sometimes, the server needs to add information that is not visible, but is needed by the JavaScript. The format for sending this data (attribute, hidden field&#8230;) is difficult to document and type-check.</li>
<li>Selecting precisely the right fields in a blob of HTML, without hitting any others, is hard, especially for components that may later contain sub-components. Class-based selection is slow, id-based selection involves heavy logistics to move the identifiers around, and complete traversal takes a while and breaks if the HTML changes.</li>
</ul>
<p>My preferred approach to JavaScript components is to receive JSON-formatted data from the server (easy to parse) from which I construct the DOM elements I need and capture them at the same time.</p>
<pre style="background: #ffffff none repeat scroll 0% 0%; color: #000000; padding-left: 30px;"><span style="color: #808080;">// Builds the DOM for one comment and returns handles to its parts.</span>
<span style="color: #000084; font-weight: bold;">function</span> renderComment(data)
{
  <span style="color: #000084; font-weight: bold;">var</span> $comment = $(<span style="color: #0000ff;">'&lt;div&gt;&lt;img/&gt;&lt;span/&gt;&lt;div/&gt;&lt;/div&gt;'</span>)
    .addClass(<span style="color: #0000ff;">"comment"</span>);

  <span style="color: #000084; font-weight: bold;">var</span> obj =
  {
    $self : $comment,

    $img  : $comment.children(<span style="color: #0000ff;">'img'</span>)
            .attr(<span style="color: #0000ff;">'src'</span>,data.imgUrl),

    $name : $comment.children(<span style="color: #0000ff;">'span'</span>)
            .text(data.authorName)
            .addClass(<span style="color: #0000ff;">'authorName'</span>),

    $body : $comment.children(<span style="color: #0000ff;">'div'</span>)
  };

  <span style="color: #808080;">// One paragraph per entry of data.text; text() escapes the content.</span>
  $.each(data.text,<span style="color: #000084; font-weight: bold;">function</span>(k,t){
    $(<span style="color: #0000ff;">'&lt;p/&gt;'</span>).text(t).appendTo(obj.$body);
  });

  <span style="color: #000084; font-weight: bold;">return</span> obj;
}</pre>
<p>The point is that you then have access, through the returned object, to all the relevant elements within the comment, so that you may target them with effects without any risky selector-based magic. Besides, if the HTML format of comments changes, you will only have to change the code above and nothing else.</p>
<p>And of course, using <code>text()</code> escapes any dangerous HTML you might have.</p>
<p>To make the above appear in your code, all you have to do is:</p>
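<pre style="background: #ffffff none repeat scroll 0% 0%; color: #000000; padding-left: 30px;"><span style="color: #000084; font-weight: bold;">var</span> $commentsList = $(<span style="color: #0000ff;">'#my-comments-list'</span>);
<span style="color: #000084; font-weight: bold;">var</span> $comments = []; <span style="color: #808080;">// rendered comment objects, indexed like the input array</span>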

$.each (comments, <span style="color: #000084; font-weight: bold;">function</span>(i,c){
  <span style="color: #000084; font-weight: bold;">var</span> obj = $comments[i] = renderComment(c);
  obj.$self.appendTo($commentsList);
});</pre>
<p>This is usually where you hit a performance wall, because this is one of <a href="http://www.artzstudio.com/2009/04/jquery-performance-rules/" target="_blank">the slowest ways of using jQuery</a> on a web page.</p>
<p>I&#8217;ve been in this situation recently on a smallish website that basically displays a list of contacts invited to various events as a 10-column/300-row table that includes additional functionality such as:</p>
<ul>
<li>Dynamically add or remove new rows (with server-side confirms)</li>
<li>Rows are grouped together, and groups can be collapsed and expanded</li>
<li>Clicking on rows opens a modal editor, modifications are propagated back to the table</li>
<li>The data and formatting for certain rows depend on some other rows</li>
</ul>
<p>The initial approach was exactly as described above: every cell was constructed as <code>$('&lt;td/&gt;')</code>, classes and attributes were applied to it, then all cells were inserted into rows constructed as <code>$('&lt;tr/&gt;')</code>, and these in turn were appended to the table tbody. Since some parts of the table were clickable to achieve various effects, jQuery&#8217;s <code>click()</code> function was used to add the appropriate event handlers, and the event handlers were closures that contained all relevant information about what row had to be collapsed or what element had to be removed.</p>
<p>The average time for rendering all of this was a solid <strong>2200</strong>ms on Firefox 3.5, which felt about as dynamic as a dead tortoise nailed to a slab of concrete. For comparison purposes, rendering the data server-side and sending it to the client took about <strong>390</strong>ms on average (arguably, the server would have scaling issues as it would have to render the HTML for all clients, but still).</p>
<p><strong>2200</strong>ms over 300 rows means about <strong>7</strong>ms per row. The problem here isn&#8217;t that the jQuery code is slow, but rather that it is executed so many times that the total adds up to a pretty large number.</p>
<p>My first attempt to improve performance was to avoid constructing rows cell by cell, instead building the final HTML of the row in one shot and then selecting clickable elements inside the row through their class to apply event handlers. Rows were then inserted into the table body using jQuery&#8217;s DOM functions. The new rendering time was <strong>1800</strong>ms, a smaller improvement than I had hoped for.</p>
<p>The second step was to move away from selecting clickable elements to apply event handlers. This meant that I could either insert the event handler code in the HTML (but this meant no closures, so I would have to rely on global, non-garbage-collected behavior) or add a single click event to the entire table, determine which element had been clicked, and parse the DOM for information about what to do with the click (which was annoying).</p>
<p>I went with the first option, rewriting my code as global handlers and eliminating all the select-child-with-class overhead. Rows were still constructed independently and inserted independently. The improvement was significant: the rendering time dropped to <strong>980</strong>ms.</p>
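<p>For comparison, here is a rough sketch of the lookup that the second option (one click handler on the whole table) would need. The post did not take this route, so everything here is an assumption: the <code>data-action</code> and <code>data-id</code> attribute names, the <code>findAction</code> function, and the <code>fakeNode</code> stand-in that lets the sketch run without a browser.</p>

```javascript
// Walk up from the clicked node towards the table, looking for the nearest
// element that declares an action. Attribute names are hypothetical.
function findAction(target, root) {
  for (var node = target; node && node !== root; node = node.parentNode) {
    var action = node.getAttribute ? node.getAttribute('data-action') : null;
    if (action) {
      return { action: action, id: node.getAttribute('data-id') };
    }
  }
  return null; // the click did not land inside any actionable element
}

// Minimal stand-in for a DOM element, so the sketch runs outside a browser.
function fakeNode(attrs, parentNode) {
  return {
    parentNode: parentNode || null,
    getAttribute: function (name) {
      return Object.prototype.hasOwnProperty.call(attrs, name) ? attrs[name] : null;
    }
  };
}

var table = fakeNode({});
var row = fakeNode({}, table);
var link = fakeNode({ 'data-action': 'frobnicate', 'data-id': '42' }, row);
var label = fakeNode({}, link); // e.g. a <span> nested inside the link

var hit = findAction(label, table);  // resolved through the parent link
var miss = findAction(row, table);   // no action between the row and the table
```

<p>In a browser, the same function would be called with the click event&#8217;s target and the table element from a single handler bound to the table, so no per-row handlers or closures are needed. The price is exactly the annoyance mentioned above: the information has to be read back out of the DOM.</p>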
<p>The last wave of optimizations consisted of making sure the HTML for the entire table body was generated in one shot, by pushing fragments into an array and joining them (using <code>[a,b,c].join('')</code> instead of <code>a+b+c</code>). This creates a 5223-element array, joined into a string containing 72357 characters, which is then inserted into the table body using jQuery&#8217;s <code>html()</code> function. The entire process, including preliminary processing of the data to be displayed, takes about <strong>160</strong>ms (a 13.7× performance increase).</p>
<p>The change was mostly moving from this design pattern:</p>
<pre style="background: #ffffff none repeat scroll 0% 0%; color: #000000; -moz-background-clip: -moz-initial; -moz-background-origin: -moz-initial; -moz-background-inline-policy: -moz-initial; padding-left: 30px;"><span style="color: #000084; font-weight: bold;">function</span> renderRow(data)
{
  <span style="color: #000084; font-weight: bold;">var</span> $tr = $(<span style="color: #0000ff;">'&lt;tr/&gt;'</span>);

  $(<span style="color: #0000ff;">'&lt;td/&gt;'</span>)
    .addClass(<span style="color: #0000ff;">'name'</span>)
    .append($(<span style="color: #0000ff;">'&lt;a/&gt;'</span>)
      .text(data.name)
      .click(<span style="color: #000084; font-weight: bold;">function</span>(){ frobnicate(data.id); }))
    .appendTo($tr);

  <span style="color: #808080;">// ...</span>

  <span style="color: #000084; font-weight: bold;">return</span> $tr;
}</pre>
<p>To this one:</p>
<pre style="background: #ffffff none repeat scroll 0% 0%; color: #000000; -moz-background-clip: -moz-initial; -moz-background-origin: -moz-initial; -moz-background-inline-policy: -moz-initial; padding-left: 30px;"><span style="color: #000084; font-weight: bold;">function</span> renderRow(data,html)
{
  html.push(
    <span style="color: #0000ff;">'&lt;tr&gt;&lt;td&gt;'</span>,
    <span style="color: #0000ff;">'&lt;a href="javascript:frobnicate('</span>,
    data.id,
    <span style="color: #0000ff;">')"&gt;'</span>,
    esc(data.name),
    <span style="color: #0000ff;">'&lt;/a&gt;&lt;/td&gt;'</span>,
    <span style="color: #808080;">// ...</span>
    <span style="color: #0000ff;">'&lt;/tr&gt;'</span>
  );
}</pre>
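<p>The <code>esc</code> helper used above is not shown in this post. A minimal sketch of what it might look like, together with the one-shot join, follows. The body of <code>esc</code>, the <code>renderTableBody</code> wrapper, the <code>Number()</code> coercion of the id and the sample rows are all assumptions made for illustration.</p>

```javascript
// Assumed implementation of esc(): escape the HTML-significant characters,
// since building strings by hand loses the protection that text() gave us.
function esc(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// Same shape as the string-based renderRow: push fragments, never concatenate.
function renderRow(data, html) {
  html.push(
    '<tr><td>',
    '<a href="javascript:frobnicate(',
    Number(data.id), // assumption: ids are numeric, so coerce rather than trust
    ')">',
    esc(data.name),
    '</a></td>',
    '</tr>'
  );
}

// Hypothetical wrapper: one array for the whole table body, joined once.
function renderTableBody(rows) {
  var html = [];
  for (var i = 0; i < rows.length; i++) {
    renderRow(rows[i], html);
  }
  return html.join('');
}

var body = renderTableBody([
  { id: 1, name: 'Ann <admin>' },
  { id: 2, name: 'Bob & "Co"' }
]);
```

<p>The resulting string would then be handed to jQuery&#8217;s <code>html()</code> on the table body in a single call, so the browser parses the markup once instead of once per row.</p>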
<p>Again, this is an extreme situation where page generation gets out of hand because a lot of rows are generated: the net benefit, as far as rendering a single row is concerned, is just under 7ms. If your page contains only a small number of complex components, you can ignore the performance issues to get the components done, and only optimize if the slowdown turns out to be noticeable.</p>
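<p>The four measurements quoted in this post can be put side by side with a few lines of arithmetic. The timings and the 300-row count come from the text above; the step labels and variable names are mine.</p>

```javascript
var ROWS = 300; // approximate row count of the table described above

var timings = [
  { step: 'cell-by-cell jQuery construction', ms: 2200 },
  { step: 'one HTML string per row, class-based handler binding', ms: 1800 },
  { step: 'inline global handlers, rows inserted one by one', ms: 980 },
  { step: 'whole table body joined and inserted once', ms: 160 }
];

var baseline = timings[0].ms;

// Per-row cost (one decimal) and speedup over the first approach (two decimals).
var summary = timings.map(function (t) {
  return {
    step: t.step,
    msPerRow: Math.round((t.ms / ROWS) * 10) / 10,
    speedup: Math.round((baseline / t.ms) * 100) / 100
  };
});
```

<p>The per-row cost goes from roughly 7.3ms to roughly 0.5ms, which is where the overall factor of about 13.7 comes from.</p>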
]]></content:encoded>
			<wfw:commentRss>http://www.nicollet.net/2010/02/jquery-build-page/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

