Tag Archive for 'Start-Up'

Consensus and Compromise

Working on a start-up involves many decisions (how features should work, how pages should look, how advertising should be written…) and making decisions is a difficult process when working as a team. Even in the tightest-knit team of two, disagreements happen.

Sometimes, on the less important decisions, you might get out of it by default — A disagrees with B, but doesn’t care enough about the topic to actually do something about it, so B proceeds anyway.

Sometimes, decisions will be dictated by competence. If A disagrees with a proposed solution because of objective technical or legal implications, well, that’s it.

Quite often, though, neither of these shortcuts applies. This is where one must be aware of the dangerous tendency of consensus-based teams to devolve into compromise-based discussions.

Consensus-based discussions feel like this :

A: I’m going to frobnicate the thingamajig.
B: If you frobnicate the foobar instead, you would get a 10% increase in floogum output.
A: You’re right. Let’s frobnicate the foobar, then.

That is, initially differing opinions evolve through an argumentation process, so that everyone agrees that the final outcome is the best one.

Compromise-based discussions feel like this :

A: I’m going to frobnicate the thingamajig.
B: I would rather that we frobnicate the bazquux instead. Might get better results.
A: Actually, I have a hunch that the thingamajig is the better option.
B: Tell you what, let’s frobnicate 50% of the thingamajig and 50% of the bazquux. This way we have everything covered.
A: All right.

Here, initially differing opinions are settled through a negotiation process, so that everyone’s input is respected, but no one agrees the final outcome is really the best.

Compromise is a bad idea for several reasons.

  • Quite often, going 100% with any reasonable solution is better than investing only 50% into two different solutions. In more general terms, lack of commitment to a single strategy or objective is dangerous, and often less effective than committing to any strategy or objective one might come up with.
  • Compromise is not agreement, it’s negotiation : you give something and get something else in return until everyone agrees that it’s an even trade. People who are good negotiators tend to dictate their terms in such circumstances, and others might feel helpless and useless after a few unbalanced trades. It also leads to « I agreed with your idea yesterday so agree with mine today » or « You’ve already changed my idea enough, stop asking me to change it again, » both of which are content-free sentences that aim to make a decision based on interpersonal history instead of objective analysis.
  • The inability to come to a logical conclusion sometimes happens because there is not enough data available to decide. The first priority in such circumstances should be to actually search for the data (possibly accepting one of the proposed solutions as temporary until the data is collected), not settling for an arbitrary compromise.
  • Building consensus is hard when people have trouble putting their insight into words, no matter how convincing or true that insight might be. I’m objectively right, but I cannot seem to explain to you the reason for it. This can mean going for a compromise today instead of a consensus tomorrow — hasty decisions are seldom good. Always make sure everyone has had enough time to think the decision through, even if it means adjourning it until later.

How to Annoy People

Here is a hypothetical situation :

The kingdom is in trouble, and the King must enact a new law. Of course, such a law will make many people happy for years, but some people will be annoyed by it for a few days. He has two possible choices :

Law A will make 20% of the people happy at the cost of annoying 1%.

Law B will make 90% of the people happy at the cost of annoying 10%.

Which law should the monarch enact ?

This is not the kind of problem that engineer types like me enjoy: no matter how much we try to abstract away the details and build a sound foundation for that decision, some ethics will inevitably seep in. Is it right to annoy an additional 9% of the people so that 70% more people become happier? Where do we draw the line in terms of percentage, and in terms of annoyance? A few days might be fine, but what about a few months or years?

As you might expect, this is not an entirely hypothetical exercise. It is, in fact, the very core of the opt-in versus opt-out debate. Above, law A is opt-in : only 21% of the population knows about its effects, but at least there will be very few complaints ; law B is opt-out : it ensures that 100% of the population knows about it, but it will annoy the 10% who will have to manually opt out.

My start-up hosts discussion forums for associations. Every one of our customers has faced the same decision : should they send an e-mail to their association members telling them « please come and sign up on our forums » or should they import their member directory and let naysayers unsubscribe ?

With the opt-in approach (please come and sign up), the initial e-mail is followed by a small number of sign-ups, usually from the more active or dedicated members, and is ignored by everyone else. If the number of initial adopters does not reach a critical mass, the forums will simply die off and be forgotten by everyone.

With the opt-out approach (import all the members), everyone will be notified every time a new message is posted, as would happen on a mailing-list. A forum message from a friend is a lot more interesting than a « please come and sign up » e-mail and drives more members to connect and participate. The critical mass is reached far more easily (and faster!) and the forum becomes an essential part of the community. However, those who did not wish to participate will receive e-mail that is literally unsolicited, and they will complain about it: while the number of active users increases significantly, the number of complaints and unsubscription requests increases even faster. The delicious irony of it all is that the number of complaints is driven up because the communication tool helps those annoyed members find each other and speak up in unison.

Before continuing, let me fend off two possible problems.

First, the opt-out approach is not intended to be a sneaky trick — we always strongly advise customers to send a preliminary e-mail to all their members in advance, telling them about the plan to move to a new discussion system. Not only does it keep things civil and honest, but people who absolutely hate receiving messages from their community can ask to opt out before it is too late. Online communities have trouble blooming when 10% of the messages are complaints about the very existence of the community, so it is in everyone’s interests to keep naysayers out.

And if all else fails, we can wipe members from our system on demand.

The second problem will be familiar to readers of Seth Godin: permission. Through the eyes of a permission marketer, importing all the members without their prior explicit opt-in consent is absolute heresy.

I find this view a bit too extreme. Well, it does make sense when trying to sell things that 99.99% of the people will not care about, such as Viagra or cheap hotels in Bangkok. But members actually care about what is going on in their association and, based on our experience, only a minority of members ever ask to be completely removed from the forums. Most members only unsubscribe from individual discussions that they do not care about, and choose to remain available on the forums as a whole.

It feels sad that so many people would miss out on a great experience just so a handful of curmudgeons can spare the effort of clicking an unsubscribe link.

Back to the point.

As far as I can tell, the approach that helps most members get involved in an online community is :

  1. In advance, send them an e-mail telling them that the association is about to move to another online community system — in our situation, it could be described as a mix between a forum (for those who wish to be very active) and a mailing-list (for the occasional participants) — and that each of them will receive those communications on an opt-out basis.
  2. You will receive messages saying « this is a bad idea » or « I don’t want to receive those communications » and you should take steps to make sure that no person who opted out at this stage is ever imported into the forums. Unless they ask for it later, anyway.
  3. After enough time has passed — at least 48 hours for large associations — import everyone into the forums, and write a welcome message there explicitly asking them to say hello when they reach it for the first time.
  4. If any other people request to be completely unsubscribed, you may simply remove them from your forums and they will stop receiving messages. If you need to absolutely make sure that all their data has been wiped from our database, drop us a line and we’ll take care of it for you.

We have had several customers build a thriving online community for their association with this approach, and have even seen a few « get me out of here » naysayers change their minds and come back online, once they understood everyone else was using it.

Where do you stand on the matter ?

Article image © Carlo Piana — Flickr

Agile Code in OCaml

First, a quick bit of background: I’m working on RunOrg [fr], a start-up that provides communities with their own online private social networks à la Facebook. The technology stack is Linux-Apache-CouchDB-OCaml, and this has some implications that I will discuss below.

Facebook has it easy in terms of user management: a user starts existing on their platform the instant they sign up, at which point they fill in their first name and last name, and these are displayed to anyone who is allowed to see any hint of the user’s existence. So, making the first name and last name mandatory is quite acceptable.

At RunOrg, we cannot do this for several reasons:

  • User profiles may be created by communities as part of the membership management toolbox: we have to rely on user A to provide data about user B, and user A usually relies on an email-only source (such as newsletter or mailing-list registrations) where no first name or last name is available.
  • A given user may be part of several independent communities, and may choose to manage their identity separately for each one: appear as John Doe in an innocuous community they trust and John Censored in a more critical community.
  • We also allow users to keep control over whether a community is allowed to publish their name on the internet (as part of the online directory, or as comments on public articles).

Our needs for advanced privacy controls involve a more complex management of both what data is available and how we display it. The good news is, it’s certainly possible to handle all of this elegantly in terms of implementation. The bad news — we didn’t plan everything ahead.

It was only a few days ago that our customer requirements sessions brought up the issue of email-only sources: community managers were frustrated by the fact that our mass import functionality required first names and last names. The problem is, in almost every single programming language out there, making a required field become optional is a very dangerous endeavor because the development team must audit the entire code base to identify which parts of the code assume that the field is required, and describe what should happen when the field is null.

In your average PHP project, try making the user name optional and I can assure you that sentences like «You have been invited by  to this event» will appear. Someone failed to audit the who-invited-you-to-events code. At least with Java or C# you will get a Null Reference Exception of some kind that will show up in the logs and give you the opportunity to hunt down the mistake.

The good news is that our implementation language, OCaml, does not allow null values. Instead, optional values are handled using a different value type, known as 'a option, which changes everything. An optional value simply cannot be accessed in the same way as a non-optional value. Trying to do so anyway will cause a type error that is picked up by the compiler, so a programmer can rely on these errors to quickly identify all locations in the code that assume the value to be present.

I’ll say it again: in OCaml, a field being optional or mandatory is an assumption that is built into the type of that field, so changing the assumption involves changing the type and breaks all code that does not match the new assumption. Applying breaking changes to an OCaml code base is usually as simple as following a trail of compiler errors.
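To make this concrete, here is a minimal sketch of what happens when a field goes from string to string option. The record and its field names are made up for the example and are not the actual RunOrg schema.

```ocaml
(* Hypothetical user record, for illustration only. *)
type user = {
  email     : string;
  firstname : string option;  (* was a plain string before the change *)
}

(* Code written against the old type, such as
     "You have been invited by " ^ u.firstname
   no longer compiles once the field is a string option: the compiler
   rejects it with a type error at that exact location. *)

(* The fix is to state explicitly what happens in each case. *)
let invited_by (u : user) : string =
  match u.firstname with
  | Some name -> "You have been invited by " ^ name ^ " to this event"
  | None      -> "You have been invited to this event"

let () =
  print_endline
    (invited_by { email = "john.doe@example.com"; firstname = Some "John" });
  print_endline
    (invited_by { email = "jane@example.com"; firstname = None })
```

The sentence with the hole in the middle cannot be produced by accident here, because there is no way to reach the name without saying what to do when it is missing.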

So, that’s what we did. We already knew that the behavior we wanted was to construct the “display name” of users like this:

  • If either the first name or the last name is present, use it (if both are present, use firstname-whitespace-lastname).
  • If neither is present, then the private display name (visible only to the user themselves on their profile page and in the e-mails they receive) should be their e-mail address, and the public display name (visible to everyone else) should be the username part of their e-mail address (so john.doe@gmail.com is shown as john.doe@…).

First, we defined two functions that compute the private and public display names based on the first name, last name and e-mail of the user. Then, the compiler error trail led us to all locations where a change was required, where we quickly identified whether the public or private name was to be displayed and replaced the existing code with our new display name functions. In total, a full audit of 40kLOC was done in less than an hour and I have proof that any code that uses the user name now handles the case where the user name is not provided.
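The two display name functions could look roughly like the sketch below. This is a reconstruction from the rules listed above, not the actual RunOrg code: the function names, the labeled arguments and the handling of an address without an at-sign are all assumptions.

```ocaml
(* Hypothetical helper: the name built from whichever of the first and
   last name are present, or None when both are missing. *)
let real_name (first : string option) (last : string option) : string option =
  match first, last with
  | Some f, Some l -> Some (f ^ " " ^ l)
  | Some n, None | None, Some n -> Some n
  | None, None -> None

(* Shown only to the user themselves: fall back to the full address. *)
let private_display_name ~first ~last ~email =
  match real_name first last with
  | Some name -> name
  | None -> email

(* Shown to everyone else: fall back to the part before the at-sign. *)
let public_display_name ~first ~last ~email =
  match real_name first last with
  | Some name -> name
  | None ->
    (match String.index_opt email '@' with
     | Some i -> String.sub email 0 i ^ "@…"
     | None -> email (* assumption: a malformed address is shown as-is *))

let () =
  print_endline
    (public_display_name ~first:None ~last:None ~email:"john.doe@gmail.com");
  print_endline
    (private_display_name ~first:None ~last:None ~email:"john.doe@gmail.com")
```

Every call site then has to pick one of the two functions, which is exactly the public-or-private question the compiler error trail forced us to answer at each location.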

The Rules

When working on any OCaml project, and especially on RunOrg, I follow these few rules:

  • Any assumption must cause a compiler error when broken. Either the code determines on the spot that the assumption is true, or I use the type system to prove that another part of the code already did. This rule took a massive toll on my early productivity, and I attributed it to an inherent cost of making compiler-enforced assumptions, but the real reason was that I was still pretty new at it: the elementary assumption enforcement from my smaller projects was too crude for the rich functionality of RunOrg, and it took me six months to refactor my early approach into an elegant and streamlined strategy of encoding assumptions into types.
  • Don’t work around the compiler or cheat with semantics. The initial reaction to a system that complains about every little change you make is to try and work around it by using more generic types or storing information where it does not belong. For instance, an easy solution to the optional name conundrum would have been to store john.doe@… as the name, but doing so would have been semantically incorrect (that’s a placeholder, not a name) and would have polluted the database “name” field with things that are not names and that will be treated differently from names at some point in the future.
  • Don’t accept mediocre code or patterns. Sometimes, design choices in the interface of module A will lead to ugly code in modules B, C and D because an unforeseen usage pattern happens to apply 95% of the time and the interface of module A was not designed with that usage pattern in mind. No amount of cleanup or refactoring in modules B, C and D will solve the problem; the only solution is to go back to the design of module A and change the interface even if it means that two hundred client modules will break. Keeping my code clean, elegant and short is worth wading through two hundred modules.
  • Perform lazy payments on your technical debt. I can propagate new design changes through my entire code base in one coding session, but this doesn’t mean I should. Instead, I keep a mental todo-list of all the changes that need to be applied, and apply all of them at once, locally, whenever I have to rework a given piece of code for any reason. While it may seem that such a todo-list is hard to keep and I will inevitably forget parts of it, remember that those design changes came around in order to solve the problem of ugly or mediocre code: by noticing that the code is ugly, I am reminded of the strategies that I set up in order to clean it up.
  • But be eager with small payments. If it’s a matter of moving a few functions around or refactoring a small piece of code, I do it as soon as I am done writing or rewriting it. Cleaning up little odd bits in a mostly clean code base is extremely rewarding.
  • Discover code by trying changes out. If the assumptions are correctly laid out, then the easiest way to determine the implications of a change — whether it will work and how long it will take — is simply to try it out. Following the compiler error trail will quickly reveal how many things are impacted by the change, as well as any unforeseen massive consequences. If it turns out that the change is too impacting, I just roll back my edits.
  • Keep interface patterns to a minimum. The basic idea behind having few different interfaces implemented by many parts of the system is usually expected to be «code is easier to reuse» but I disagree. Yes, that is a frequent benefit, but certainly not the most essential. Having few different interfaces means that most of my code can be described using a small vocabulary of interface patterns, and that looking at some code immediately reveals the pattern being used there. It also means that any design changes can be expressed in terms of pattern changes, and can be applied almost blindly to all locations where that pattern was used. Last but not least, by using a simple shared vocabulary for large sections of the application, I make it easier to recognize patterns in the more chaotic sections based on how they interact with the cleaner code. It’s easier to determine that two sentences have the same meaning if they share some words.
  • Love your code. In the RunOrg code base, priority 3 is making sure the code is well-designed, clean and free of technical debt, priority 2 is adding new features, priority 1 is making sure there are no bugs, and the drop-everything-you-do-and-work-on-this priority zero is that I should never hate working on the software. Motivation is paramount to keeping the code clean, feature-rich and bug-free, and even to working on the start-up in the first place, so anything that might make me question my dedication to the project or cause me pain while working on it must and will be corrected as soon as possible, regardless of other priorities.

I’m pretty certain that all of the rules are important, but I do believe the last one is an absolute prerequisite.
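As an illustration of the first rule, here is one common way to have the type system prove that a check already happened: an abstract type with a single validating constructor. The module name and the deliberately crude validation rule are assumptions made for this sketch, not a description of the RunOrg code.

```ocaml
(* Hypothetical module, for illustration only. *)
module Email : sig
  type t                              (* abstract: cannot be forged outside *)
  val of_string : string -> t option  (* the only way to obtain a t *)
  val to_string : t -> string
end = struct
  type t = string
  let of_string s =
    (* Crude check: an at-sign that is neither first nor last. *)
    match String.index_opt s '@' with
    | Some i when i > 0 && i < String.length s - 1 -> Some s
    | _ -> None
  let to_string s = s
end

(* This function never re-validates: the type of its argument proves
   that Email.of_string already succeeded somewhere upstream. *)
let notification_target (e : Email.t) : string =
  "Sending to " ^ Email.to_string e

let () =
  match Email.of_string "john.doe@example.com" with
  | Some e -> print_endline (notification_target e)
  | None -> print_endline "Rejected"
```

Passing a plain string to notification_target is a type error, so an unchecked address cannot reach it by accident, and loosening or tightening the check only ever touches one module.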

Article image © Ergonomik — Flickr

Does this repository make me look fat?

The main RunOrg SVN repository contains:

  • 38490 lines of OCaml implementation code spread over 273 *.ml files.
  • 5134 lines of OCaml signature code spread over 156 *.mli files.
  • 4776 lines of HTML template code spread over 271 *.htm files.
  • 1971 lines of CSS in 6 files.
  • 1898 lines of JSON configuration in 26 files.
  • 1172 lines of JavaScript in 5 files.

That’s a total of 53441 lines in 737 files.

Here’s a plot of how the 38490 lines of OCaml code came into being over the last eight months.

That is all.

Software Patents – Why, Why Not

Intellectual property exists for a reason: protecting creators from the theft of their creations. Not in the sense that the idea is stolen — idea theft would imply that the original owner would not have access to the idea anymore — but rather that the opportunity to monetize that idea is stolen or destroyed by someone with the industrial and commercial firepower to completely take over the market. Georges Méliès created the 1902 motion picture Le Voyage Dans La Lune, which was a special effects masterpiece at the time, but was denied the opportunity to monetize it in the United States because of piracy and eventually went bankrupt. The question asked by intellectual property supporters is, would people invest time and money in creating new motion pictures if they knew the same thing could happen to them? Copyright is intended to ensure that, if someone runs with the author’s work, the author can shut them down or make them pay.

The same reasoning exists behind patents: an inventor spends time and money on the arduous process of elaborating and perfecting a new machine or process. Then he starts selling the invention and, inevitably, other people and companies notice. These are people and companies that did not participate in the elaboration, that have a better manufacturing infrastructure and a better sales network, and that are not weighed down by the cost of researching the invention in the first place. They can easily manufacture and sell the invention, outmatch the inventor on the market with lower prices and larger quantities, and keep the profit to themselves. But if the inventor holds a patent, he can force the rogue manufacturers to pay him for his invention.

Perhaps the single greatest example of modern patents is the pharmaceutical industry. The cost of creating a new drug is tremendous, not because thinking of new molecules is a strenuous activity, but because of the many clinical trials required to turn tens of candidate molecules into a single FDA-approved doctor-prescribed drug. But once the drug exists, competitors can find out its composition at a fraction of the original cost (after all, the composition is published as part of the approval process), manufacture it, and sell it at a price that does not need to pay for the original research and development. Were they allowed to do so, one would expect pharmaceutical innovation to freeze in a prisoner’s dilemma, as stealing the drugs of others is far more profitable than inventing your own.

Patents work in these traditional industries for two reasons:

  • Inventing the patented mechanism or creation is a costly, arduous process, but creating a competing product based on the invention is comparatively cheap. This means that without patents, copying is more profitable than inventing.
  • Two people inventing the exact same thing without hearing from one another is extremely unlikely, either because the inventions are so esoteric and unusual that no two people would reasonably think of the same thing, or because the inventors are part of a community of experts that regularly hear of what the others are working on.

Software patents protect algorithms — processes executed by computers to transform data and interact with humans and other computers. The problem with patenting software is that the two reasons above are the exception rather than the rule.

Creating competing software products is hard. Assuming that Microsoft invents a wonderful new algorithm and includes it in Windows 8 without patenting it, simply reusing that algorithm does not make the creation of a serious Windows 8 competitor any easier. The modern software industry is so complex and multi-faceted that the reuse of any algorithm, no matter how clever or innovative, would not contribute significantly to the creation of a competing product. Copying over a single algorithm does not grant your competitors access to your network of compatible software, your corporate partnerships, your user base, your software development infrastructure, your reputation, your existing maintenance contracts, or any of the many other ways in which the average software company turns an algorithm into money.

And there’s more. The job of software programmers is to create new algorithms, and there are quite a lot of software programmers around. Hundreds of thousands of software programmers. This means that any algorithm which happens to be a slight modification or adaptation of an existing known algorithm, or the result of a deterministic problem-solving strategy, has already been invented at least a dozen times, currently sleeps in a legacy project somewhere, and will be re-invented another dozen times in the coming years.

The possibility that someone might steal their work does not deter these programmers from inventing algorithms.

There is no prisoner’s dilemma in software development where everyone waits for the others to invent new things so they can copy them. We have a thriving start-up community bent on inventing new things that almost never patents anything at all. We have an industry that focuses on the quality and nature of the data it collects rather than the ever-changing processes that are applied to that data. The software industry in general does not need patents to keep the innovation ball rolling.

This, in itself, would be no issue — if software companies want to give money to the USPTO for nothing, let them be.

The problem is that software patents are actually hurting innovation. There is a strong tendency for the USPTO to grant patents for algorithms that are not in any way esoteric or unusual, or even new. As a software programmer, I invent new algorithms as part of my daily showering routine ; there is a significant probability that those algorithms are patented either because they are so obvious that a patent-happy company already thought of them, or because they are a special case of one of the surprisingly broad “any computing device, any storage mechanism, any input mechanism” patents being granted recently. The fact that I independently developed that algorithm through logical reasoning, simple modifications of an existing public domain approach, or even genius, does not absolve me from respecting the patent. Sure, I could weasel my way out of the patent by tweaking the algorithm, but doing so is hard for two reasons that are completely unrelated to the underlying technical principles :

  • The only way for me to know what patent I must work around is to look through all existing patents for those that might be applicable to my situation. This means I need to research existing patents as part of my daily shower routine, which is highly inefficient and wastes a lot of water.
  • Even if I were able to identify the applicable patents, and could certainly create algorithms that still did the job without infringing them, I would need to hire a lawyer to determine whether there is indeed no infringement.

All of this leads to a massive increase in the cost of developing new software, on the scale of having to check for copyright infringement every time you write an e-mail message. Where software innovation used to be a single engineer working for a few days on a project (which, let’s admit it, is fairly cheap), it now involves much costlier patent lawyers even in the case when no infringement actually happens. Needless to say, most start-ups don’t bother with the lawyers and live at constant risk of discovering that one of the algorithms they created on their own has been patented by someone else.

And all of this happens even when the patent is obviously invalid due to prior art, because it’s usually cheaper to spend a few days working around the patent than to risk going to court over it.

I will not even go into how the monopolies induced by the patent system hurt the users themselves. There have been plenty of discussions on that already.

But there is a flip side to all of this. Patents are not only about inventing new algorithms and processes, they also pay for the cost of validating them. Some algorithms are not solutions to black-and-white, solved-or-not-solved problems where one can prove on the white board that the invention always produces the correct output. Some problems are so complex that any candidate algorithm must be tested against huge amounts of data, and the testing is not cheap. It might be run against data that the inventor has spent a lot of time collecting from the real world, it might be run against data that the inventor had to buy at a steep price, or it might be tested on real-life humans or customers with the risks and costs this involves. And even if the tests themselves are free, there’s the possibility that one needs to try out hundreds of different algorithms or parameters before an acceptable solution is found.

This is almost the same problem as the pharmaceutical industry, where thinking up new molecules is cheap and running the clinical tests is the real cost. The only difference is that algorithm validation is not a mandatory or even obvious process. The various pieces of user interface in Apple, Facebook or Google products might seem utterly obvious when you see them, but each of them passed grueling internal testing processes and survived thrilling tournaments of feature comparisons before ever going live, and these processes and tournaments cost a lot more than the developers thinking about user interfaces during their morning shower.

I suppose the real question that needs to be asked is, why do we have a patent system in the first place? Is it because innovation cannot exist without patents (which is clearly not the case as far as the software industry is concerned) or is it because we somehow feel that pioneers deserve some kind of reward for being the first, even though they were often not the only ones?

Article image © Stuart Heath — Flickr

Pipelines

You’re a company. Your customers give you money in return for something. They usually start out without any knowledge of your existence and end up giving you money, and the path they follow is one of the many customer acquisition pipelines you have created (perhaps unintentionally) for your company.

They are called pipelines because they usually convey a continuous flow of customers from point A (never heard of you) to point B (gave you money). Each is composed of segments:

  1. They just heard about you: maybe you sent them an e-mail, called them on their phone or contacted them through a friend. Or maybe they did a Google search for your product and you were on the results page. Or they saw an ad for your product somewhere. Or your product was recommended by someone they trust. Whatever the reason, they are now aware of your existence. Some of these people will move on to the next step, which is to…
  2. Learn more about you: most of the time, they will visit your web site and sometimes look for references online or in their social circles. Sometimes, they call you, or have you call them, or even ask for a meeting and quick presentation. Either way, some of these contacts will decide that your company or product is interesting enough to…
  3. Try out your product: this could be a free limited-time trial, a demo version, an on-site pilot, a trailer video, a sample tray in a shop, or any other form of discovery that lets them actually experience your product without any serious commitment on their part. You can get away without a demo version if your product is either cheap enough that buying it once does not count as a significant commitment (such as buying a candy) or popular enough that it can convince customers on reputation alone (such as Half-Life 3). Anyway, based on the trial or the reputation, they will decide whether they…
  4. Become customers: this is the interesting part, where you get money in return for whatever you provide.

The number of people at each stage in the pipeline decreases: not all people who hear about you will even visit your website, not all people who visit your website will download the demo, and not all people who download the demo will buy the full version. Since having more people at the bottom of the pipeline is a good idea, a lot of effort goes into making the pipeline more efficient, that is, losing fewer people along the way. The customer conversion rate (CR) represents the percentage of people who go from point A to point B, and you usually want the CR to be as high as possible.

In practice, there are two other important things to keep in mind. The customer acquisition cost (CAC) is how much money you spend on a given customer (on average) to get them through the pipeline. If your pipeline costs you $200,000 in advertisement, phone bills, salesman wages and bonuses, to turn 1,000,000 people into 1,000 customers, then your CAC for that pipeline is $200/customer. The customer lifetime value (CLV) is how much you can expect to earn from that customer over time, after subtracting the cost of the products you sold. If a customer pays $5 for a book that cost you $2 to manufacture, and never comes back, their CLV is $3.

What matters, then, is not the customer conversion rate, but your net profit per customer: CLV – CAC. A pipeline with a CAC of $200 that leads to a book-buying CLV of $3 is a pipeline that loses $197 per customer to marketing- and sales-related costs: you spent $200,000 to have 1,000 customers buy $5,000 worth of books (of which you only get to keep $3,000) for a net loss of $197,000! In short, your job is making sure that the CLV – CAC difference is as high as possible in all your pipelines.
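The arithmetic above can be checked with a few lines of OCaml. This is only a sketch: the function names (`cac`, `clv`, `net_per_customer`) are made up for illustration, and the figures are the ones from the book example.

```ocaml
(* Customer acquisition cost: total pipeline spend divided by customers won. *)
let cac ~pipeline_cost ~customers = pipeline_cost /. float_of_int customers

(* Lifetime value of a one-time buyer: price paid minus what the product cost. *)
let clv ~price ~unit_cost = price -. unit_cost

(* Net profit per customer: CLV - CAC. *)
let net_per_customer ~clv ~cac = clv -. cac

let () =
  let customers = 1_000 in
  let acquisition = cac ~pipeline_cost:200_000. ~customers in
  let lifetime = clv ~price:5. ~unit_cost:2. in
  let net = net_per_customer ~clv:lifetime ~cac:acquisition in
  Printf.printf "CAC: %.0f, CLV: %.0f, net per customer: %.0f, net total: %.0f\n"
    acquisition lifetime net (net *. float_of_int customers)
```

A negative net per customer means the pipeline loses money on every sale, no matter how good the conversion rate looks.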

The reason why increasing the CR is a good idea is that a higher rate means more people come out of the pipeline, so the total pipeline cost is divided among all of them, and thus the CAC decreases. If I were to double the conversion rate on the above $200,000 pipeline, I would get 2,000 customers and the CAC would decrease to $100. Usually, a higher CR means lower CAC, but this breaks down when increasing the CR costs money: if you paid an additional $300,000 to get the better advertisements and salesmen that allow you to jump to 2,000 customers, then the CAC would actually increase to $250! Similarly, lowering the price usually means more people will become customers, which increases the CR and decreases the CAC, but it will also have an impact on the CLV, since you are now earning less money on every customer. If the price drops by $10 and the acquisition cost drops by only $7, you just lost $3. Per customer.
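To make the three scenarios easier to compare, here is a small sketch that recomputes the CAC for each of them. The helper names are hypothetical; the numbers are the ones quoted above.

```ocaml
let cac ~pipeline_cost ~customers = pipeline_cost /. float_of_int customers

(* Change in net profit per customer when both the price and the CAC drop. *)
let net_change ~price_drop ~cac_drop = cac_drop -. price_drop

let () =
  (* Baseline: $200,000 spent, 1,000 customers. *)
  Printf.printf "baseline:        %.0f\n" (cac ~pipeline_cost:200_000. ~customers:1_000);
  (* The conversion rate doubles at no extra cost. *)
  Printf.printf "free doubling:   %.0f\n" (cac ~pipeline_cost:200_000. ~customers:2_000);
  (* The conversion rate doubles, but getting there cost an extra $300,000. *)
  Printf.printf "costly doubling: %.0f\n" (cac ~pipeline_cost:500_000. ~customers:2_000);
  (* Price cut of $10 that only saves $7 in acquisition cost. *)
  Printf.printf "price cut:       %.0f per customer\n" (net_change ~price_drop:10. ~cac_drop:7.)
```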

Optimizing Pipelines Is Hard

The entire issue of sales and marketing can be summed up in a single short sentence:

You have to tweak the customer pipeline, which costs money, and you have no idea whether the conversion rate will improve enough to pay for the changes.

Usually, people have reasons to leave the pipeline other than the price. By finding those reasons and addressing the underlying issues, you increase the conversion rate. So, your course of action is:

  1. Find out why people are leaving the pipelines.
  2. Find out how you could eliminate or mitigate that issue.
  3. Estimate how much it would cost, how the conversion rate would evolve, and the resulting acquisition cost.
  4. Pick the solution that improves the acquisition cost the most, and implement it.
  5. Go to step 1.

Steps 1 and 3 are the difficult ones.

Step 1 is difficult because you need to determine why people who are not your customers don’t buy your product. Do you have an on-site satisfaction poll ? Do you send them a quick e-mail ? Do you have their contact information at all ? Do you even know what your conversion rate is ?

Step 3 is difficult because you actually need to estimate how many people would buy based on that one change. This is even harder, because people can always find new reasons not to buy, so that number will probably be smaller than your estimates from step 1.
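Once the estimates from step 3 exist, step 4 is mechanical. The sketch below assumes each candidate fix comes with an estimated extra cost and an estimated number of customers; the record type, the function names and the three candidates are all invented for illustration, and the hard part (producing trustworthy estimates) is exactly what the code does not do.

```ocaml
type candidate = {
  name : string;
  extra_cost : float;        (* estimated cost of making the change *)
  expected_customers : int;  (* estimated customers once the change is live *)
}

let resulting_cac ~base_cost c =
  (base_cost +. c.extra_cost) /. float_of_int c.expected_customers

(* Step 4: keep the candidate with the lowest resulting acquisition cost. *)
let best ~base_cost = function
  | [] -> None
  | first :: rest ->
    let pick acc c =
      if resulting_cac ~base_cost c < resulting_cac ~base_cost acc then c else acc
    in
    Some (List.fold_left pick first rest)

(* Made-up estimates for a pipeline that currently costs $200,000. *)
let candidates = [
  { name = "do nothing"; extra_cost = 0.; expected_customers = 1_000 };
  { name = "shorter sign-up form"; extra_cost = 5_000.; expected_customers = 1_200 };
  { name = "new ad campaign"; extra_cost = 300_000.; expected_customers = 2_000 };
]

let () =
  match best ~base_cost:200_000. candidates with
  | Some c ->
    Printf.printf "%s (CAC %.2f)\n" c.name (resulting_cac ~base_cost:200_000. c)
  | None -> print_endline "no candidates"
```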

These steps usually turn into the easier sequence known as A/B testing:

  1. Guess, or pay an expert to guess, why the people might be leaving the pipelines.
  2. Implement a solution that addresses the issue.
  3. Compare the new (B) and the old (A) conversion rates.
  4. If the solution actually improved profits, keep it !
  5. Go to step 1.

This solution does work, but it has two pretty heavy limitations. The first is that the solution from step 2 needs to be fairly cheap — it would be insane to implement a new major feature only to discover that no one cares about paperclip-shaped assistants. So, if the actual reason people are not buying is that a critical feature is missing, you will not find out through A/B testing.

The second limitation is that A/B testing has high fixed costs, because you need to actually change page layouts and shopping cart workflows for every test. These costs are one-shot, so they will be spread over all the customers coming out of the new pipeline — spending $1000 to get from 10,000 customers to 15,000 customers is a bargain, spending that same $1000 to get from 10 to 15 is not. So, unless your existing pipeline already has a fairly good throughput, A/B testing solutions to significant issues might just be too costly.
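The difference between the two cases becomes obvious when the one-shot cost is divided by the number of customers gained. A minimal sketch, with a hypothetical helper name:

```ocaml
(* One-shot cost of a test, spread over the additional customers it brings in. *)
let cost_per_extra_customer ~test_cost ~before ~after =
  test_cost /. float_of_int (after - before)

let () =
  Printf.printf "10,000 -> 15,000 customers: %.2f per extra customer\n"
    (cost_per_extra_customer ~test_cost:1_000. ~before:10_000 ~after:15_000);
  Printf.printf "10 -> 15 customers: %.2f per extra customer\n"
    (cost_per_extra_customer ~test_cost:1_000. ~before:10 ~after:15)
```

Twenty cents per extra customer is easy to justify; $200 per extra customer only makes sense if the CLV is well above that.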

Article image © Brian Cantoni — Flickr

Objective Caml Web Programming

The core RunOrg¹ application clocks in at about 30K lines of Objective Caml code, with around 2K being added every week. If you factor in our use of CouchDB, all of this might strike you as an odd choice of technologies, based on esoteric hopeful fantasies instead of cold pragmatic consideration. It isn’t, despite what others might say:

OCaml: You know yourself to be fast, smart, and extremely reliable. However, you look kind of funny and nobody really wants to talk to you. You spend most of your time sitting in a public library glaring at people, occasionally yelling “NOBODY HERE APPRECIATES MY GENIUS!” and getting kicked out.

Two years ago, I discussed the topic of using Objective Caml for web programming:

What would happen if a compact web framework were proposed? One that, in addition to borrowing existing useful concepts from other languages, also added some OCaml-specific features to the mix. Functional modules would be an interesting addition, so would be the type system and pure functional programming applied to transactions, and monadic optimization at initialization time would also be quite interesting.

Eliom

Let’s get this out of the way first. I have been continuously peeking at Ocsigen – Eliom (a web server and assorted web framework) ever since it was mentioned in a comment, and some aspects of it resonated with me while others really did not. In many ways, it served as a showcase of how the peculiarities of Objective Caml can impact the development of a web project, and helped me decide whether these were appropriate or not. This evolved into my own rendition of a web framework, Ozone, connected to an Apache server through OcamlNet2-powered FastCGI.

There were many reasons for avoiding Ocsigen – Eliom, though I do not believe any of them to be universally true. The main reason was described in Guillaume Yziquel’s comment on that article:

Somehow, even a Ruby on Rails app is a state machine. Perhaps a “better state machine”, but a state machine nonetheless, in the sense that incoming requests interact with each other by modifying the internal data.

With Ocsigen / Eliom, it’s completely different: it’s a “safely” multithreaded, compiled, application. And that makes all the difference.

Based on my experience with Ocsigen – Eliom, I fully agree with this assertion, but consider it a liability in my situation. Our business plans call for a number of users that cannot safely be expected to all run off a single server, even a multi-threaded one, for both scaling and redundancy reasons. At some point, the only communication bridge between two requests will be the database back-end, and I need my web framework to accept that and actually make sure that my one-server code will gracefully scale up to a multi-server setup.

On a more philosophical level, I agree that «On [the] server side, somehow, the “state machine” paradigm has been a hindrance», but HTTP being what it is, this is a basic truth that will not go away. Eliom is building an abstraction on top of it that will continuously spring leaks whenever the disconnected nature of HTTP surfaces. This is what ASP.NET and countless other technologies tried to do, and they have all made falling back to HTTP harder when the situation eventually called for it.

Ozone is also a compiled application, but it has one thread and no sessions: scaling happens by launching more instances of the application, which transparently supports the addition or removal of servers, while “session data” is stored in a combination of client-side state, database storage and HMAC proof tokens in the URLs. While this ascetic approach cuts me off from the sheer sexy of what Eliom allows, the tradeoff is a fairly convenient set of scalability guarantees. But if you can afford all that Eliom sexy, then I have no issue with that.
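For readers unfamiliar with HMAC proof tokens, the idea is that the server signs a piece of client-side state with a secret key, so any instance that knows the key can check the state was not tampered with, without a shared session store. The sketch below is not Ozone's code: `sign` and `verify` are hypothetical helpers, the token format is invented, and MD5 is used only because `Digest` ships with the standard library. A production system would pick a stronger hash and a constant-time comparison.

```ocaml
(* HMAC (RFC 2104) built on the standard library's MD5 implementation. *)
let hmac_md5 ~key msg =
  let block_size = 64 in
  (* Keys longer than one block are hashed first. *)
  let key = if String.length key > block_size then Digest.string key else key in
  let padded = Bytes.make block_size '\000' in
  Bytes.blit_string key 0 padded 0 (String.length key);
  let xor_pad byte =
    Bytes.to_string (Bytes.map (fun c -> Char.chr (Char.code c lxor byte)) padded)
  in
  Digest.string (xor_pad 0x5c ^ Digest.string (xor_pad 0x36 ^ msg))

(* Hypothetical token format: "<state>.<hex signature>". *)
let sign ~secret state = state ^ "." ^ Digest.to_hex (hmac_md5 ~key:secret state)

(* Returns the state if the signature matches, None otherwise. *)
let verify ~secret token =
  match String.rindex_opt token '.' with
  | None -> None
  | Some i ->
    let state = String.sub token 0 i in
    if String.equal (sign ~secret state) token then Some state else None

let () =
  let token = sign ~secret:"server-side secret" "user=42" in
  match verify ~secret:"server-side secret" token with
  | Some state -> Printf.printf "accepted: %s\n" state
  | None -> print_endline "rejected"
```

Because the check needs nothing but the secret, any application instance on any server can validate the token.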

Benefits of OCaml

This is why I use Objective Caml, in no particular order.

  • It’s fast out of the box — OCaml is on par with C performance as long as you don’t stray too far into sub-optimal areas (such as naive string concatenation). I can write any kind of code and be assured that it will not be the bottleneck, because database access and HTTP are a lot slower: right now, the average HTTP request takes about 80ms, with about 60ms for the actual HTTP transfer, 18ms for database latency, and 2ms for all of the Apache-FastCGI-Ozone sequence when compiled without optimizations.
  • It’s a compiled application. This one is mostly aimed at my PHP friends, where every request starts a new PHP execution from scratch. This makes several designs impossible or impractical, such as event-based programming: this would require B to register as a listener to A’s event, which means B should be identified as a potential listener and loaded for every request even if it does not trigger the event. Once initialized, a given Ozone instance can respond to tens of thousands of requests, which makes it worthwhile to run a lot of pre-processing and pre-caching operations during initialization.
  • It’s safe. I use a programming style that relies on avoiding exceptions, never using wildcards, defining many new types for almost everything, and writing pure functional code. This eliminates entire realms of bugs : using the wrong variable, forgetting to call a function or catch an exception, being surprised by a sneaky side-effect or doing things in the wrong order… About half the bugs I caught using Unit Tests don’t exist in OCaml (null reference exceptions, anyone?) and the other half is eliminated by my programming style — so I don’t write unit tests anymore (well, I do write an automated “test” every time I find a bug, but it’s usually as simple as adding a type annotation). This also lets me routinely refactor literally half the application every other week, without causing any bugs.
  • It’s concise. Most of the features I write are a matter of a mere hundred lines — most of the code is related to my obsessive need for being explicit. Being a functional language, you can define a brand new anonymous function on the spot and throw it into another function that is returned by yet another function which is then given to yet yet another function, all of it being implicitly type-checked without having to define a single IAcceptsBoxObserver interface or LeafBoxObserver implementation.
  • It has a fast compiler. Building those 30KLOC from scratch takes less than a minute — the average incremental build takes one or two seconds. Whenever I have any doubts about what I’m writing, I can just ask the compiler — Hey, did I forget anything about this function call? Why yes, master, you forgot to check that the user was indeed allowed to reply to that message.

The most essential feature is complete compile-time safety. As a web programmer, I have to be careful about hundreds of small details — can this text be translated into another language? Is this user allowed to do what they just did? Did that object disappear from the database while you were editing it? Does that URL really correspond to an actual page? Did you remember to check for script injection in that piece of HTML? Is this GET parameter available at this point in the code? Is this object available or locked by another user? Did I forget anything else? It’s impossible for a human brain to think about all these things while at the same time creating an elegant design or refactoring a piece of code or writing a new feature. I can use the flexible OCaml type system to check for all these details through appropriate design of the Ozone API, which turns the development process into a game of 1° write the simplest code that works, 2° listen to the compiler’s suggestions for making it fail-proof. It’s a game that I’m becoming fairly fond of, and it lets me concentrate on the very core of what I’m trying to do.
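As an illustration of the kind of check that can be delegated to the compiler, here is a sketch of one classic technique: an abstract type for HTML whose only constructors escape their input. This is not the Ozone API (the module and function names are invented), just a small instance of the idea.

```ocaml
module Html : sig
  type t                           (* abstract: only the functions below build it *)
  val text : string -> t           (* escapes its argument *)
  val tag : string -> t list -> t  (* wraps children that are already safe *)
  val to_string : t -> string
end = struct
  type t = string

  let escape s =
    let b = Buffer.create (String.length s) in
    String.iter
      (function
        | '<' -> Buffer.add_string b "&lt;"
        | '>' -> Buffer.add_string b "&gt;"
        | '&' -> Buffer.add_string b "&amp;"
        | '"' -> Buffer.add_string b "&quot;"
        | c -> Buffer.add_char b c)
      s;
    Buffer.contents b

  let text = escape

  let tag name children =
    Printf.sprintf "<%s>%s</%s>" name (String.concat "" children) name

  let to_string t = t
end

let () =
  let user_input = "<script>alert(1)</script>" in
  print_endline Html.(to_string (tag "p" [ text user_input ]))
```

Outside the module, `Html.tag "p" [ user_input ]` does not type-check, because a raw `string` is not an `Html.t`: forgetting to escape becomes a compile error rather than a vulnerability.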

Disadvantages of OCaml

It’s not a happy fun place. Quite the contrary: the language comes with a set of annoying quirks and flaws that do make things harder. Before you jump in, you should know what to expect.

  • Type-safety has a price. If the type system cannot express a certain thing, then you can’t do it. There are a few fairly complex examples where this has caused me trouble, in areas such as optional function arguments, module meta-programming, JSON serialization or dynamic database-driven data structures. Workarounds exist, but they’re only workarounds. Another side-effect is that type inference can make it hard for inexperienced developers to find an error, especially if you do a lot of strange type wizardry. Not to mention the silly yet annoying “this expression has type foo but is used here with type foo” error.
  • Lack of tools and libraries. Being a non-mainstream language means there are no heavily tweaked and highly evolved tools available (think about the wealth of tools available for C# or Java development), which gives a certain clunky feel to development. Besides, many libraries which are taken for granted in the mainstream world are missing or undocumented: try connecting to the Facebook API and you’ll notice that not only is there no Facebook SDK in OCaml, but there is also no documented way of using HTTPS. The same goes for Amazon S3 and MD5-based HMACs, by the way. And iconv functionality. And removing the X-Mailer header from e-mail you send. The list goes on.
  • It’s not object-oriented. You can use classes and mutable objects (it’s a viable implementation strategy), but doing so brings along a lot of the typical issues encountered in the mainstream programming languages, and it lacks the conciseness of functional approaches (defining a class and instantiating an object is bound to be longer than a lambda). If you’re not in the right mindset for using the language, you will miss out on a lot of the benefits.
  • It’s not popular. It is a disadvantage, just not a technical one. As a programmer I couldn’t care less about the popularity of my language because, you know, COBOL was very popular once. As a hiring manager, I am aware that using a non-popular language will make hiring developers harder. As a start-up founder, I know that this reduces my chances of selling my company because esoteric technologies are a risk to potential buyers.

There are also many tiny quirks in the language that I hope will eventually be fixed. For instance, there’s the absence of a shorthand notation for the ubiquitous (fun x -> x # member). There’s also the lack of C#-like properties, with a pure functional twist:

val x = init

method get_x    = x
method set_x x' = {< x = x' >}

And, of course, there are a lot of things going on with the option type that BatOption just isn’t up to expressing concisely. The P4 preprocessor could be applied to these situations fairly reasonably, but I would feel more comfortable if they were built into the language (and syntax highlighting tools).

In conclusion, OCaml + CouchDB provide our team with the flexibility required to build new features frequently without being afraid of subtle bugs or regressions, and to regularly refactor our code into a more amenable mess. It is a level of compiler-provided safety, surgical refactoring and bug detection that would be simply unavailable with C# and Java (and hopeless with PHP, Python or Ruby).

¹ RunOrg is my Start-Up ; we provide an online tool that helps associations, unions, organizations and communities manage their members, contacts, activities, events, knowledge and online presence.

Behind Blue Eyes

Oh, my. Look at the time. It is almost 1:00 am here. I should already be sleeping by now.

She is already asleep. Which is quite surprising given that she needs less sleep than I do, and we usually wake up at about the same time in the morning. Of course, she commutes to work while I work from home, which might make it seem a bit frustrating.

She trusts me. And that’s rational trust in my ability to cope and either succeed or cut my losses, not the watered down «Of course I trust you, honey» version.

And she supports me. All the way through. She understands that I sometimes need to work late or week-ends, that I can spend a lot of time with my associates or customers, and that I sometimes leave the house in a mess because I was too damn busy battering a feature into an appropriate shape. And she deals quite gracefully with my permanent obsession about my start-up.

And I certainly do not thank her for this enough. I do not think I ever could.

 

Wait. She is not actually sleeping. The dim glow of a laptop and soft chattering of a keyboard are coming from the bedroom. She is almost as much of a geek as I am, which might be one reason she copes with my… tendencies.

Maybe I should get some sleep.

Yes.

Time to go.

Thank you.

 

Don’t Push – A Small Review of Cache Strategies

The standard behavior of most cache systems follows these steps:

  • Attempt to read needed data from the cache
  • If data is missing, compute it and place it in the cache
  • Return the data

This is a fairly streamlined process that’s easy to add to almost any single algorithm that constructs data. The cache could be local (it’s part of the application, or even of the current function call), it could be dedicated (memcached), or the data might be persisted back to the database (such as adding the number of files in a given folder to the folder object itself instead of counting them every time).

The root of this strategy is the principle of memoization: if a function is pure (that is, calling it with the same arguments twice will return the same result twice), then you can place such a cache in front of that function so that it will only be called once for every argument.
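A minimal sketch of memoization in OCaml, using a hash table as the cache. The `memoize` helper and the `slow_square` example are illustrative names, not RunOrg code.

```ocaml
(* Wrap a pure function with a cache: read, compute if missing, return. *)
let memoize f =
  let cache = Hashtbl.create 16 in
  fun x ->
    match Hashtbl.find_opt cache x with
    | Some y -> y
    | None ->
      let y = f x in
      Hashtbl.add cache x y;
      y

(* Count the calls to show that the wrapped function only runs once. *)
let calls = ref 0

let slow_square n =
  incr calls;
  n * n

let fast_square = memoize slow_square

let () =
  ignore (fast_square 12);
  ignore (fast_square 12);
  Printf.printf "result %d, computed %d time(s)\n" (fast_square 12) !calls
```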

Memoization obviously found its way into RunOrg, because it’s literally a one-word optimization hint that trades memory for performance where it matters. In practice, in a web application like RunOrg, the only really costly computation is sending requests to the database, which is by definition not pure. Still, I can usually expect that for the duration of a single HTTP request, the database contents will remain reasonably stable, so I can create a temporary memoization cache when the request starts and drop it when the response is sent. Actually, I’m using a slight variation on standard memoization which is batch memoization: in order to improve performance, queries for objects A, B, C and D are represented as a single batch query to the database asking for the list of four objects. With standard memoization, if I asked for objects A, B and C, then for objects B, C and D, then six objects in total would be requested (because those two lists are different). By extending the memoization algorithm to know that list elements are independent, I can have the second query ask for only D, and retrieve values for B and C based on the first request.
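The batch variant can be sketched as follows. This is an assumption about how such a cache might look, not the actual RunOrg implementation: `fetch_batch` stands for one round trip to the database, and `fake_db` is a stand-in that records what it was asked for.

```ocaml
(* Memoize a batch query element by element: only keys that are not yet
   cached are sent to [fetch_batch]. *)
let batch_memoize fetch_batch =
  let cache = Hashtbl.create 16 in
  fun keys ->
    let missing =
      List.sort_uniq compare
        (List.filter (fun k -> not (Hashtbl.mem cache k)) keys)
    in
    if missing <> [] then
      List.iter (fun (k, v) -> Hashtbl.replace cache k v) (fetch_batch missing);
    (* Answer from the cache, in the order the keys were asked for. *)
    List.filter_map
      (fun k -> Option.map (fun v -> (k, v)) (Hashtbl.find_opt cache k))
      keys

(* Fake database that logs every batch it receives. *)
let requested = ref []

let fake_db keys =
  requested := !requested @ [ keys ];
  List.map (fun k -> (k, String.lowercase_ascii k)) keys

let get = batch_memoize fake_db

let () =
  ignore (get [ "A"; "B"; "C" ]);
  ignore (get [ "B"; "C"; "D" ]);
  List.iter
    (fun batch -> print_endline (String.concat "," batch))
    !requested
```

The log shows two round trips, the second one asking only for D.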

Outside of this situation, however, functions can hardly be considered pure, so special steps must be taken to keep the cache up to date. This results in three common strategies.

The first one, which is the cache expiration strategy, is to give up on data freshness and decide that data in the cache will survive for a fixed duration regardless of whether the actual data has changed or not — so, instead of declaring that data is always up to date, it declares that any change will be visible everywhere in less than X seconds. While somewhat weak, this strategy is particularly effective because it does not need any kind of knowledge of the relationship between the cache and the underlying persistent data — the only connection between the two is the compute-if-missing steps outlined above.
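A sketch of the expiration strategy, under the assumption that the caller passes the current time explicitly (which keeps the example free of any clock library). The `cached` function and the folder example are hypothetical.

```ocaml
type 'a entry = { value : 'a; expires_at : float }

(* Read-through cache where every entry lives for [ttl] seconds,
   whether or not the underlying data changed in the meantime. *)
let cached ~ttl ~now cache key compute =
  match Hashtbl.find_opt cache key with
  | Some e when now < e.expires_at -> e.value   (* possibly stale, still served *)
  | _ ->
    let value = compute () in
    Hashtbl.replace cache key { value; expires_at = now +. ttl };
    value

let cache : (string, int entry) Hashtbl.t = Hashtbl.create 16

(* The "real" data: number of files in a folder. *)
let files_in_folder = ref 3
let count () = !files_in_folder

let at_0 = cached ~ttl:60. ~now:0. cache "folder" count
let () = files_in_folder := 4
let at_30 = cached ~ttl:60. ~now:30. cache "folder" count
let at_61 = cached ~ttl:60. ~now:61. cache "folder" count

let () = Printf.printf "t=0: %d, t=30: %d, t=61: %d\n" at_0 at_30 at_61
```

The read at t=30 still returns the old count: the change only becomes visible once the entry has expired.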

Once you decide to handle the relationship seriously, two more strategies become available. The cache invalidation strategy is activated when the underlying data is changed, and invalidates all cache items that are dependent on that data. Thus, subsequent requests for those items will trigger a cache re-computation and always serve fresh data. Of course, this requires that the cache system be able to tell which items should be invalidated. This is fairly easy in a one-to-one data-to-cache mapping, but as pieces of data can be mixed and matched into various cache items, the general case requires an unusually complex architecture.

A nice design trick to keep in mind is that you don’t always need to find all that data; sometimes, you can simply «lose» it. For instance, web caching uses a combination of expiration and invalidation strategies. When a file is sent to a browser through HTTP, it sometimes carries a header explaining when it expires. This is useful when all the pages on your web site use the same CSS or JavaScript files, because then your visitors will only need to download them once and will use the cached versions from then on. To handle changes in CSS and JavaScript files, some web sites rely on cache expiration (a few minutes or hours of cache lifetime, so that any changes are detected soon enough) while others use cache invalidation. Obviously, the web site can’t go and notify every single browser that the CSS files have changed (especially since some browsers are closed or offline) so it will simply lose the data: while the CSS file named style.css?0001 will remain in cache for up to a month, the pages on the site are now asking for style.css?0002.
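In code, this trick amounts to deriving the URL from a version number that is bumped on every deployment. A tiny sketch, where `asset_url` is a hypothetical helper:

```ocaml
(* Build the URL that pages should reference for a given asset version. *)
let asset_url ~version path = Printf.sprintf "%s?%04d" path version

let () =
  print_endline (asset_url ~version:1 "style.css");
  print_endline (asset_url ~version:2 "style.css")
```

The old entry is never explicitly invalidated: nothing links to it anymore, so it simply sits in the browser cache until it expires.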

The third, the cache refresh strategy, is a variant on the former: instead of merely invalidating the cached data, this strategy computes the new data and places it directly into the cache. This is necessary when the data is frequently accessed and the computation is long: if one hundred users come asking for the data while it’s invalidated, then all of them will compute the data as part of the “if missing, compute” step of the caching process, which will probably bring the server to its knees — what people call a stampede — so the only safe thing to do here is to keep data in the cache at all times and replace it with a more recent value whenever necessary.

Another nice design trick is to use a flexible expiration date to turn a cache invalidation strategy into a cache refresh strategy: instead of invalidating the item by removing it from the cache, merely set its expiration date somewhere in the past. Then, to avoid stampedes, whenever a user detects that the cache is expired, they set the expiration date to sometime in the future and start computing the data. So, the first reader will notice a delay (as their request will reconstruct the cache), while subsequent readers will instantly read the old cached version without triggering a stampede, until the first request ends and the cache contains the new version of the item. To choose between the two: if the event that triggers the refresh provides you with data that improves the time required to update the cache, then update it, otherwise merely invalidate it and rely on your stampede protection to do the job.
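Here is a single-process sketch of that flexible expiration date. The names (`invalidate`, `read`, `grace`) are assumptions, time is passed in explicitly, and the "other reader" is simulated by a nested call made while the first reader is still computing; a real deployment would need the claim step to be atomic across processes.

```ocaml
type 'a entry = { mutable value : 'a; mutable expires_at : float }

(* Invalidation keeps the old value but marks it as expired. *)
let invalidate cache key =
  match Hashtbl.find_opt cache key with
  | Some e -> e.expires_at <- neg_infinity
  | None -> ()

(* [ttl] is the normal lifetime; [grace] is how long other readers keep
   getting the old value while the first reader recomputes. *)
let read ~now ~ttl ~grace cache key compute =
  match Hashtbl.find_opt cache key with
  | Some e when now < e.expires_at -> e.value
  | Some e ->
    e.expires_at <- now +. grace;   (* claim the refresh *)
    let v = compute () in
    e.value <- v;
    e.expires_at <- now +. ttl;
    v
  | None ->
    let v = compute () in
    Hashtbl.replace cache key { value = v; expires_at = now +. ttl };
    v

let cache : (string, int entry) Hashtbl.t = Hashtbl.create 16
let computations = ref 0
let compute () = incr computations; !computations

let first = read ~now:0. ~ttl:60. ~grace:5. cache "k" compute
let () = invalidate cache "k"

(* A second reader shows up in the middle of the recomputation. *)
let seen_by_other = ref 0
let slow_compute () =
  seen_by_other := read ~now:1. ~ttl:60. ~grace:5. cache "k" compute;
  compute ()

let fresh = read ~now:1. ~ttl:60. ~grace:5. cache "k" slow_compute

let () =
  Printf.printf "first=%d other=%d fresh=%d computations=%d\n"
    first !seen_by_other fresh !computations
```

The second reader is served the old value and does not trigger a computation of its own.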

There’s one last situation that makes caching complicated, which I’ve recently had to handle with the RunOrg application: indexing. Suppose you have a huge amount of data that you need to wade through, nicely split into separate objects that each have their own responsibility (my profile, my membership information, my participation in event X, my answers to poll Y…) but you sometimes need to virtually aggregate all that data and traverse it to retrieve only certain parts of it: give me the name, premium-or-not-premium membership status and answer to “T-Shirt Size” in poll Y, sorted by the registration date to event X. Yes, that’s one of the many queries that RunOrg lets you do (and you can even print out that data to serve as a list of participants). Now, trust me, there’s no sane way to dynamically run queries of the sort on a clean normalized database and still get reasonable performance. So, you need to create an alternate, denormalized representation of all that data and keep it in cache to avoid re-computing it for every request.

The problem with such a cache is that you cannot afford to re-construct a thousand lines of cached data because one user changed their T-Shirt Size answer, so there can be no high-level validity check. Basically, if you can’t trust the cache to be up to date when you run your read query, you lose. The traditional “try to read, if invalid update” approach to caching goes straight out the window. You need a solid cache refresh implementation that pushes the most recent data into the cache as soon as it becomes available.

RunOrg uses a variant — the cache pull strategy. This is merely a small semantic shift, but it’s quite helpful: in the original cache refresh situation, the data model needs to be aware of the cache, because it must actively send data to the cache whenever a change happens. With the RunOrg variant, the data model merely publishes a “data was updated” event that the cache module may listen to and react by refreshing its contents. So, the knowledge of how to extract data from the model and place it in the cache now belongs to the cache module instead of being spread over both the data model and the cache module. This not only makes the code cleaner — the data model becomes cache-independent and thus easier to read through, with cache modules being tacked onto it using the event system — but it also lets the cache react to events from different data models: an item might be updated when the profile changes, or the membership information changes, or the event X participation changes, or the answer to poll Y changes… and will have to read data from all four to compute the new value anyway.
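The shape of this arrangement can be sketched in a few modules. Everything here is invented for illustration (module names, the synchronous `Event` system, the two data models); in particular, the real thing would process events asynchronously rather than inside the call that triggered them.

```ocaml
(* A tiny synchronous event system, standing in for the real one. *)
module Event = struct
  type 'a t = { mutable listeners : ('a -> unit) list }
  let create () = { listeners = [] }
  let listen t f = t.listeners <- f :: t.listeners
  let publish t x = List.iter (fun f -> f x) t.listeners
end

(* Data models: they publish "data was updated" and know nothing of caches. *)
module Profile = struct
  let names : (int, string) Hashtbl.t = Hashtbl.create 16
  let updated : int Event.t = Event.create ()
  let set_name id name =
    Hashtbl.replace names id name;
    Event.publish updated id
  let name id = Hashtbl.find_opt names id
end

module Poll = struct
  let answers : (int, string) Hashtbl.t = Hashtbl.create 16
  let updated : int Event.t = Event.create ()
  let set_answer id answer =
    Hashtbl.replace answers id answer;
    Event.publish updated id
  let answer id = Hashtbl.find_opt answers id
end

(* Cache module: listens to both models and pulls what it needs from them. *)
module Participants = struct
  let rows : (int, string * string) Hashtbl.t = Hashtbl.create 16
  let refresh id =
    let name = Option.value (Profile.name id) ~default:"?" in
    let size = Option.value (Poll.answer id) ~default:"?" in
    Hashtbl.replace rows id (name, size)
  let () = Event.listen Profile.updated refresh
  let () = Event.listen Poll.updated refresh
  let row id = Hashtbl.find_opt rows id
end

let () =
  Profile.set_name 1 "Alice";
  Poll.set_answer 1 "M";
  match Participants.row 1 with
  | Some (name, size) -> Printf.printf "%s wears %s\n" name size
  | None -> print_endline "not cached"
```

Neither `Profile` nor `Poll` mentions `Participants`: all the knowledge about how to assemble a cached row lives in the cache module.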

Obviously, the cache pull strategy is a more complex architecture than the previous one:

  • You need an event system — the entire contraption hinges on the fact that a cache module can listen to the changes that happen in a data module.
  • Your cache module must track the dependencies of each item, in order to update that item when it receives a change event for one of its dependencies
  • You need asynchronous processing, as pulling values for dozens of items simply cannot be done as part of the standard HTTP response cycle
  • You need to follow clean multi-process patterns to handle simultaneous updates of some items

Still, given the performance we achieve with this approach, and the clean code that results from the underlying events-and-async structure, the results are certainly worth the effort.


Work ≠ Progress

I did a lot of work today. Mostly, I tracked down and eliminated a nasty little problem related to our @runorg.com email addresses and our DNS records.

DNS is the directory system which determines which particular computer handles the requests to a given domain name. So, if you’re looking for holy-grail.runorg.com, a DNS entry mentions that it points to the machine known on the internet as 188.165.231.88, which happens to be our main production server.

The MX records are used when you’re looking for the mailboxes for that domain. This is because usually, you don’t want your web server to handle your e-mail: it’s handled elsewhere, such as another company server, or maybe Gmail. So, you can specify a main DNS entry for your domain and then use the MX record to point to another server specifically for e-mail.

Finally, the CNAME records represent the canonical name. We don’t want our main web site to be available both on http://runorg.com and http://www.runorg.com, because it’s confusing and bad for the search engine ranking. So, I added a CNAME record stating that runorg.com is an alias for www.runorg.com.

What I did not take into account (or even know) was that CNAME records take precedence over MX records: a name that carries a CNAME record is not supposed to carry any other kind of record. So, when someone sent an e-mail to foobar@runorg.com, it would undergo canonicalization and point at foobar@www.runorg.com instead. Since there was no MX record for the latter, the e-mail would then disappear into the void. Our tools and newsletters apparently ignored the CNAME when sending e-mail, so we received those correctly.

So, my entire day was spent hunting down an obscure, unpredictable and not-quite-documented error in my DNS records. It was necessary work and it certainly kept me busy, but it wasn’t progress.

Our team has a looming deadline: the delivery of our first version of the software. It’s when we move from an “implement all the stuff we need before we can deliver” strategy to a “improve or add features to the existing product” strategy (which is an entirely different mechanism). Progress is what brings us closer to that transition — while dealing with the DNS issue was necessary, it did not move me an inch closer to delivering version 1.0.

What is the single largest difference between working as an employee for another firm and working on your own Start-Up? Before I started, I would have guessed it would be the work hours (I now work week-ends quite often), the commute (I work at home because we’re too small to need offices), the freedom (I’m literally my own boss) or the lack of money (no comment). Now it’s pretty obvious that the single greatest difference is that I now emphasize progress more than I emphasize work.

In my previous jobs, there was a fixed set of objectives which had to be accomplished, so I would just come to work every day and chip away at the monolith of work to be done, and since it all had to be done anyway, I could do it in any order I wished. Since I’ve started working on my Start-Up, I find myself increasingly questioning the very objectives I’m trying to accomplish: is this going to let me ship sooner, or not? The freedom of choosing (and discarding) my objectives myself comes with the responsibility of making the right choices.

That’s a question I never asked myself before.

When you think about it, there are many things that are work but not progress. Some are done because it feels easier to do them sooner rather than later. Others are done because, let’s face it, sometimes you have low morale and a neat exciting feature comes up that you’d rather implement even though it’s purely gratuitous (I added a CSV export feature recently that is not necessary in any way, and I know my definition of exciting is weird but bear with me). Others stem from the necessary shame of delivering a half-baked product, but bear in mind that:

If you are not embarrassed by the first version of your product, you’ve launched too late.
- Reid Hoffman, LinkedIn founder

Delivering a huge product with a small under-funded team is ultimately a find-the-shortest-path endeavour. Choose your next objective based on that.


