Monthly Archive for August, 2011

Romania – Days 10 & 11

Where we slept in Sibiu.

As you might remember, we spent two nights at the Cocosul Rosu in Sibiu. On the whole, the stay was pretty pleasant, although the lack of a host-provided breakfast had us walk all the way to the town squares in search of a restaurant that would be open early enough to accommodate us (our final choice was Timi’s, in the southwest corner of the Piata Mare). Also, do not under any circumstances open the window past nightfall, or you will let in an entire menagerie of bloodsucking insects.

On day 10, we started with a quick visit to the Brukenthal museum of art. The exhibits were decent, bordering on remarkable, but the maze-like layout of the museum (as with, it seems, all museums we’ve visited in Romania so far) is extremely frustrating. Indeed, navigating involves a fair bit of backtracking, and there are often several forks both within exhibitions and between exhibitions where you will require the assistance of a museum employee, because there are no useful signs or written hints available.

Brukenthal Museum of Art

Our next stop was Brașov. As an interesting bit of trivia, the old name of the city was Corona, and so the modern beer brand Corona lavishly sponsors the city pubs, bars and restaurant terraces.

Near the city is the pass of Bran, overlooked by the castle of Bran, the kind of castle Bram Stoker would have chosen for Count Dracula to live in. The castle was renovated, its walls were painted white, and flowers were planted everywhere, so whatever eerie gothic aspect Stoker could have seen there is probably gone, but the peddlers of vampire-themed trinkets right outside the castle doors are there to stay.

Entrance to Bran castle.

A room inside Bran castle.

Secret door in the Bran castle library.

Empty corridor inside Bran castle.

Inside the Bran castle courtyard.

Bran castle is by all definitions a tourist trap. The roads that lead to the city are lined for miles with housing and restaurant advertising, and the city itself is drowned in far more traffic than its single road can allow. Everything here reaches prices beyond anything else in the country. The ice cream parlour in front of the castle serves ice cream that would be costly in a posh Parisian restaurant, though I have no reason to assume that it is actually worth it.

I am lucky to be two meters tall and be able to see above the heads of the hundreds of tourists navigating the (again, maze-like and frustrating) corridors of the castle, although the low ceiling in many of the rooms could prove quite unforgiving if you ever forget it’s there.

Our bed & breakfast in Brașov.

On day 11, we visited Brașov itself. Before I post pictures, here’s some information about the layout of the city: most of the interesting pieces are in a valley in the Carpathians, with the south edge of the valley holding a cable car station and the name “BRASOV” written on the top, Hollywood-style, and the north edge of the valley holding two fortified towers (the white tower and the black tower). Inside the valley is the Council Square (Piata Sfatului) and the Black Church.

The cable car and southern ridge, seen from the Council Square.

Our first trip of the day involved the cable car. The view was so gorgeous, we forgot to take good pictures of it.

Looking down through the letter B.

After we came back down, we visited the white and black towers.

White tower seen from the southern ridge.

Black tower.

Both towers contain museums. The Black tower museum is open every day 9-6, and we arrived at around 1 pm, so the museum was obviously closed. I swear, all the museums we try to see are closed for unspecified or ridiculous reasons when we get there.

The White tower museum was open, and therefore much more amusing.

First, let it be known that the top floor of the White tower is the best sauna in the entire city, and that is probably unintentional. The top floor of the tower has a dark wooden floor and no ceiling (it burned down in the last city-wide fire) and so it is capped with an array of glass panels that turn it into a glasshouse of sorts. You can feel the heat wrapping from your scalp down to your toes as you climb the last flight of stairs, and five minutes later you can smell yourself roasting.

I Wish.

The public employee in the White tower is responsible for selling tickets, but also has a small shop of various souvenirs, trinkets, postcards, audio CDs, pamphlets and other things that she will try to sell at any occasion, with unerring insistence. She offered me, in no more than two minutes’ time, a presentation pamphlet of the White tower in French (which was actually Spanish) for 2 lei, an unfolding image book of the tower for 7 lei, two postcards for 3 lei, an audio CD about the region for 15 lei, and a small keychain trinket for 5 lei that I did not identify.

I wish I could sell my start-up product like that.

Mountain river below the White tower.

The Black Church.

The city is also home to the Black Church, so named because it was covered in soot after the aforementioned city-wide fire. The church contains several organs collected from surrounding churches in addition to its own beautiful 4000-pipe organ. We went to a concert in the evening, involving some quite elaborate Bach and Liszt pieces.

Romania – Days 8 & 9

On day 8, we drove from Târgu Jiu to Sibiu. Alix was driving for the first half, and I caught this on the side of the road:

A Stork

We stopped for what we expected to be a quick visit at the Hurez monastery near Horezu. There were many people, as August 15th is a religious celebration. While we were visiting, three ladies came up to me, handed me a camera and asked me to take a picture of them. I obliged. We met again near the cloister, where the sisters were giving out some sweets and wine, and the three of them invited us to have lunch with them in the monastery. We agreed.

Lunch was served in the two communal dining rooms of the monastery. We ate in the smaller of the two along with maybe a dozen other people, who were all quite interested in hearing about my Romanian roots and a comparison with French culture. The food was simple: vegetable ciorba, exceptionally tender veal with mashed potatoes, and watermelon.

We were then invited to visit the private areas of the monastery — as one of the three women knew the mother superior quite well — and we enjoyed some coffee, sweets and some slices of delicious telemea cow cheese produced at the monastery, while enjoying this view:

Looking south

We were also allowed to visit the bell tower:

Looking east

And we were also shown the private chapel where the Brâncoveanu family used to pray, when they lived in the monastery, as well as one of the first printed bibles written in Romanian.

We took our leave two and a half hours later than we had expected to, but with a better lunch than we could have hoped to find in Râmnicu Vîlcea. Still, this day brought us more than just free food: knowing that people are willingly and selflessly sharing with others, expecting nothing more than a few words in return; and that the best experience is not one you can buy.

Inside the Hurez monastery

The rest of the day was spent driving up the valley of the Olt river and reaching Sibiu.

Had this on my right for around 50 km

The Bridge of Lies, in Sibiu

We are spending two nights at the Cocosul Rosu bed and breakfast (I kid you not, this means the Red Cock Rooster) before going to Brasov.

Sibiu is a Romanian city with a strong German-speaking community, beautiful medieval architecture and a much better shape than other cities we have visited so far. If I had to pick, I’d rather have Sibiu than Bucharest.

On day nine, we visited the city. The Brukenthal museum of arts was closed, but should be open again tomorrow. Instead, we visited the Brukenthal museum of history (which had a quite enjoyable exhibition about the city guilds), then toured the city under a blistering heat and unforgiving sun.

The other name of Sibiu is Hermannstadt

East view from the clock tower

West view from the clock tower

Paper model of the (old parts of the) city

A street behind the fortified wall

A small medieval street

The orthodox church

Romania – Days 6 & 7

First, the eye candy. This is how day seven ended:

Coloana Infinitului

Coloana Infinitului

On day 6, we started from Drobeta Turnu Severin and took a rental car up to Baile Herculane, nestled in a valley in the mountains. There, we had a late lunch at the Hotel Ferdinand restaurant, which is beautifully laid out as a series of terraces on a fairly steep slope.

Hercules

Hercules

The hot mountain springs at Baile Herculane were already known to the Romans (hence the name), and are still active. We tried them out.

Left: cold. Right: hot.

Then, we visited the nearby Iron Gates, a beautiful gorge on the Danube river that separates Romania from Serbia. While driving along the Danube, I actually received a welcome SMS from the Serbian branch of my mobile provider…

Serbian side of the gorge.

The penultimate narrowing on the Danube.

Sadly, the Drobeta Turnu Severin museum, which includes the last standing parts of Emperor Trajan’s bridge (the one the Romans used to bring over enough troops to seize control of Dacia), was closed.

Closed for renovation.

Day seven started in Târgu Jiu, northeast of Drobeta Turnu Severin. We are staying at the Hotel Restaurant Europa, which happens to have a great restaurant with a great lemonade.

Demand to see life's manager!

We drove west from the city to explore Constantin Brâncuşi’s home town of Hobiţa.

Museum Sign

Wooden church erected by Constantin's grandfather.

We pushed further west to Baia de Aramă, visiting the Tismana monastery along the way, where we were caught by rain in a forested valley.

Sun + Rain + Forest

Back in Târgu Jiu, we visited the sculptural ensemble by Constantin Brâncuşi dedicated to the soldiers who died in the Great War. It is composed of the Table of Silence (twelve stone chairs around a large round stone table) followed closely by the Gate of the Kiss (a large stone gate), then the Path of Heroes (an empty path, about 1300 m long) and finally the Column of the Infinite.

The artist designed the Path of Heroes so that a visitor sitting at the Table of Silence and looking through the Gate of the Kiss would see the Column of the Infinite in the distance. This was planned in 1935-1938. However, in 1937, king Carol II authorized the building of an orthodox church in the middle of the Path of Heroes, completely breaking the perspective.

The picture at the top of the article is one of the Column of the Infinite.

My Hat of the Infinite

Romania – Days 4 & 5

Today is our last day in Bucharest this week. The weather was sunny again, so we visited outdoor locations.

Hipster graffiti on the National History Museum.

Vlad the Impaler

Grave of Mihail Eminescu

Grave inscription: 'You. I was what you are. You will be what I am.'

Carol I gardens

This annoys me to no end.

Quite scary. Does this really help sell kids' clothing?

We dined at the Caru cu Bere

The Caru cu Bere is a beautiful location. It’s also extremely loud and filled with cigarette smoke, and the food is not significantly better than other restaurants we sampled. As its name implies, it probably serves good beer, but I did not try.

Night train to Timisoara

The train dropped us off at the Timisoara station at 7:25 this morning. The trip was quite pleasant, albeit a bit cold.

Timisoara, Victory Square

Our older friend.

Timisoara, Union Square

Traffic lights have timers here. This is pure unadulterated genius.

Blue Dragonfly

Green Dragonfly

We actually witnessed a blue dragonfly saving another blue dragonfly from a green one.

The lawn is absolutely great here.

Staying at the Hotel Central

Romania – Day 3

We cannot seem to find tea — the local “Ceai” always seems to be some kind of fruit- or leaf-based infusion without actual tea, even though it is labeled as “Thé” or “Tea”. I have resorted to Pepsi for my morning caffeine.

We visited the Liceul Mihai Viteazul.

Mihai Viteazul, again.

We also visited the Palace of the Parliament - unofficially 'Casa Poporului'.

National Statistics Institute.

On the wall of the National Statistics Institute.

Dâmbovița, flowing through Bucharest.

National History Museum, beautiful pieces but dreadful setup.

We dined at the Monte Carlo in the Cişmigiu gardens. The food is absolutely excellent, and waiters are orders of magnitude more polite than the ones we have in Paris.

Surprisingly, the crippling heat of the last few weeks turned to freezing downpour tonight, but we managed to stay inside most of the time. In front of the National History Museum, a team of workers that were restoring the pavement did not work yesterday because of the heat, and did not work this afternoon because of the rain. I wonder whether the pavement will be done by the time we leave.

Romania – Day 2



We ate here yesterday night.



View of modern buildings from Victory Square.



Association of Romanian writers.



The boring George Enescu museum.



University of Industrial Chemistry.



Pepsi is the leading soda brand in Romania.



Coca-Cola is fighting for brand recognition here.



The crest of Bucharest.



An apartment from the communist era.



My old elementary school.



I used to play around these parts a lot.



Titan Metro Station



James Bond's conspicuous parking spot.



A street in eastern Bucharest.



Beautiful Park in midtown Bucharest.



Artificial waterfall that fascinated me as a child.



A tree-like cement bridge.



Bust of Romanian Poet Mihail Eminescu.



It is getting late, still a lot to see.



Meat management, apparently.

Romania – Day 1

Châtelet les Halles

Roissy - Aéroport Charles de Gaulle

Otopeni - Aeroport Henri Coanda

Using OCaml from CouchDB views

What follows is directly taken from my latest GitHub project, which provides an adapter for transforming OCaml applications into CouchDB view servers. The programmer writes an OCaml application that exports one or more map and reduce functions using the API found in module CouchAdapter, and creates a CouchDB design document that specifies the application path and the name of the exported functions. The adapter server then receives evaluation requests from CouchDB and passes them to the application, and returns the result back to CouchDB.

The objective of this project is not to support writing OCaml code directly into views! The OCaml code should follow the standard build procedure, the only exception being that the CouchAdapter API is used to export that code and make it available to the adapter server.

Requirements and setup

This adapter uses json-wheel for representing JSON values, and the build process requires OCamlBuild. There are no other direct dependencies. Building the adapter server runServer is fairly straightforward: make byte or make native generates runServer.byte or runServer.native respectively. Move the resulting application to an appropriate location on your system and allow CouchDB to execute it. My suggestion is:

cp runServer.native /usr/bin/couch-ml-adapter
chmod a+x /usr/bin/couch-ml-adapter

I will be assuming this convention for the rest of this manual. Once the server is built and installed, you need to configure CouchDB to actually use that adapter to execute OCaml views. Edit the local.ini configuration file of your CouchDB server (usually found in /etc/couchdb/local.ini) and add the following lines:

[query_servers]
ocaml=/usr/bin/couch-ml-adapter

Depending on your configuration, there might already be a [query_servers] section. If that is the case, add the second line to that section. If you have trouble configuring your query servers, read the CouchDB documentation.

Errors that happen while executing the adapter will appear in the CouchDB logs (usually found in /var/log/couchdb/couch.log).

Architecture

Query Servers

The CouchDB server usually evaluates map and reduce functions only when a design document containing those functions is queried by a client, by following this process:

  • If the query server configured in local.ini is not already running, start it.
  • Send various instructions on the query server’s STDIN, such as “apply the map function F to document D”
  • Read the results on the query server’s STDOUT.
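The exchange above can be sketched as a small loop. This is only an illustration, not the adapter's actual code: it assumes that each request arrives as one line on STDIN and that each answer is written as one line on STDOUT, and the names serve, handle and acknowledge are made up for this example.

```ocaml
(* A minimal sketch of a line-oriented query server loop.
   [handle] is a hypothetical function that turns one request line
   into one response line; a real server would parse JSON here. *)
let serve ~handle ic oc =
  let rec loop () =
    match input_line ic with
    | line ->
      output_string oc (handle line) ;
      output_char oc '\n' ;
      (* The database waits for the answer, so flush after every line. *)
      flush oc ;
      loop ()
    | exception End_of_file -> ()   (* the database closed our STDIN *)
  in
  loop ()

(* Placeholder handler (an assumption): acknowledges every request. *)
let acknowledge _request = "true"

(* A real query server would end with:
   let () = serve ~handle:acknowledge stdin stdout *)
```

Keeping the loop separate from the handler means the same loop can be exercised against files or pipes before it is plugged into CouchDB.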

The Adapter Server

The adapter server provided by this project is one such query server. When it must apply a function to a document, it does the following:

  • Determine which application provides the function.
  • If the application is not already running, start it.
  • Send the request to the application’s STDIN, read the answer on its STDOUT.
  • If the application responds with results, send these back to CouchDB.

In short, the overall architecture looks like this:

+---------+         +------------------------+
|         | <-----> |  Haskell Query Server  |
|         |         +------------------------+
|         |
|         |         +------------------------+
| CouchDB | <-----> | Brainfuck Query Server |
|         |         +------------------------+
|         |
|         |         +------------------------+
|         | <-----> |                        | <-----> [ Application /home/nicollet/test ]
+---------+         |  OCaml Adapter Server  |
                    |                        | <-----> [ Application /usr/bin/foo ]
                    +------------------------+

The programmer should therefore write an application which reads the adapter requests on STDIN, runs the requested functions on the provided documents, and sends the results back on STDOUT. All the boilerplate involved is handled by the CouchAdapter module, so that the actual development process you will be following is:

  • Include any modules you might need to use in your view.
  • Define the map or reduce function as an OCaml function.
  • Register that function as being exported with CouchAdapter.export_map and CouchAdapter.export_reduce.
  • Call CouchAdapter.export()

Importing From CouchDB

CouchDB references map and reduce functions in design documents, using the following syntax:

{ "_id" : "_design/..."  ,
  "language" : "...",
  "views" : {
    "foobar" : { "map" : ... },
    "quxbaz" : { "map" : ... , "reduce" : ... }
  }
}

In order to use the OCaml adapter, one must first set the language property to "ocaml". Then, to reference the function "extract_foo" defined in application /usr/bin/foo, one would write:

"views" : {
  "foobar" : { "map" : ["/usr/bin/foo", 1, "extract_foo"] }
}

The same syntax applies for reduce functions as well. The three components of the definition are 1- the absolute path to the application that exports the function (this is how the adapter server knows what application to run), 2- a version number discussed in the next section and 3- the name under which the function is exported from that application.

Function versions

For performance reasons, once an application or query server has been started, it is never shut down. This only causes problems when there’s a new version of the code that needs to be deployed. The adapter server provides a versioning system which automatically detects that a function is out of date and restarts the application that exports it.

A CouchDB design document requests a function that is at least a certain version. For instance, ["/usr/bin/foo", 42, "extract_foo"] indicates that the adapter server should find version 42 or greater of the function "extract_foo" exported by application /usr/bin/foo. If that application is currently running and the function is either missing or older than version 42, then the application is shut down and started anew in a completely transparent fashion.

Note that if rebooting the application still fails to provide an appropriate version of the function, the adapter server will report an error, which CouchDB will propagate to the client. This makes all the views inside the design document unavailable until an appropriate version of the application is deployed.

Failing to manage function versions both in CouchDB and in the application can lead to data inconsistencies, as different documents are processed by different versions of the same function. Only a global version change which prompts a full refresh of the view and reloads the application can ensure data consistency in the face of code changes.
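The restart rule described in this section can be written down as a pure function. The sketch below is not taken from the adapter's source: the exports type, the decision type and the functions decide and check_after_restart are hypothetical names, and a running application is modelled simply as a list of (function name, version) pairs.

```ocaml
(* Hypothetical model of what a running application exports. *)
type exports = (string * int) list

type decision =
  | Use_running   (* the running application is recent enough *)
  | Restart       (* function missing or too old: shut down and start anew *)

(* Compare the requested version with what the application exports. *)
let decide (running : exports) ~name ~required =
  match List.assoc_opt name running with
  | Some version when version >= required -> Use_running
  | Some _ | None -> Restart

(* After a restart, the same check decides whether to report an error. *)
let check_after_restart (fresh : exports) ~name ~required =
  match decide fresh ~name ~required with
  | Use_running -> Ok ()
  | Restart ->
    Error (Printf.sprintf "%s: version %d or greater not found" name required)
```

With these definitions, a request for version 42 against an application that only exports version 41 yields Restart, and an error is reported only if the freshly started application still falls short.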

Creating a map function

A map function must follow the signature json -> (json * json) list: the argument is the entire document being processed, and the output is a list of key, value pairs being output by the map function.

For example, suppose you already have a User module in your application, which is used among other things for reading and writing users to the CouchDB database:

type t = {
  active  : bool ;
  name    : string ;
  email   : string ;
  picture : string
}

let of_json = (* ... *)
let to_json = (* ... *)

Then you can rely on that module to define a map function with the above signature, and export it using the CouchAdapter module:

open Json_type

let user_by_email json =
  try let user = User.of_json json in
      [ String user.User.email , Null ]
  with _ -> []

let () =             

  CouchAdapter.export_map
    ~name:"user_by_email"
    ~version:1
    ~body:user_by_email ;

  CouchAdapter.export ()

Should you decide to update the view code, make sure that you also increment the version number:

open Json_type

let user_by_email json =
  try let user = User.of_json json in
      if user.User.active then [ String user.User.email , Null ]
      else []
  with _ -> []

let () =             

  CouchAdapter.export_map
    ~name:"user_by_email"
    ~version:2
    ~body:user_by_email ;

  CouchAdapter.export ()

Creating a reduce function

There is no distinction made between reduce and rereduce. While this causes a slight loss in functionality it also makes writing reduce functions less arduous given the OCaml type system. The signature of reduce functions is simply json list -> json.

For example, let’s assume that an Article module is already defined in your main application:

type t = {
  title : string ;
  html  : string ;
  tags  : string list
}

let of_json = (* ... *)
let to_json = (* ... *)

We now define a map function and a reduce function that counts how many articles are published for every tag.

open Json_type

(* Map: emit one (tag, 1) pair for every tag of the article. *)
let by_tag_map json =
  try let article = Article.of_json json in
      List.map (fun tag -> String tag , Int 1) article.Article.tags
  with _ -> []

(* Reduce: sum the counts, ignoring any value that is not an integer. *)
let by_tag_reduce values =
  Int (List.fold_left (fun acc -> function Int i -> acc + i | _ -> acc) 0 values)

let () =
  CouchAdapter.export_map "by_tag-map" 1 by_tag_map ;
  CouchAdapter.export_reduce "by_tag-reduce" 1 by_tag_reduce ;
  CouchAdapter.export ()

And the CouchDB design document is as follows:

{ "_id" : "_design/article",
  "language" : "ocaml",
  "views" : {
    "by_tag" : { "map"    : ["/path/to/app", 1, "by_tag-map"    ],
                 "reduce" : ["/path/to/app", 1, "by_tag-reduce" ] }
  }
}
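Since reduce and rereduce are not distinguished, a reduce function presumably has to give the same answer whether it receives the mapped values directly or the results of earlier reduce calls. The following self-contained check illustrates that property for a sum. It uses a tiny stand-in json type instead of json-wheel so that it runs without dependencies, and sum_reduce is a made-up name.

```ocaml
(* Stand-in for the JSON type, reduced to what this sketch needs. *)
type json = Int of int | Null

(* Same shape as a counting reduce: json list -> json. *)
let sum_reduce values =
  Int (List.fold_left (fun acc -> function Int i -> acc + i | _ -> acc) 0 values)

let () =
  let first_batch  = [ Int 1 ; Int 1 ; Int 1 ] in
  let second_batch = [ Int 1 ; Int 1 ] in
  (* Reducing everything at once... *)
  let direct = sum_reduce (first_batch @ second_batch) in
  (* ...must agree with reducing the partial results. *)
  let two_step =
    sum_reduce [ sum_reduce first_batch ; sum_reduce second_batch ] in
  assert (direct = two_step)
```

A function such as "count the elements of the list" would fail this check, because counting two partial results gives 2 rather than the total.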

Article image © Miriam Rossignoli — Flickr

Coping With Inconsistent Databases

In my earlier article about the benefits of NoSQL, I discussed eventually consistent databases. These are databases where « write A ; read A » can return an outdated or missing value, but « write A ; wait ; read A » will always return the correct value if you wait long enough. Dealing with eventual consistency can lead to bugs, because there are many pitfalls caused by race conditions. It’s impossible for anyone to avoid race conditions by reading the code and thinking very hard about it. Instead, the code must be written using patterns and mental tools that by their very design prevent race conditions from happening. My point was that most programmers that only had experience with the absolute-consistency SQL world do not have the mental tools necessary to avoid those pitfalls. Not because they are incapable of it, but because they never had the training or the experience to acquire these mental tools.

Today, an anonymous coward shared a few thoughts on the topic :

They do not have the mental tools required to work with eventual consistency?
The only mental tool I’ve seen is disregard for the issue.
Waiting eagerly on another post discussing those “mental tools”.

He/she is right, what are those mental tools anyway ?

First, let me state the obvious again : eventually consistent databases almost never remain inconsistent long enough for users to notice and, even if they do notice, they usually don’t care — through the prevalence of cache-powered websites, our users are used to seeing stale data every so often and know to hit the refresh button to deal with it. Aside from a few critical edge cases like online payment processing, the problem with eventual consistency is not the user.

The problem is that software makes decisions based on available data and, if the available data is wrong, then the outcome is wrong. This decision-making process will turn a one-nanosecond inconsistency into a permanent error if you are unlucky, and the entire point of this article is how to prevent this from happening. Need an example?

Event-Based vs State-Based

Let’s say I’m writing a badge module similar to the one used on Stack Overflow. Here are the specifications:

The user can publish articles. Their 10th article will bear a bronze badge, their 50th will bear a silver badge, and their 100th article will bear a golden badge.

One way I can write this module is to intercept the «publish article» event and add my own bit of logic to it: if there are nine other articles, award the bronze badge. This is an event-based approach, because it performs some changes when an event happens. This way of doing things is almost universally followed in the SQL world, but it does not work in NoSQL environments that lack absolute consistency.

What’s the problem? One user, Bob, tries to cheat the system by publishing nine articles, then publishing articles X and Y in quick succession, hoping to get bronze badges for both. The behavior we want is that X should have the bronze badge and Y should not.

  • If absolute consistency is guaranteed, then Y will be published when the database already knows that X has been published: it will be the 11th article and thus will not receive the badge.
  • If only eventual consistency is guaranteed, then Y might be published before the existence of X has been acknowledged : both articles would receive badges.

The alternative is to use a state-based architecture where «On EVENT apply CHANGE» is replaced by «If STATE-A then STATE-B» : instead of «On publishing the tenth article, award badge» the system uses «If this is the tenth article, then it has the bronze badge.» Where an event-based solution would apply the CHANGE and move on, the state-based solution instead examines STATE-A whenever someone asks for STATE-B and applies the rule every single time.

Going back to Bob’s problem: if you ask a few nanoseconds after both articles are published «Does article Y have the bronze badge?» then the answer may still be «Yes» because eventual consistency takes a short while to set in. But if you ask the same question a few seconds later, then article Y will be correctly known as being the 11th article and the answer will be «No».
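Here is a minimal sketch of the state-based rule for Bob's articles. It is an illustration rather than production code: the database is modelled as the list of articles currently visible, in publication order, and the names badge_of_rank, rank and badge are invented for the example.

```ocaml
type badge = Bronze | Silver | Gold

(* The rule itself: a pure function of the article's rank (1 = first). *)
let badge_of_rank = function
  | 10  -> Some Bronze
  | 50  -> Some Silver
  | 100 -> Some Gold
  | _   -> None

(* Rank of an article among the articles currently visible. *)
let rank visible article =
  let rec find i = function
    | [] -> None
    | x :: _ when x = article -> Some i
    | _ :: rest -> find (i + 1) rest
  in
  find 1 visible

(* Re-evaluated on every question, never stored as a permanent fact. *)
let badge visible article =
  match rank visible article with
  | Some r -> badge_of_rank r
  | None -> None

let nine = List.init 9 (fun i -> Printf.sprintf "article-%d" (i + 1))

(* Right after publication, X may not be visible yet. *)
let stale_view = nine @ [ "Y" ]

(* A few seconds later, both articles are visible. *)
let settled_view = nine @ [ "X" ; "Y" ]
```

Evaluating badge stale_view "Y" gives Some Bronze, while badge settled_view "Y" gives None and badge settled_view "X" gives Some Bronze: the wrong answer disappears as soon as the underlying data settles.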

An application that is entirely based on state-based rules can work with an eventually consistent database without ever having permanent errors: by definition, any errors would only last as long as the underlying inconsistencies remain. In practice, from my experience with CouchDB, all temporary errors are gone after a couple of seconds in the very worst case, and they are usually gone before that.

But state-based rules do mean that whenever the application needs to know STATE-B, it must read STATE-A and apply the rule again. Does this mean that I will have to count the articles (a potentially costly operation) whenever I need to know if a given article has the bronze badge? This is pure insanity!

State-Based Caches

The NoSQL answer is «Cache it!»

In fact, I will go even further: a NoSQL-friendly architecture eliminates several downsides of caching while keeping all the performance benefits, in ways that no event-based SQL solution can.

  • Staleness of cached data is not an issue: the software is already designed to deal with eventual consistency and a cache is just another kind of eventually consistent data source. Unlike traditional software that relies on absolute consistency, NoSQL-friendly applications can make business decisions based on cached data without any risk.
  • Dependencies between STATE-A and STATE-B are usually first-class citizens of the application source code, so when a state change happens it’s easy to follow the threads and invalidate all the dependencies. The application can rely on invalidation instead of timeouts to keep the cache up-to-date.
  • Most NoSQL solutions already provide some level of caching. For instance, counting the number of published articles in CouchDB is a constant-time cached operation, and the database keeps the cache up-to-date without developer intervention. In fact, manual caching is almost never a requirement for simple rules in CouchDB — and even then, the database provides a “last changes” real-time feed that the developer can use to make cache management easier.

It is interesting to note that several common patterns in SQL event-based applications are in fact poor implementations of a caching strategy for a state-based rule. An upvote/downvote system such as the one Reddit uses involves storing both the number of votes in the item table, and the individual votes in a user-comment association table: the former is used to quickly determine the current score of an item, while the latter is used to prevent people from voting several times. The state-based query implemented here is:

SELECT SUM(score) FROM votes WHERE item_id = ?

However, the naive event-based solution is to intercept “upvote” and “downvote” events and perform this query instead:

UPDATE item SET score = score + 1 WHERE item_id = ?

This is done in the hope that the sequence of +1s and -1s will remain equivalent to the original state-based query, which is only the case if upvotes and downvotes are the only events that affect the votes table. If, say, banning a user account retroactively deletes all the associated votes, it would take another ad hoc query to keep the cache correct. Maybe something like this:

UPDATE item JOIN votes ON votes.item_id = item.item_id
SET item.score = item.score - votes.score
WHERE votes.user_id = ?

This is because of a fundamental difference between event-based and state-based designs : if your value actually depends on the state, then it takes one state-based piece of code to compute it, but it takes one event-based piece of code for each possible event that could ever affect it.

And even then, you still have to write the state-based update code because you will need to run it to rebuild the cache whenever something goes wrong.
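The same contrast can be shown outside of SQL. In this sketch, which uses invented names (vote, score, ban) and an in-memory list instead of a table, the score is a single state-based function and the ban event only has to change the state.

```ocaml
type vote = { user_id : string ; item_id : string ; score : int }  (* +1 or -1 *)

(* State-based rule: the score is whatever the current votes say it is. *)
let score votes item =
  List.fold_left
    (fun acc v -> if v.item_id = item then acc + v.score else acc)
    0 votes

(* An event only changes the state; no per-event score bookkeeping. *)
let ban votes user = List.filter (fun v -> v.user_id <> user) votes

let votes =
  [ { user_id = "alice"   ; item_id = "post" ; score = 1 } ;
    { user_id = "bob"     ; item_id = "post" ; score = 1 } ;
    { user_id = "mallory" ; item_id = "post" ; score = -1 } ]
```

Here score votes "post" is 1, and score (ban votes "mallory") "post" is 2, without any code written specifically for the combination of bans and scores.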

Typical State-Based Architecture

There are three kinds of rules in any application :

  • State-based rules : when this value is X, that value is F(X). Most indirect consequences of user input are here.
  • Event-based input rules : when this event happens in the real world, do X. This could be caused by user input, or when communicating with a third party API.
  • Event-based output rules : when this happens in the application, perform X in the real world. The classic example is sending an e-mail, but this covers pushing any kind of data to anyone outside your application.

State-based rules can be handled natively.

Input rules are usually handled by performing an atomic, non-conflicting write to the database whenever the event happens: it should be done in such a way that no conflict can happen after the event has passed. One solution is to simply create a new document with a unique identifier every time an event happens: unique identifiers prevent conflicts, and you can then rely on state-based rules to aggregate a sequence of events into a more coherent current state. In my current project, every notification received from PayPal is appended to a database, and a state-based rule aggregates those notifications into a pending-failed-successful state for every transaction. As an added bonus, this solution also provides a history (the list of related events) and the possibility to cancel events by deleting the corresponding document, in the same way that one can revert a Wikipedia article to a previous version by removing the corresponding diffs.
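As an illustration of aggregating appended events into a current state, here is a small sketch. The notification constructors and the precedence between them (a completed payment wins over a failed one) are assumptions made for the example and do not reflect PayPal's actual notification types.

```ocaml
(* Hypothetical notification kinds, one stored document per notification. *)
type notification = Created | Payment_failed | Payment_completed

type state = Pending | Failed | Successful

(* State-based rule: the result depends only on which notifications are
   present, not on the order in which they reached the database. *)
let state_of_notifications notifications =
  if List.mem Payment_completed notifications then Successful
  else if List.mem Payment_failed notifications then Failed
  else Pending
```

Because the rule ignores ordering, two replicas that received the same notifications in a different order agree on the state, and deleting a notification document simply changes the result of the next evaluation.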

Another solution for handling input rules is useful when the user sets a value: what matters to the user is the resulting value, not the operation that produced it. If setting this value can be done by an atomic, non-conflicting update, then do so. Keep in mind that if you use CouchDB master-master replication, then updates are not non-conflicting!

Output rules are trickier. If you are lucky, your output rule is in fact tied to an input event, such as «When you click this button, I will ask PayPal for your money», and it can then be handled as a normal input rule that just happens to query a third party API for more input data.

Application-initiated output events involve creating an entry that represents the outgoing event before it happens, with a timestamp of the moment the event should happen, appropriately set some time into the future. That entry is then managed by standard state-based rules, which can alter it or disable it as the corresponding source data becomes consistent. The delay should be long enough to ensure that the database does become consistent, and a delay of a few minutes is not a problem because the action was not initiated by the user. Once the delay expires, the application reads back the entry and performs the output action if it is still appropriate.

Back to Bob’s articles: let’s say the specifications require that I send Bob a congratulatory e-mail whenever an article gets a badge. Because he cheated, the state-based rule determines mistakenly that Bob’s articles X and Y both received a bronze badge, so it creates two entries in the «congratulatory e-mail» section, both set one minute into the future.

The trick here is that the identifiers of those entries are something along the lines of “Bronze-Badge-Y” so that applying the state-based rule several times merely updates the same entry instead of creating a new one every time. After a few seconds, the eventual consistency catches up with Bob and article Y loses its bronze badge status. The rule-based system detects that the “Bronze-Badge-Y” entry needs to be updated and marks it as «do not send».
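The delayed entry with a deterministic identifier could be sketched as follows in Python. This is only an illustration under assumptions: the `Outbox` class, its method names, the in-memory dictionary and the `send` callback are all hypothetical, and a real system would store the entries in the database and run the delivery step from a scheduled job.

```python
from dataclasses import dataclass


@dataclass
class OutgoingEmail:
    entry_id: str        # deterministic, e.g. "Bronze-Badge-Y"
    recipient: str
    send_at: float       # earliest moment at which the e-mail may be sent
    enabled: bool = True # False means «do not send»
    sent: bool = False


class Outbox:
    """Hypothetical in-memory store of pending output events."""

    def __init__(self, delay_seconds=60.0):
        self.delay_seconds = delay_seconds
        self.entries = {}

    def apply_badge_rule(self, article, recipient, has_badge, now):
        """State-based rule; safe to run any number of times per article."""
        entry_id = f"Bronze-Badge-{article}"
        entry = self.entries.get(entry_id)
        if entry is None:
            if has_badge:
                self.entries[entry_id] = OutgoingEmail(
                    entry_id, recipient, now + self.delay_seconds)
        elif not entry.sent:
            # Same identifier: update the existing entry, never duplicate it.
            entry.enabled = has_badge

    def deliver_due(self, now, send):
        """Send every enabled entry whose delay has expired, exactly once."""
        delivered = []
        for entry in self.entries.values():
            if entry.enabled and not entry.sent and entry.send_at <= now:
                send(entry.recipient, entry.entry_id)
                entry.sent = True
                delivered.append(entry.entry_id)
        return delivered
```

In Bob's scenario, the rule first creates entries for X and Y, a later run with `has_badge=False` disables the Y entry, and when the minute has elapsed `deliver_due` only sends the e-mail for X.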

User Uncertainty

Earlier, I skimmed over the fact that users don’t care about eventual consistency. There’s one exception to this rule — when you’re asking users to make a decision based on data you are showing them, you cannot afford to go wrong.

If you ask your user whether they wish to pay $100, and you bill them $101 instead because the price changed in the database while the user was reading the confirmation form, then you have a problem.

This problem, however, is not specific to the NoSQL eventual consistency world. In fact, the average SQL application has the same problem: it’s impossible to start a transaction, show the user a confirmation form, and only end the transaction when the user confirms. Transactions do not work that way. Instead, both SQL and NoSQL solutions must resort to a conflict detection strategy: when the user confirms, check whether the user’s decision is still compatible with the application state and if it isn’t, show them an error message — «Sorry, the price just went up to $101, do you still want to go on?»
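A conflict detection check at confirmation time might look like the following Python sketch. The names (`Confirmation`, `PriceConflict`, `confirm_purchase`) and the plain dictionary standing in for the price table are assumptions made for the example; prices are kept in cents to avoid floating-point comparisons.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confirmation:
    """What the user agreed to: the item and the price they were shown."""
    item_id: str
    price_shown_cents: int


class PriceConflict(Exception):
    """Raised when the confirmed price no longer matches the current one."""

    def __init__(self, shown_cents, current_cents):
        super().__init__(
            f"price changed from {shown_cents} to {current_cents} cents")
        self.shown_cents = shown_cents
        self.current_cents = current_cents


def confirm_purchase(current_prices, confirmation):
    """Return the amount to bill, or raise if the user's decision is stale."""
    current = current_prices[confirmation.item_id]
    if current != confirmation.price_shown_cents:
        # The caller should show the new price and ask the user again.
        raise PriceConflict(confirmation.price_shown_cents, current)
    return current
```

The form carries the price the user saw, so the check compares the user's decision against the current state rather than trusting either one blindly. In a SQL application, the lookup and the billing write would typically sit inside the same short transaction.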

It is possible to detect conflicts using state-based rules in an eventually consistent database: entry A, created when the user confirmed the payment, states that $100 should be billed, but entry B, created a few seconds before entry A, states that the price is now $101. The problem is that it might take a short while for entries A and B to be processed together, but we need to show a confirmation page straight away…

You have two possibilities here. The first is the most obvious one: have the user wait until the eventual consistency kicks in and you can genuinely confirm their purchase; you may optimise your NoSQL usage to make that delay shorter, such as by avoiding master-master replication on that particular database.

The second possibility, for which I have a personal preference, is to provide an answer straight away, but reserve the right to deny that decision later. This means that in 99% of the cases, there is no conflict and the user does not have to wait. In 99% of the remaining cases, the user waited long enough on the confirmation page that the conflict is detected straight away. It really takes a stroke of bad luck for the user’s decision to happen precisely as the situation changes, so having to cancel in those specific cases is acceptable, especially since your state-based architecture can handle the cancellation quite well.

This is no different than having to cancel an e-commerce order because the ordered item was lost at the warehouse — the computer said yes, but reality said no.

TL;DR

  1. An UPDATE is permanently inconsistent if it was based on temporarily inconsistent data.
  2. The result of a CREATE is never permanently inconsistent.
    So don’t UPDATE objects: CREATE object modifications.
  3. To get the latest version of an object, apply a map-reduce algorithm to the modifications.
  4. You should cache data, but the cache must be recalculated whenever the underlying data changes.
  5. Some UPDATEs are in fact hidden cache refreshes. Use a normal cache instead.
  6. When affecting the outside world, wait for the eventual consistency to kick in before you act.
  7. Conflicts can affect users, but only rarely. Plan your UI accordingly.

Article Image © Chris Dlugosz — Flickr


