Monthly Archive for February, 2009

Beginning OCaml, Part 1

Perhaps you’ve never programmed. Or perhaps you know of other languages, and wish to learn others—this post will discuss only pure functional programming, so I suggest you forget everything you know and start anew. Either way, this series of articles provides a quick introduction to the functional side of Objective Caml.

Expressions

The vast majority of code you will write in Objective Caml will be expressions. An expression is usually a mathematical formula or something close to one. The simplest expression is a constant, such as:

1

This is not a very useful expression. A slightly more useful example would be using Objective Caml as a pocket calculator, with expressions like:

(2 + 4) * 3

Objective Caml runs expressions by computing their value: this is known as evaluation. The above expression would result in the integer value 18. As we discover more elements of the language, the rules for evaluation will become more complex, but the fundamental principle remains almost intact: an Objective Caml program is an expression or series of expressions, and running the program means evaluating those expressions.

Variables

The first step in raising Objective Caml above the lowly pocket calculator level is the ability to give names to objects.  The language construct for doing so is borrowed from mathematics, where mathematicians would say “let x be the smallest integer such that …” and refer to that integer as “x” from then on. The syntax for doing so is:

let (variable) = (expression) ;;

Use the semicolons for now—we will see later on that they are optional in some circumstances. Example of such a definition:

let x = (2 + 4) * 3 ;;
x - 2

This will evaluate the expression (which yields the value 18) then bind that value to the name x. Every expression below the definition of x will know that x equals 18, so the second expression would evaluate to 16. Here, x is called a variable, a confusing name since it does not actually vary: once defined, it keeps its value forever. Any name can be used for a variable, as long as it is not a reserved keyword, starts with a lowercase letter or an underscore, and contains only letters, digits, underscores and apostrophes.

It is of course possible to define more than one variable in a program:

let x = 3 + 4 ;;
let y = 2 * 5 ;;
x + y

This example evaluates to 17. A normal Objective Caml program defines hundreds and even thousands of variables in order to work.

Having a thousand variables creates a risk for collision. The good news is that collisions are handled smoothly: at any point in the program only the last definition found so far counts. So, if we were to consider an example:

let x = 3 ;;
let x = x + 1 ;;
x + x

The first line defines x as 3, the second defines x again, this time as 3 + 1 = 4, and the third evaluates to 4 + 4 = 8.

This doesn’t solve the problem entirely, though: what if I accidentally overwrite a previously defined value, and I need that value later on?

Local Definitions

Objective Caml solves this by providing local definitions: instead of making a variable available to all lines that appear below it, the variable is only available within an expression. The syntax is:

let (variable) = (expression) in (expression)

The variable exists only within the second expression, and uses the value of the first expression. For instance:

let x = 1 ;;
let y =
  let x = 2 in
  x + x ;;
x + y

This example defines x as 1. Then, within the definition of y, it defines x again as 2, which makes the value of y equal to 2 + 2 = 4. However, the second definition of x is only available within the “x + x” expression, so the “x + y” expression uses the original value, and the result is “1 + 4 = 5”.

Local definitions are expressions. This means that they can be used as part of other expressions, such as on either side of a mathematical operator, or within other definitions. It’s perfectly legal to write code like:

1 + (let x = 1 + 2 in x * 2)

This evaluates to 1 + (3 * 2) = 7.

In practice, most variables are defined locally, and only the most important variables are defined globally.

Functions

While useful, the above features still don’t get very far beyond basic calculation needs. The one feature that turns Objective Caml into a highly expressive tool is functions. A function follows the mathematical tradition of being a mapping: it associates every element of a set with an element from another set. The element from the first set is the argument, and the element from the other set is the return value.

To call a function, you provide it with an argument and it is automatically turned into its return value for that argument. For example, the function “string_of_int” returns a bit of text representing an integer, so you could use it like so:

string_of_int 10

This would evaluate to the text “10”.

How do we define functions? Well, we simply write an expression which uses the argument and evaluates to the return value. Of course, the function doesn’t know the value of its argument until it’s called. So, when we write the function’s code, we use a placeholder name to represent the argument: this variable is called the parameter, and it is replaced with the argument itself when the function is called. The syntax is:

fun (parameter) -> (expression)

For instance, a function that adds two to a number can be defined as:

fun x -> x + 2

It is not very useful as such, so let’s give it a name and call it:

let add = fun x -> x + 2 ;;
add 3 * add 4

This evaluates to 5 * 6 = 30, and illustrates the main point of functions: they allow you to describe an operation once, and use it in many places simply by using the function’s name.

The Objective Caml language provides two shortcuts for defining functions. The first was designed to write functions that return functions. Suppose that I write:

fun x -> (fun y -> x + y)

This can be elegantly shortened to:

fun x y -> x + y

The second shortcut was designed to make the definition of named functions easier. Suppose that I write:

let add = fun x -> x + 2

This can be elegantly shortened to:

let add x = x + 2

The two shortcuts can be combined, so that:

let add x y = x + y

means the same as:

let add = fun x -> (fun y -> x+y)
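Since add is really a function that returns a function, it can be given only its first argument, and the result is a new one-argument function. A small sketch of this (the names add_two, result and same are made up for the illustration):

```ocaml
(* add takes x and returns a function that is still waiting for y *)
let add x y = x + y ;;

(* Supplying only the first argument yields a function of one argument *)
let add_two = add 2 ;;

(* Both definitions below compute 2 + 3 *)
let result = add_two 3 ;;
let same = add 2 3 ;;
```

Both result and same are bound to the same integer, because add 2 3 is read as (add 2) 3.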

You can see all the above (along with the general structure of the tutorial) on this page.

Configuration Files

Configuration files are omnipresent in modern computer architecture. They usually appear as editable text files that are loaded (more or less dynamically) by a process while it’s running. There are many design choices that separate good configuration files from bad configuration files.

Location

A very important part of configuration is where the configuration files can be found. Good configuration setups store their files in locations as conventional as possible:

  • The root directory of the program, with an explicit name (usually containing “config”). This is usually a Linux convention, but a lot of PHP projects tend to follow it.
  • An “etc” directory within the root directory of the program. Alternatively, you may store the system-wide configuration elements in the absolute “/etc/program” path, as long as an alternate target is provided.
  • If the program is aimed at Linux users, per-user configuration is usually expected within the home directory, prefixed by a dot (either a directory, such as “.emacs.d”, or as a single file, such as “.muttrc”).
  • On Windows, the registry is advisable for user programs (store global configuration as global keys if allowed to, and store per-user configuration as per-user keys).
  • In Java, “foobar.properties” is usually expected to be in the same path as the class “foobar” for which it was defined.

Hiding configuration files in other locations is possible, but I would advise against it—it forces administrators, developers and installers to hunt for the files (either on their file system, or through documentation).
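As an illustration of sticking to conventional locations, here is a rough sketch of a lookup that tries a few well-known paths in order and picks the first one that exists. The program name “myprog”, the file names, and the search order are all assumptions made up for the example, not a standard.

```ocaml
(* Return the first candidate path for which [exists] holds, if any.
   [exists] defaults to Sys.file_exists; it is a parameter only so the
   search can be tried out without touching the file system. *)
let find_config ?(exists = Sys.file_exists) candidates =
  List.find_opt exists candidates

(* Conventional locations for a hypothetical program called "myprog",
   most specific first. *)
let candidates ~home =
  [ Filename.concat home ".myprogrc";     (* per-user dot file *)
    Filename.concat "etc" "myprog.conf";  (* "etc" inside the program root *)
    "/etc/myprog/myprog.conf" ]           (* system-wide configuration *)
```

A caller would typically pass the user's home directory to candidates and hand the resulting list to find_config, then report the full list of paths it tried if nothing was found.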

It is also possible to store configuration information in the database—this raises the question of where configuration stops and runtime data starts, especially in modern systems like WordPress or Magento that allow heavy-duty reconfiguration of the system through a back-office HTTP interface. This can also be quite annoying at times: for instance, if your development model calls for many developers writing and testing code on their own machines with a shared database for all developers, a system like Magento won’t do because it stores local information (such as the access URL) in the database.

Timing

Another important element is the time when the configuration file is taken into account by the system. There are, mostly, three ways of dealing with configuration files:

  • Load at initialization time. When the system or program boots, it reads and parses the configuration file. The advantage of this system is that if the configuration file is broken, the system won’t start, so it’s quite probable that the system administrator will be on hand to correct things as required. Besides, it’s also quite easy to implement. The downside is that you have to restart the system to take changes into account.
  • Load on demand. The file is loaded at initialization time, but can be reloaded on demand (the classic way on a Debian box is to “/etc/init.d/program reload”, for instance). Should an error appear in the file, the reloading fails with an error but the old configuration is kept so that the program keeps running while an administrator corrects the error. This way is harder to get right, especially since it requires a communication channel to signal the change in configuration to the program.
  • Load on every request. This is the case for configuration files that affect the behavior of a frequently occurring action (such as receiving an HTTP request for an Apache server). Whenever the action is performed, the configuration file is reloaded. The advantage of this solution is that there is no manual reloading to be done. The downside is that the configuration will not be tested until the action is performed, which might happen a while after the administrator left (of course, in a perfect world people would test any modifications they make on a live system before leaving).
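The key property of load on demand (a broken file must not replace a working configuration) can be sketched in a few lines. This is only an illustration under assumptions: the configuration is modelled as a list of key/value pairs, and the validation rule (a numeric “port” entry must be present) is invented for the example.

```ocaml
(* A configuration is just a list of (key, value) pairs in this sketch. *)
type config = (string * string) list

(* The configuration currently in use by the running program. *)
let current : config ref = ref []

(* Hypothetical validation rule: a "port" entry must exist and be a number. *)
let validate (cfg : config) : (config, string) result =
  match List.assoc_opt "port" cfg with
  | None -> Error "missing key: port"
  | Some p ->
    (match int_of_string_opt p with
     | Some _ -> Ok cfg
     | None -> Error ("port is not a number: " ^ p))

(* Swap in the candidate only if it validates; otherwise keep the old
   configuration and report the error to the caller. *)
let reload candidate =
  match validate candidate with
  | Ok cfg -> current := cfg; Ok ()
  | Error msg -> Error msg
```

The communication channel that triggers reload (a signal, a socket, an init script) is deliberately left out, since that is the part that depends most on the platform.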

Let’s not forget the issue of synchronization: does a configuration file reflect the current configuration of a running program? In the first two cases, it doesn’t: an administrator could forget to restart or reload the system, leading to a system that uses the old configuration and a configuration file that contains untested modifications, and rebooting such a system is certain death. Some administrators go as far as rebooting a system once per night—the ability to come back online quickly and correctly is taken as a sign of good health of the system.

My personal advice on this is to use load on demand. Compared with load on every request, this improves performance (there is no need to reload the configuration on every run) as well as safety (you immediately know if something went wrong). Compensate for the synchronization issue by periodically checking whether the currently enabled configuration is older than the configuration file itself, and issue a warning if it is. Keep a backup of the current configuration somewhere in case you need to reboot.
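The periodic check boils down to comparing two timestamps. A minimal sketch, assuming the program remembers when it last loaded its configuration successfully (the function names and the warning text are made up for the example):

```ocaml
(* Both timestamps are seconds since the epoch. In a real program the
   file's modification time could come from (Unix.stat path).Unix.st_mtime,
   which needs the unix library; it is passed in here to keep the sketch
   free of dependencies. *)
let is_stale ~loaded_at ~file_mtime = file_mtime > loaded_at

(* Return a warning message when the file is newer than the running
   configuration, and None otherwise. *)
let stale_warning ~path ~loaded_at ~file_mtime =
  if is_stale ~loaded_at ~file_mtime then
    Some (path ^ " was modified after the last successful reload")
  else None
```

How often to run the check, and where to send the warning, is left to the program.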

Another solution for the synchronization issue is to combine editing and reloading. This is what crontab does: it edits the file containing the jobs, checks for validity, and then signals the crond daemon to trigger a reload. The approach is also found in online administration tools (that only commit a modification once it’s been validated, and do not save uncommitted modifications at all). Requiring changes to go through an editor, however, reduces interoperability as it prevents non-human users (such as IDEs, installers, administration consoles or other third party programs) from modifying the configuration.
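The validate-then-commit idea can be sketched with nothing but the standard library. This is not how crontab itself is implemented; the commit function, the “.tmp” suffix and the shape of the validate callback are assumptions chosen for the illustration.

```ocaml
(* Replace the file at [path] with [contents], but only if [validate]
   accepts the new text. Rejected text is never written anywhere, and
   accepted text goes to a temporary file first so that the live file
   is replaced in a single rename. *)
let commit ~validate ~path contents =
  match validate contents with
  | Error msg -> Error msg
  | Ok () ->
    let tmp = path ^ ".tmp" in
    let oc = open_out tmp in
    output_string oc contents;
    close_out oc;
    Sys.rename tmp path;
    (* A real tool would notify the running program at this point. *)
    Ok ()
```

Exposing a function like this, rather than an interactive editor only, is one way to keep the door open for installers and other non-human users.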

Syntax

There are usually three main ways to design a configuration file itself.

  • Creating your own language. This one is extremely frequent in the UNIX world. It has the benefit of allowing maximum expressiveness, as the developer can tailor a configuration language to stick as closely to the problem domain as possible. The downside is that it requires administrators to learn yet another scripting language. As a Linux sysadmin, I routinely have to deal with a lot of languages, such as:
    • the general-purpose awk and sed
    • makefiles
    • crontabs
    • shell configuration files (.zshrc, .bashrc)
    • php.ini
    • apache host definitions, .htaccess and httpd.conf
    • the sudoers file
    • .qmail
    • .emacs.d
    • .muttrc
    • .flrnrc
    • /etc/fstab
    • /etc/passwd (arguably, I can read it faster than I query it with the appropriate programs)

    Besides, don’t really expect your IDE or text editor to provide verification and syntax highlighting for your more obscure scripts.

  • Using XML, with an appropriate schema or DTD. While fairly verbose and often ill suited to configuration tasks, XML is a standard. This means that it’s easy to find validation tools (including, of course, editing it with your IDE and knowing ahead of time if errors are present), parsing tools (so that you don’t have to write a loading module yourself) and transform tools in the form of XSL. Of course, not everything can be validated, and not every configuration file has a corresponding DTD—sometimes, developers use XML for the “easy to parse” benefit and don’t really care about who will be writing the configuration.
  • Using a programming language. This works fine with dynamic languages that can interpret themselves on the fly, and is most often encountered in PHP (the Mantis bug tracker uses this approach, and the configuration file, written in PHP, is merely a set of assignments to global variables). This is extremely efficient: no parser is required at all, there’s already full support in the IDE, the format is much more flexible since the file can call whatever APIs are exposed to it, and you don’t have to learn another language either. The downside is that there’s a lot of freedom allowed, meaning that a meddling user could break the system if they were allowed to provide a configuration file. So, no userland configuration can use this approach on pain of death without heavy sandboxing.

This choice is yours, as I have no advice to give here: XML can be nice, but overly verbose, custom formats always need you to learn them, and using the programming language itself can be dangerous because it’s Turing-complete.
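To give an idea of what the custom-format option costs the developer, here is a rough sketch of a parser for about the simplest format imaginable: “key = value” lines, with blank lines and “#” comments ignored. The format and the error messages are assumptions invented for the example; real configuration languages are usually far richer.

```ocaml
(* Parse one line: Ok None for blank lines and comments,
   Ok (Some (key, value)) for an entry, Error for anything else. *)
let parse_line line =
  let line = String.trim line in
  if line = "" || line.[0] = '#' then Ok None
  else
    match String.index_opt line '=' with
    | None -> Error ("expected key = value: " ^ line)
    | Some i ->
      let key = String.trim (String.sub line 0 i) in
      let value =
        String.trim (String.sub line (i + 1) (String.length line - i - 1))
      in
      if key = "" then Error ("empty key: " ^ line) else Ok (Some (key, value))

(* Parse the whole text of a file, stopping at the first bad line. *)
let parse text =
  let rec go acc = function
    | [] -> Ok (List.rev acc)
    | l :: rest ->
      (match parse_line l with
       | Error e -> Error e
       | Ok None -> go acc rest
       | Ok (Some kv) -> go (kv :: acc) rest)
  in
  go [] (String.split_on_char '\n' text)
```

Even this toy version has to decide how to report errors and what to do with whitespace, which is exactly the work that XML tooling or the host language would otherwise do for you.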

Can one imagine a replacement for XML? A language with Turing-complete verification rules, a nicer syntax, and even easier parsing? Or even an improvement on XML schemas that could perform more in-depth checks? I think this deserves some more thought.


