Perhaps one of the single most useful aspects of functional programming is the absence of side-effects. In a pure functional language, there is no assignment — once created, you cannot change the members of an object or the value of a variable. All you can do is create new values based on old values. Advocates of these so called immutable values argue that there are many benefits related to what you can do with them: you can keep around both the new version and the old version (implementing an “undo” feature of sorts), often at a reduced memory cost due to sharing of sub-elements between the two versions.
While I agree with these, I believe there’s something far more interesting about immutable values — what you cannot do with them, and the consequences this has on feeling secure about your code.
In my current project, I use a coding style that absolutely forbids any form of side-effects with the exception of 1° a clearly delimited set of create/update functions that affect the database, 2° any side-effects that affect local variables and 3° during initialization. These are of course exceptions — while allowed, they are avoided in the vast majority of the code.
Reordering code
One simple consequence: the order in which calls are made is only constrained by things that the compiler can detect — the function that uses x is called after the function that returns x — instead of subtle «open must be called before send» errors, simply because there’s no way for «open» to affect «send» unless the former returns a value used by the latter. So, I can rearrange lines freely because the compiler will warn me when I make a mistake.
Protecting against invalid states
A corrolary of the above is that it becomes possible to design your types so that only valid states can be represented. For instance, consider a list that must contain at least one element — in a mutable context, there is no way to represent this within the type system because I could always write this:
for (i = 0 ; i < 10 ; i++)
list.pop();
There is no way here for the pop function to indicate that it might cause a problem, because it returns nothing — the only way out is an exception, and this happens outside the type system in most languages. On the other hand, if we keep the original list unmodified and return the new list, this opens the opportunity for indicating whether additional pops are available:
type 'a one_or_more =
| One of 'a
| More of 'a two_or_more
and 'a two_or_more = 'a * 'a one_or_more
let pop : 'a two_or_more -> 'a one_or_more = function (_,rest) -> rest
By design, the pop function will never attempt to pop from a one-element list because it can only be called on a two_or_more value, and it returns a value which may or may not be a two_or_more — the code needs to check this, and the compiler will detect if you forget it:
let pop_n n list =
if n = 0 then list else pop_n (n - 1) (pop list)
Here, the type of the list variable is inferred to be two_or_more (because it’s passed as an argument to pop) which means that pop list (of type one_or_more) is not a valid argument to pop_n. An appropriate solution (popping up to n, if available) takes this into account:
let pop_n n = function
| (One _) as keep -> keep
| more -> if n = 0 then more else pop_n (n - 1) (pop more)
Let’s summarize what happened here: I used a different representation for one-element lists and more-than-one-element lists, which let me selectively define a pop operation only for more-than-one-element lists, and this allowed the compiler to detect when the «must contain at least one element» constraint might be violated. Have fun achieving this in your non-functional language of choice! Using a different representation in this way was made possible by the fact that pop returns a new list instead of altering its argument — so the new list can be of a different type than the old one.
This is a fundamental principle in compiled functional languages: if a given state is not valid, design your types so that it may not be represented. When this is done correctly, not only does it become harder to write code that tries to create an invalid state, but when that code is written, the compiler rejects it.
Memoization
If you are certain that a given function is pure but involves computations that you would rather not repeat too often (such as database reads), you can resort to memoization: wrap the function in a piece of code that memorizes the result of every call. In my current project, this is as simple as:
let get_pic_url = Util.memoize Picture.get_url
From then on, asking ten times for the URL of a given picture will result in a single database request. Had the function involved a side-effect, memoization would have changed the program behavior: knowledge that the function is pure lets me apply memoization without any risk of creating a bug.
Retry until successful
Yet another situation is when dealing with CouchDB transactions, which involve the following process:
- Fetch current version of the document from the database
- Modify the document
- Write the document back to the database
- If a collision happened, repeat from 1.
In short, the application must repeat steps 1-2-3 until a write succeeds. Correctness of these algorithms is difficult to test because development servers seldom have enough active users to actually cause conflicts (and when conflicts do happen, they are hard to trace). By deciding that steps 1-2-3 are handled by pure functional code, I have the guarantee that the loop can be executed any number of times and yield the expected result. Right now, a transaction looks like this:
let accept id = DB.transaction id
begin function
| None -> (* No object in database *) DB.Keep
| Some post ->
if post # accepted then
(* Post is already accepted *) DB.Keep
else
(* Accept post *)
DB.Put (object
method accepted = true
method author = post # author
method text = post # text
end)
end
By applying a “never use side-effects” convention, I know without even looking at the function contents that there are no side-effects inside.
These four examples are original in that they don’t try to showcase the power of immutable values and how they can be used. Instead, they simply illustrate how a program without mutable values has interesting properties that make refactoring easier, help the compiler detect many more errors, and allows the application of patterns that can only be safely applied in the absence of side-effects. Like the monks of old, I write my software without the mundane facilities of mutable values, but I derive significant benefits out of it.
Recent Comments