Sparse grids
Consider a bidimensional grid filled with values. The most optimal way of storing such a grid in memory is to use an array: a large span of contiguous memory big enough to contain all the values one after another. Accessing the value at coordinates (x,y) is done by accessing the index (x * height + y) of that array. This is how C and C++ handle multi-dimensional arrays (either the native static ones or those provided boost::multi_array).
Consider now a mostly empty bidimensional grid: one where less than 10% of the values exist (the others being filled with NULL, None or whatever does-not-exist option value you choose to use). By storing such a grid as a contiguous array, more than 90% of the allocated space would be wasted for nothing. This situation calls for a different data structure: a sparse grid. Sparse data structures are implemented in such a way that they use less memory when few values exist in a grid, at the cost of a small computing overhead and a large memory overhead when the grid is nearly full.
This article will provide a simple sparse grid implementation based on the concept behind kd-trees.
And now, a word from our sponsor
The chosen implementation is immutable: this means that once a grid is constructed, it cannot be modified. What is possible, however, is to create a new grid from the old one, with slight differences (such as one more element or one less element). Immutable data structures are to functional programming what mutable data structures are to imperative programming. Where an imperative-style program would write:
data[i] = new_data_value;
A functional-style program would write:
let new_data = set old_data i new_data_value
Therefore, a functional-style program gets the new data structure while also keeping the old version. This has very clean applications to exception safety and transactional operations, since rolling back is as simple as throwing away the new value. Plus, it gets all the advantages of referential transparency.
But doesn’t all this copying around create problems both in terms of memory usage and in terms of runtime performance? Not really: as will be obvious in this article, the overhead related to functional manipulations on tree and list structures is minimal. In terms of Objective Caml code, the interface of our grid will be the following:
type 'a grid val empty : int -> int -> 'a grid val size : 'a grid -> int val get : 'a grid -> int -> int -> 'a val set : 'a grid -> int -> int -> 'a -> 'a grid val unset : 'a grid -> int -> int -> 'a grid
The empty function creates an empty grid of predetermined dimensions. The size function returns the number of elements in a grid, and get returns the element found at a given position in a grid (or raises Not_found if not set). The set function returns a copy of the grid containing a certain element at a certain position, while the unset function returns a copy of the grid with no element at that position.
The algorithm
Before moving on, I strongly suggest that you get yourself familiar with the notion of kd-trees (for instance, from this wikipedia article on the subject). The key idea is that every node of a kd-tree splits the universe in two halves, with each half being represented by one of the two sub-trees.
Given an empty grid, we choose the virtual tree structure as follows: one splits the grid in two halves of equal size across its longest dimension, then recursively splits the two halves, until the obtained grid contains exactly one tile. However, this is a virtual structure: representing it in memory would be wasteful, since most of the leaves will be empty anyway. Therefore, the program will only display the parts of the tree that are important: the parts of the tree that contain no elements in their leaves are simply represented by a ‘hole’ value, instead of containing the full set of nodes and the empty leaves.
Preliminary definitions
To achieve this, it’s first necessary to define a few preliminary types and functions. This starts with the tree type, which describes the various slicing steps to reach a certain grid element.
type 'a step = | Split of 'a step * 'a step | Element of 'a | Hole
This tree contains no position information: indeed, that information is already available as part of the virtual tree (which was not constructed, but which can be computed on the fly from the width and height of the grid). Therefore, traversing this tree looking for a position will involve passing the width and height around. To factor out all the traversal code can be done with a function :
let seek ~ifl ~ifr ~ifh ~ife (w,h) (x,y) step = let aux (w,h) (x,y) = function | Split (l,r) -> let t = w/2 in if x < t then ifl (t,h) (x,y) l r else ifr (w-t,h) (x-t,y) r l | Element e -> ife e | Hole -> Lazy.force ifh in if w >= h then aux (w,h) (x,y) step else aux (h,w) (y,x) step
This function looks at a node in the tree and returns a certain result based on what type of node it is. In the case of an existing node with a left son and a right son, it compares the sought after position with the coordinates of the separator between the children, and then calls the expected function (ifl for a left child, ifr for a right child) with the dimensions of the child grid and the coordinates within that grid. In the case of a leaf node, it calls the expected function (ife for an element) on the element itself. In the case of a grid node that’s missing (because there are no populated children), it returns the expected value (ifh for a hole). The latter is a lazily evaluated value, because one might wish to raise an exception (or otherwise avoid to evaluate it) only if it’s actually used.
The clever part of this function is the x-y swap transform. Indeed, the tree remains the same if you swap the x and y coordinates (as well as the corresponding dimensions of the grid). Since splitting always occurs along the longest edge of the grid, swapping the values to ensure that the width is always larger than the height allows the factoring of the splitting code.
Using this helper function, it’s possible to write three specific helper functions for getting, setting and unsetting:
let rec get_aux dim pos step = seek ~ifl:(fun dim pos step _ -> get_aux dim pos step) ~ifr:(fun dim pos step _ -> get_aux dim pos step) ~ifh:(lazy (raise Not_found)) ~ife:(fun x -> x) dim pos step let rec set_aux dim pos step x = seek ~ifl:(fun dim pos l r -> Split(set_aux dim pos l x, r)) ~ifr:(fun dim pos r l -> Split(l, set_aux dim pos r x)) ~ifh:(lazy ( if dim = (1,1) then Element x else set_aux dim pos (Split(Hole,Hole)) x)) ~ife:(fun _ -> Element x) dim pos step let rec unset_aux dim pos step = let clean = function Split (Hole,Hole) -> Hole | other -> other in seek ~ifl:(fun dim pos l r -> clean (Split (unset_aux dim pos l, r))) ~ifr:(fun dim pos r l -> clean (Split (l, unset_aux dim pos r))) ~ifh:(lazy (Hole)) ~ife:(fun _ -> Hole) dim pos step
The get algorithm is fairly straightforward: always go into the side which contains the sought coordinate, raising an exception if it’s a hole and returning the value if an element is found. If inlined, this function also happens to be tail-recursive.
The set algorithm digs into the tree looking for the coordinate it’s trying to set. If it encounters a hole, either the hole is small enough (and it is then filled with an element) or it’s too large and the algorithm creates a new split node at that position and digs further.
The unset algorithm searches for the correct element and, once found, replaces it with a hole. It then moves back up the tree, replacing split nodes with two hole children with a single hole node, thereby removing unused areas of the grid from the tree.
Glue code
Once the helper functions are implemented, only a little glue is required to implement the module signature:
type 'a grid = { d : int * int ; s : 'a step } let test g x y = let w,h = g.d in if x < 0 || x >= w || y < 0 || y >= h then invalid_arg "Out of bounds" let empty x y = { d = x, y ; s = Hole } let size g = size_aux 0 g.s let get g x y = test g x y ; get_aux g.d (x,y) g.s let set g x y v = test g x y ; { g with s = set_aux g.d (x,y) g.s v } let unset g x y = test g x y ; { g with s = unset_aux g.d (x,y) g.s }
A test function ensures that the requested coordinates are valid, and the functions merely forward their arguments to the appropriate helper function.
The end result
Let us look at overhead versus a plain array. First, assuming that the grid is not sparse (all values are provided), then the resulting tree would contain the entire set of values, plus that many leaves, plus that many internal nodes, which weight much heavier
Of course, if the system is sparse, then the reverse is true. With zero elements, there is almost no memory usage. With a single element, the amount of memory used is logarithmic in the area of the grid, and this increases sub-linearly with the number of stored elements. This means that, if storing 100 elements in a 100×100 grid, the used memory would be below 1300 nodes worth of memory (and increasingly lower as the elements were clustered together), as opposed to the 9900 useless grid locations in a non-sparse implementation.
What about modifications? Again, adding or removing an element in a 100×100 grid results in at most 13 new nodes being created (since all the other nodes are not modified, and can thus be reused as-is from the previous grid).
How does this compare to the core Map module of Objective Caml ? It depends. On the one hand, the depth required is a function of the grid area, instead of the number of elements, making the sparse grid marginally more memory-hungry for very small amounts of values (but those would be better stored as an associative list anyway). On the other hand, the sparse grid is always balanced (never requiring the need for balancing like Map does) and does not need to store the keys, since those are made implicit by the slicing. Finally, the sparse grid tends to be adapted to clustered situations which reduce the effective depth of branches, whereas a Map is adapted when the grid elements used are uniformly spread out.
Hi. I'm Victor Nicollet,
Recent Comments