Tag Archive for 'Algorithms'

Interest(ing) rates

The most common way of investing money is putting it in a savings account. You lend a fixed amount of money to someone, and they pay interest over that money at a predetermined rate. Let’s say you lend 1,000 € at an interest rate of 3%, paid every year: at the end of the year, you would receive 30 € as payment for your lending. You would spend these on fine wine or nice clothes and wait until the next year to get another 30 €, and so on.

Savings accounts work on the basis of simple interest : what you get paid is a linear function of both time and money. Lend for half a year? 3% ÷ 2 = 1.5% Lend for two years? 3% ×2 = 6%

An important thing to bear in mind is that interest is paid at fixed intervals, for instance at the beginning of January. You don’t have to spend those 30 € : you can them on the savings account and earn simple interest on them after a year (3% of 30 € is 0.90 €).

Using this strategy, lending for two years is done at a 6.09% rate instead of 6%, because you get interest on interest. This is known as compound interest : what you get paid is an exponential function of time. Lend for two years ? (+3%)² = +6.09% Lend for three years ? (+3%)³ = +9,27%

The mathematical justification is that, with a 3% interest, your total amount of money is multiplied by 1.03 every year:

1,000 + 30 = 1,000 + 3% of 1,000 = 1,000 + 0.03 × 1,000 = 1.03 × 1,000

So, after two years, the amount is multiplied by 1.03 two times, and so on.

1,060.90 = 1.03 × 1,030 = 1.03 × 1.03 × 1,000

In short, percentages have a multiplicative effect.

And now, pop quiz : I’ve gained +5% weight over the winter holidays. What percentage of my weight do I have to lose to be back to normal ?

If you answered -5%, you missed the point. Multiplicative effect means the total change of weight would be +5% × -5% = 1.05 × 0.95 = 0.9975 = -0.25%. I would be losing too much weight !

The correct answer was 1 ÷ 1.05 = -4.76%.

Similarly, if the number of graduates of a given school increases by +10% on year one and +25% on year two, the total increase is +37.5% and not +35%.

Duality

This is where mathematicians (and computer scientists) use an interesting little concept called duality. Percentages are numbers that are easy to understand, but hard to combine. We can transform them into something that is a little bit harder to understand, but easier to combine.

The traditional way to transform multiplication into addition is to exponentiate, due to an interesting property of the exponential function:

exp(a) ×exp(b) = exp(a + b)

So, I wish to find a percentage operator (§) such that:

  • we conserve some values, 0§ = 0% and 100§ = 100%
  • applying A§, then B§, is equivalent to applying (A+B)§

Then this uniquely defines an operator which is called exponential percentage:

A§ = B%  ↔  A = 100 × log(1 + B ÷ 100) ÷ log(2)

Some common values:

0% = 0§ +100% = +100§ -100% = -∞§ 200% = 158.4§
+1% = +1.4§ +99% = +99.2§ -1% = -1.4§ -99% = -664§
+10% = +13.7§ +90% = +92.6§ -10% = -15.2§ -90% = -332§
+25% = +32.2§ +75% = +80.7§ +50% = +58.4§ -50% = -100§

percent

So, if I gained +5§ weight over the holidays, I can lose -5§ weight and be back to where I started, and if a number increases by 10§, then by 25§, it increases by 35§ overall.

And of course, a yearly interest rate of 4.2§ = 3% compounded over ten years is 42§ = 34%.

No Free Lunch

Normal percentage rules make compounding hard, but it’s reasonably easy to estimate a percentage based on a fraction. Exponential percentage rules make compounding easy, but evaluating a percentage based on real figures is harder.

In practice, compounding happens less often than evaluating, so humans use normal percentage rules. And computers are good at compounding through multiplication, so they don’t need exponentiation.

Duality does have some other uses, though. For instance, there’s the duality between two representations of complex numbers:

a + ib = r exp iθ

The cartesian (a,b) notation makes it easier to add numbers, but multiplication is harder:

a + ib + c + id = (a+c) + i(b+d)

The polar (r,θ) notation makes it easier to multiply numbers, but addition is harder:

r exp iθ × s exp iφ = (r × s) exp i(θ+φ)

For mathematically-oriented computer scientists, duality is a gold mine, because it lets one reduce a complex problem in one area to a simpler problem in another area (whether simpler means faster, as in the case of FFT, or easier to think about)..

The Law of DSLs

There’s one common duality that is fundamental in the computer world: the correspondence between data and code. In a fit of narcissism, let me sit wisely atop a tall mountain to announce Nicollet’s Law of Domain Specific Languages:

Any sufficiently complex data processing algorithm is as an interpreter for a small domain-specific language, and the data being processed is a program executed by the interpreter.

In some cases, this law only complicates things further. In many cases, however, the different angle it provides leads to many advantages, one of them being to transform a non-programming concept (such as an accounting file format) into a concept programmers are familiar with (a programming language).

A minimalist language design culture is enough to grasp several interesting concepts about executing code, which can be quite handy when processing data:

1. Compile to Bytecode

Interpreters don’t execute a string of characters. They tokenize that string, turn the tokens into an abstract syntax tree representing operations, functions and variables, then turn that syntax tree into a sequence of small, executable operations. That sequence is then fed into a virtual machine (or further compiled to machine code) to perform the actual operations.

If the input data for your algorithm is very complex, you can begin on the other side: what will the algorithm do with the data? Will it be inserting the data into a database? Constructing a data object from bits and pieces? What you are looking for is a set of atomic operations you can apply to generate the result. Implement these operations, then start working on a translation algorithm to turn the input data into such operations.

There are several common and friendly representations for such atomic bytecode:

Instruction lists are executed in order. This is your classic assembler listing, without the jumps. A typical “parse file and insert into database” algorithm would generate such an instruction list, and every instruction would be an INSERT, DELETE or UPDATE. Works best when you can read the data and generate the instructions in the right order: if you cannot get the list in the right order from the start, consider another approach.

Dependency graphs work like makefiles: you have several instruction lists floating around with relationships between them, indicating that one list has to be executed before another. A topological sort of the graph results in a single classic instruction list you can execute. A multi-file import, where some files contain data needed in other files, can be the way to go.

Nested scopes are the typical extension to instruction lists: every item in a list can be either an instruction, or another list, possibly tagged with some data. This could be a conditional (if this condition is true, execute this list), a loop (though it is best to avoid these) or a context (a “polygon” scope contains “insert vertex” operations that apply to that polygon). You can even allow variables in a let-in fashion (of which the polygon example above is just a special case) ! Note that nested scopes can be easily represented as XML.

2. Static Analysis

A side-effect of compiling to bytecode is that you get to process the entire file before you actually perform the intended operations. This makes a rollback easier if you notice that there’s an error on the last line of the file: if you make sure that no atomic operation in your target language can fail due to bad input (such as incorrect data values), then you can check your input data for correctness without doing anything to your program state.

Even better, if your compilation process is cheap (linearly traverse a file for parsing) and you have heuristics for predicting how much time and resources your individual instructions require, then you can try to accurately predict the needs of the entire process.

Static analysis also means you can optimize. If, for instance, you’re inserting data into a database and need to resolve names or keys frequently (such as “add this item to list #732″), you can easily construct a table of needed keys (that you can get in one query when the processing starts) using the dependency graph approach.You can also optimize resource allocation by using common register allocation techniques: sort your dependency graph to keep as few resources in memory as possible at any given time.

3. Caching

Try to perform most of the processing offline.

For instance, if you frequently “apply” one file to another, such as a nearly-constant “list of categories” file used to resolve the “category” key in a daily object import, you can benefit from compiling the nearly-constant file to an easily loaded, easily applied format.

You see a cached dictionary that maps keys to categories? I see a DSL that allows dictionary literals as part of the language, and a source file that contains a literal mapping keys to categories, with an interpreter that can apply constant propagation to dictionaries.

Another benefit is when applying changes to mission-critical software. Inserting lots of data into a web database can create a heavy load on the server and make the site unavailable to visitors. It might therefore be preferrable to pre-compile the imported data into requests through a process that keeps a light load on the server, then run the requests.

Besides, with proper nested scoping, you can slice an import into several transactions. This keeps the lock count low, allows spreading the transactions over time to reduce the load, and lets you resume the import process if, for some reason, it gets interrupted.

Oracle Nonogram

A nonogram is a number and paint puzzle. The player is given a grid of cells, some of which must be painted black. For every row and every column, the player is given the number of contiguous black segments, as well as the number of cells in each. This lets the player deduce the position of black cells.

Nonogram being solved (courtesy of Wikipedia)

Originally a pen-and-paper game, there have been several implementations as video games, notably by Nintendo. The online gaming website ArmorGames proposes a variant of the game: when you paint a tile that should not be painted, the game notifies you that the tile is actually unpainted and you lose one life—you lose after five misses.

Armor Picross 2 : an Oracle Nonogram

Armor Picross 2 : an Oracle Nonogram

The Armor Games version is an oracle nonogram: there is an all-knowing oracle that can answer truthfully a certain set of questions about the solution that cannot be determined easily (or at all).

The original version of the nonogram game can be solved by the computer in NP time (and some specific classes of nonograms can be solved in polynomial time). Can the oracle nonogram game be solved faster?

No. In fact, there is no general solution for the oracle nonogram. The wording on the oracle introduces a slight difference between the two problems: any way of painting blocks that is compatible with the constraints is a solution to the original nonogram game, but only one way of painting the blocks is a solution to the oracle nonogram (and it is against this solution that the oracle nonogram checks any paint attempts). So if there are several possible solutions to a given non-oracle nonogram, adding an oracle forces the player to ask the oracle the correct questions to determine which of the several solutions is the correct one.

In an NxN nonogram where each row and each column contains a single painted cell, there are N! different solutions: the top-left to bottom-right diagonal is an obvious solution, and any other solution can be obtained from that diagonal by shuffling the coumns. In the N=15 example above, the number of solutions:

15! = 1 307 674 368 000

You can get as many guesses as you want to find the correct solution, but you lose after five incorrect guesses. However, for any guessing strategy, I can create at least two solutions that guarantee the first four guesses will be incorrect (this is only possible if N ≥ 3, of course)…

2D Transforms

A point in a 2D space is commonly identified as an (x,y) pair, representing its distance to the origin along the x-axis and the y-axis. Objects in 2D space have two additional properties that make no sense for mere points: scale and angle (I will ignore here shearing and anisotropic scaling).

A square, scaled and rotated.

A square, scaled and rotated.

An obvious question is, how to represent such transformations in a computer program, such as a video game? The simplest way one could think of is to use a structure that contains:

struct position
{
  float x;
  float y;
  float angle;
  float scale;
};

This works in most situations. Most simple 2D video games will be able to get away with this simple version without much difficulty. But there is one thing that it cannot do easily.

Relative Transforms

Suppose for a moment that the object we’re moving around is a spaceship. That spaceship has one gun at the end of each wing, which means positions (-2,0) and (2,0) relative to the center of the spaceship. Now, the ship is placed at a certain position, rotated and scaled. Where should the bullets be fired from?

What is the absolute position and angle of the guns?

What is the absolute position and angle of the guns?

On the one hand, we have an absolute position (where the ship is, its angle and its scale) and on the other hand we have a relative position (where the gun is, relative to the center of the ship). The objective is to determine the absolute position of the gun. Mathematically, if an object is at position (a,b) relative to an object that is at position (x,y) with scale s and angle θ, then its absolute position (a,’b) is:

a’ = x + s a cos θs b sin θ
b’ = y + s b
cos θ + s a sin θ

But what happens if the ship itself is part of a greater whole? Or what if the gun itself is in fact a turret that can rotate independently around a point on the wing?

To handle this, we define a transform as “something that moves an object from a certain position, angle and scale (the source) to another position, angle and scale (the destination)”. The ship has an absolute transform, which moves it from point (0,0), scale 100%, angle 0° to its actual position, scale and angle. The gun has a relative transform, which moves it from the ship’s position, scale and angle to its actual position, scale and angle.

So, the absolute transform of a gun is obtained by applying the absolute transform (move the ship to where it should be) then the relative transform (move the gun to where it should be based on the ship’s position). And once you have the absolute transform of the gun, you can deduce its position, scale and angle.

The problem is that using the above “position, scale, angle” representation for transforms is not handy, because applying two transforms one after another gets complicated very fast. One solution is to use a matrix, which is a commonly used tool for representing transforms. Another solution is using complex numbers.

Complex Numbers

A complex number is an object of the form a + ib, where a is the real part and b is the imaginary part. Complex number multiplication uses the property that i² = -1. From there, we deduce the rules for complex number addition and multiplication:

(a + ib) + (a’ + ib’) = (a + a’) + i(b + b’)

(a + ib) (a’ + ib’) = (aa’ – bb’) + i(ab’ + a’b)

So, how does this help us? Well, consider our original position/angle/scale approach, call these x, y, θ and s and build the complex number function:

t(z) = s (cos θ + i sin θ) z + (x + iy)

This function represents a relative transform. Calling it on a complex number (a + ib) will return a complex number (a’ + ib’) such that (a’,b’) will be the absolute position of a point that was at relative position (a,b) from the center of the transformed object:

a’ = x + s a cos θs b sin θ
b’ = y + s b
cos θ + s a sin θ

These are the exact formulas we discussed above!

So now, applying two transforms in a row is easy. If gun(z) is the he transform of the gun on the plane’s wing, and plane(z) is the transform of the plane in the world, then the transform of the gun in the world is plane(gun(z)). And since the functions are actually linear complex functions, composing them is exceedingly easy:

t(z) = a z + b
t’(z) = a
z + b’
t’(t(z)) = a’
(a z + b) + b’ = aa’ z + (a’b + b’)

Or, in terms of C++:

struct transform
{
  std::complex<double> a, b;

  transform(double x, double y, double angle = 0.0, double scale = 1.0)
    : a(scale * std::cos(angle), scale * std::sin(angle)), b(x, y)  {}

  transform &operator *=(const transform& o)
  {
    b += a * o.b;
    a *= o.a;
    return *this;
  }

  double x() const { return b.real(); }
  double y() const { return b.imag(); }
  double angle() const { return std::arg(a); }
  double scale() const { return std::abs(a); }
};

transform operator*(const transform &f, const transform &g)
{
  transform result(f);
  return f *= g;
}

Here, t’ * t represents the transform t’(t(z)).

Heaps of Knowledge

A question I like to ask for evaluating programming skill is, “Implement Heap Sort” (in whatever language I need programming in).

Heap sort looks something like this (the example is in C++, but the implementation is fairly similar in just about any imperative language):

void swap(int &a, int &b) {
  int c = a;
  a = b;
  b = c;
}

void sift(int *data, int i, int size) {
  int l = i * 2 + 1, r = l + 1;
  int argmax = i;
  if (l < size && data[argmax] < data[l]) argmax = l;
  if (r < size && data[argmax] < data[r]) argmax = r;
  if (argmax != i) {
    swap(data[i], data[argmax]);
    sift(data, argmax, size);
  }
}

void heapsort(int *data, int size) {
  for (int i = (size-1)/2; i >= 0; --i) sift(data,i,size);
  for (int i = size; i >= 2; --i) {
    swap(data[i-1],data[0]);
    sift(data,0,i-1);
  }
}

This short version requires some experience to get right, as well as knowledge of the algorithm. I suspect that a large number of programmers have never heard of heap sort, and of those who have heard of it, few will know how to implement it on their own. Of course, if you ask someone who knows how to write, you will usually get a pleasant surprise. But most of the time, you’ll be facing someone who just doesn’t know the answer. That’s the entire point: what people know is important, but it’s far more important to know how people will react when they don’t know (and especially, how quickly they can learn).

If the programmer doesn’t know what heap sort is, you can nudge him in the right direction by explaining how heap sort works. Basically, heap sort is an improvement over the naive selection sort: instead of simply extracting the maximum element from the unsorted area in the array, heap sort turns the unsorted area into a heap—a binary tree where the root is the maximum element of the tree and each sub-tree is also a heap—thereby making extraction of a maximum a Θ(log n) task instead of the usual Θ(n). The trick behind heap sort is that the unsorted area of the array can be seen as a heap, by deciding that every index in the array is a node in the tree, with the left child being at index (2n+1) and the right child being at index (2n+2).

From there, it’s fairly elementary to deduce that the algorithm should first turn the entire array into a heap, then remove the first element, restore the heap, and repeat the process until the heap is empty. The sift function itself should be guessed fairly easily by anyone with any experience in recursive functions (it is, after all, merely a question of constructing a large heap from two smaller heaps).

The benefits I see to the question are:

  • See how the interviewee or student handles a situation where he has no idea about how to answer a question. Do they stare without saying a word? Do they get angry at an impossible question? Or do they admit they don’t know, and try to gather more information?
  • Determine how much general Computer Science knowledge the candidate has. As you explain Heap Sort, do they already know what heaps and selection sort are? Do they have to ask what a binary tree is? Do they have to ask what an array or a sort is, or how to swap two values?
  • Detect positive laziness and knowledge of standard libraries. Does the candidate mention they’d use the standard library sort in production code? In C++, do they already know of std::make_heap or std::swap?
  • Test basic knowledge of recursion and iteration. Does the interviewee ‘get’ the principle of sifting on the first try? Are the bounds for the two for-loops correct? Can the interviewee evaluate the complexity of the function?
  • Test basic knowledge of the language idioms. Are the functions declared correctly? Are the control structures used correctly? Is the choice of a storage type appropriate for the array?