Tag Archive for 'Algorithms'

P = NP for Normal People

A few days ago, an Indian computer scientist working for HP has announced that he has proven that P ≠ NP, solving the decades-long search for an answer to the P = NP question. A lot of computer scientists, me included, are quite excited over this, and waiting to see whether the proof actually holds (it’s long and complex enough that only people with a lot of spare time on their hands can actually determine if it’s correct). To us, this is the equivalent of finding life in outer space or learning who killed JFK.

Why are we so excited? Here’s a short explanation for normal people:

Linear Complexity

Suppose you’re in the outsourcing business, and all the world’s spell-checking algorithms have mysteriously disappeared overnight. Being the clever businessman that you always were, you hire a hundred workers from India, buy them a hundred English dictionaries, and start sending them pieces of text with instructions to reply with a list of words that were not in the dictionary.

Having a minimum-wage offshore worker look up a word in a dictionary costs you $0.01; this means that if your customers pay you $0.02 / word, you’re a winner regardless of how large the text is. In essence, spell-checking a text has a linear complexity, because the price as a function of the text size is a line:

linear

Polynomial Complexity

Not everything is linear. Suppose now that you have a different business: big corporations and political parties give you a list of things, and you sort them by popular preference. For instance, you could determine that Pepsi > Coke Light > Coke. You do this by polling people. A “pick one” poll costs you $100 to set up and execute.

If asked to sort two items (Pepsi and Coke), you would need only one poll – $60 per item is a good price (you earn $120 and pay $100).

If asked to sort three items (Pepsi, Coke and Coke Light), one “pick one” poll is not enough : what if everyone loves Pepsi, and there’s not enough answers to order Coke and Coke Light? You need two polls. Now $60/item runs you into the red (you earn $180 and pay $100). So, you increase your price to $100/item just to be sure you’ll never be in the negative. Right?

Well, there are exactly 24 ways of ordering a list of four items:

1 2 3 4 2 3 4 1 3 4 1 2 4 1 2 3
2 1 3 4 1 3 4 2 3 4 2 1 4 2 1 3
3 2 1 4 2 1 4 3 1 4 3 2 4 3 2 1
4 2 3 1 2 3 1 4 3 1 4 2 1 4 2 3
1 3 2 4 3 2 4 1 2 4 1 3 4 1 3 2
1 2 4 3 2 4 3 1 4 3 1 2 3 1 2 4

On the other hand, if you run four “pick A or B” polls, you only get 16 different outcomes:

A A A A A A A B A A B A A A B B
A B A A A B A B A B B A A B B B
B A A A B A A B B A B A B A B B
B B A A B B A B B B B A B B B B

So, you have a testing strategy that can give you only 16 different results, and based on those results you have to find which one of 24 situations you are in. This is impossible. You need a fifth test, which will increase the number of results to 32. Now you’re earning $400 but paying $500…

Adding an item to the list multiplies the number of permutations by the number of elements in the list: adding a tenth item multiplies it by ten. On the other hand, adding a new poll only multiplies the number of results by two. This means that the number of polls will increase faster than the number of items. No matter what per-item price you select, you will run out of money sooner or later. Keep this in mind: it’s going to be very important soon.

This plot represents cost (red) and earnings (blue) as a function of the number of items, for a per-item price of $1,000 :

nlogn

The set of all problems that can be solved for a profit, regardless of size, at a per-item price, is the set of problems with linear cost; in computer science lingo, we write this cost as O(n), meaning that you can find a constant (here, the price) such that the cost will be smaller than n times this constant for a problem of size n. Looking for words in a dictionary is a known O(n) problem. Sorting a list of items by comparing them to each other has been proven not to be an O(n) problem (and the description above is a summary of the proof).

Suppose you go for a different pricing method: to sort a list of n items, you ask for $100 × n × n. Two items is $400 (cost: $100), three items is $900 (cost: $200), four items is $1600 (cost $500). You are guaranteed to make a profit because the worst that could happen is that you compare every item to every other item, which means polls, and you would still get enough to pay for all of them.

Sorting a list of n items is hence known to be an O(n²) problem.

Remember the definition of linear cost? There exists a constant k such that the cost of working with n items is smaller than than k × n. There is a similar definition for polynomial cost: there exists a constant k such that the cost of working with n items is smaller than nkn multiplied by itself k times. All linear-cost problems are, by definition, polynomial-cost problems as well.

In the expression “P = NP”, the “P” represents all problems that can be solved at a Polynomial cost.

Non-Deterministic Polynomial Complexity

Let’s think of another business idea. Now, you’re a tour optimizer: salesmen give you a list of cities they need to travel through to sell their wares, and you provide them with a sequence of visits that has the smallest travel costs and never goes more than once through any given city. This is a common toy problem in computer science for which no polynomial-cost solution is known—The Traveling Salesman Problem.

As a quick comparison:

  • Spell-checking a 10-word text involves 10 dictionary searches – O(n)
  • A naive sort of 10 elements takes, at most, 100 comparisons – O(n²)
  • The traveling salesman problem with 10 cities involves testing 3,628,800 paths.

So, you set up a contest: you hand out the problem to a billion minimum wage workers, and you only pay the worker who found the shortest travel path. So, each of them starts computing the length of a certain path, and there are so many workers that every path will be worked on. After a while, they all submit their solutions, and you keep the shortest one.

Each worker spent a polynomial time selecting a travel path and computing its length: they only had to add the distances between consecutive cities in the travel path. So, the cost of solving the problem this way is polynomial. The important thing here is that you did not know what worker you would pay until the answers came in. In computer science, we say that your solving strategy is non-deterministic.

In the expression “P = NP”, the “NP” represents all problems that can be solved at a polynomial cost using a non-deterministic solving strategy — Non-deterministic Polynomial.

P = NP ?

Any P problem is also an NP problem : if you can solve it by paying someone, you can solve it by holding a contest and paying the winner that same amount. The question is whether any NP problem is also a P problem (hence, P = NP) : if you can hold a contest to solve the problem, can you hire someone to do the same while retaining a polynomial cost?

This question is important because our current computing technology (electricity-through-silicon-transistors) does not support non-deterministic computing. So, if we could find efficient deterministic solutions to every problem inside NP, we would be quite happy about it.

Actually… we wouldn’t. Many security features in computer science rely on problems such that 1° solving the problem is very complex but 2° checking whether a solution is correct is very simple. For instance, checking that a password is correct is a P problem (your computer does it several times a day), but finding the password is an NP problem (you have to try all passwords). If a P solution could be found to the “find the password” problem, we would be in trouble.

Turning recursion into loops

This message is a rehash of an earlier post on Stack Overflow. It’s self-sufficient enough to be featured here, and I like to keep my content where I can see it ;) If you’re not a computer genius, you can wait until next Monday for an article that can be read by standard issue humans.

The premise is to take a standard tree traversal function (which is a recursive algorithm) and implement it without recursion. The recursive version is:

traverse(node):
  if node != None do:
    traverse(node.left)
    print node.value
    traverse(node.right)
  endif

Try doing it on your own, then start reading the solution below once your brain is fried.

The solution

This is a clear case of tail recursion, so you can easily turn it into a while-loop.

traverse(node):
  while node != None do:
    traverse(node.left)
    print node.value
    node = node.right
  endwhile

You’re left with a recursive call. What the recursive call does is push a new context on the stack, run the code from the beginning, then retrieve the context and keep doing what it was doing. So, you create a stack for storage, and a loop that determines, on every iteration, whether we’re in a “first run” situation (non-null node) or a “returning” situation (null node, non-empty stack) and runs the appropriate code:

traverse(node):
  stack = []
  while !empty(stack) || node != None do:
    if node != None do: // this is a normal call, recurse
      push(stack,node)
      node = node.left
    else // we're returning: pop and print the current node
      node = pop(stack)
      print node.value
      node = node.right
    endif
  endwhile

The hard thing to grasp is the “return” part: you have to determine, in your loop, whether the code you’re running is in the “entering the function” situation or in the “returning from a call” situation, and you will have an if/else chain with as many cases as you have non-terminal recursions in your code.

In this specific situation, we’re using the node to keep information about the situation. Another way would be to store that in the stack itself (just like a computer does for recursion). With that technique, the code is less optimal, but easier to follow

traverse(node):
  // entry:
  if node == NULL do return
  traverse(node.left)
  // after-left-traversal:
  print node.value
  traverse(node.right)

traverse(node):
   stack = [node,'entry']
   while !empty(stack) do:
     [node,state] = pop(stack)
     switch state:
       case 'entry':
         if node == None do: break // return
         push(stack,[node,'after-left-traversal']) // store return address
         push(stack,[node.left,'entry']) // recursive call
         break
       case 'after-left-traversal':
         print node.value;
         // tail call : no return address
         push(stack,[node.right,'entry']) // recursive call
      end
    endwhile 

Interest(ing) rates

The most common way of investing money is putting it in a savings account. You lend a fixed amount of money to someone, and they pay interest over that money at a predetermined rate. Let’s say you lend 1,000 € at an interest rate of 3%, paid every year: at the end of the year, you would receive 30 € as payment for your lending. You would spend these on fine wine or nice clothes and wait until the next year to get another 30 €, and so on.

Savings accounts work on the basis of simple interest : what you get paid is a linear function of both time and money. Lend for half a year? 3% ÷ 2 = 1.5% Lend for two years? 3% ×2 = 6%

An important thing to bear in mind is that interest is paid at fixed intervals, for instance at the beginning of January. You don’t have to spend those 30 € : you can them on the savings account and earn simple interest on them after a year (3% of 30 € is 0.90 €).

Using this strategy, lending for two years is done at a 6.09% rate instead of 6%, because you get interest on interest. This is known as compound interest : what you get paid is an exponential function of time. Lend for two years ? (+3%)² = +6.09% Lend for three years ? (+3%)³ = +9,27%

The mathematical justification is that, with a 3% interest, your total amount of money is multiplied by 1.03 every year:

1,000 + 30 = 1,000 + 3% of 1,000 = 1,000 + 0.03 × 1,000 = 1.03 × 1,000

So, after two years, the amount is multiplied by 1.03 two times, and so on.

1,060.90 = 1.03 × 1,030 = 1.03 × 1.03 × 1,000

In short, percentages have a multiplicative effect.

And now, pop quiz : I’ve gained +5% weight over the winter holidays. What percentage of my weight do I have to lose to be back to normal ?

If you answered -5%, you missed the point. Multiplicative effect means the total change of weight would be +5% × -5% = 1.05 × 0.95 = 0.9975 = -0.25%. I would be losing too much weight !

The correct answer was 1 ÷ 1.05 = -4.76%.

Similarly, if the number of graduates of a given school increases by +10% on year one and +25% on year two, the total increase is +37.5% and not +35%.

Duality

This is where mathematicians (and computer scientists) use an interesting little concept called duality. Percentages are numbers that are easy to understand, but hard to combine. We can transform them into something that is a little bit harder to understand, but easier to combine.

The traditional way to transform multiplication into addition is to exponentiate, due to an interesting property of the exponential function:

exp(a) ×exp(b) = exp(a + b)

So, I wish to find a percentage operator (§) such that:

  • we conserve some values, 0§ = 0% and 100§ = 100%
  • applying A§, then B§, is equivalent to applying (A+B)§

Then this uniquely defines an operator which is called exponential percentage:

A§ = B%  ↔  A = 100 × log(1 + B ÷ 100) ÷ log(2)

Some common values:

0% = 0§ +100% = +100§ -100% = -∞§ 200% = 158.4§
+1% = +1.4§ +99% = +99.2§ -1% = -1.4§ -99% = -664§
+10% = +13.7§ +90% = +92.6§ -10% = -15.2§ -90% = -332§
+25% = +32.2§ +75% = +80.7§ +50% = +58.4§ -50% = -100§

percent

So, if I gained +5§ weight over the holidays, I can lose -5§ weight and be back to where I started, and if a number increases by 10§, then by 25§, it increases by 35§ overall.

And of course, a yearly interest rate of 4.2§ = 3% compounded over ten years is 42§ = 34%.

No Free Lunch

Normal percentage rules make compounding hard, but it’s reasonably easy to estimate a percentage based on a fraction. Exponential percentage rules make compounding easy, but evaluating a percentage based on real figures is harder.

In practice, compounding happens less often than evaluating, so humans use normal percentage rules. And computers are good at compounding through multiplication, so they don’t need exponentiation.

Duality does have some other uses, though. For instance, there’s the duality between two representations of complex numbers:

a + ib = r exp iθ

The cartesian (a,b) notation makes it easier to add numbers, but multiplication is harder:

a + ib + c + id = (a+c) + i(b+d)

The polar (r,θ) notation makes it easier to multiply numbers, but addition is harder:

r exp iθ × s exp iφ = (r × s) exp i(θ+φ)

For mathematically-oriented computer scientists, duality is a gold mine, because it lets one reduce a complex problem in one area to a simpler problem in another area (whether simpler means faster, as in the case of FFT, or easier to think about)..

The Law of DSLs

There’s one common duality that is fundamental in the computer world: the correspondence between data and code. In a fit of narcissism, let me sit wisely atop a tall mountain to announce Nicollet’s Law of Domain Specific Languages:

Any sufficiently complex data processing algorithm is as an interpreter for a small domain-specific language, and the data being processed is a program executed by the interpreter.

In some cases, this law only complicates things further. In many cases, however, the different angle it provides leads to many advantages, one of them being to transform a non-programming concept (such as an accounting file format) into a concept programmers are familiar with (a programming language).

A minimalist language design culture is enough to grasp several interesting concepts about executing code, which can be quite handy when processing data:

1. Compile to Bytecode

Interpreters don’t execute a string of characters. They tokenize that string, turn the tokens into an abstract syntax tree representing operations, functions and variables, then turn that syntax tree into a sequence of small, executable operations. That sequence is then fed into a virtual machine (or further compiled to machine code) to perform the actual operations.

If the input data for your algorithm is very complex, you can begin on the other side: what will the algorithm do with the data? Will it be inserting the data into a database? Constructing a data object from bits and pieces? What you are looking for is a set of atomic operations you can apply to generate the result. Implement these operations, then start working on a translation algorithm to turn the input data into such operations.

There are several common and friendly representations for such atomic bytecode:

Instruction lists are executed in order. This is your classic assembler listing, without the jumps. A typical “parse file and insert into database” algorithm would generate such an instruction list, and every instruction would be an INSERT, DELETE or UPDATE. Works best when you can read the data and generate the instructions in the right order: if you cannot get the list in the right order from the start, consider another approach.

Dependency graphs work like makefiles: you have several instruction lists floating around with relationships between them, indicating that one list has to be executed before another. A topological sort of the graph results in a single classic instruction list you can execute. A multi-file import, where some files contain data needed in other files, can be the way to go.

Nested scopes are the typical extension to instruction lists: every item in a list can be either an instruction, or another list, possibly tagged with some data. This could be a conditional (if this condition is true, execute this list), a loop (though it is best to avoid these) or a context (a “polygon” scope contains “insert vertex” operations that apply to that polygon). You can even allow variables in a let-in fashion (of which the polygon example above is just a special case) ! Note that nested scopes can be easily represented as XML.

2. Static Analysis

A side-effect of compiling to bytecode is that you get to process the entire file before you actually perform the intended operations. This makes a rollback easier if you notice that there’s an error on the last line of the file: if you make sure that no atomic operation in your target language can fail due to bad input (such as incorrect data values), then you can check your input data for correctness without doing anything to your program state.

Even better, if your compilation process is cheap (linearly traverse a file for parsing) and you have heuristics for predicting how much time and resources your individual instructions require, then you can try to accurately predict the needs of the entire process.

Static analysis also means you can optimize. If, for instance, you’re inserting data into a database and need to resolve names or keys frequently (such as “add this item to list #732″), you can easily construct a table of needed keys (that you can get in one query when the processing starts) using the dependency graph approach.You can also optimize resource allocation by using common register allocation techniques: sort your dependency graph to keep as few resources in memory as possible at any given time.

3. Caching

Try to perform most of the processing offline.

For instance, if you frequently “apply” one file to another, such as a nearly-constant “list of categories” file used to resolve the “category” key in a daily object import, you can benefit from compiling the nearly-constant file to an easily loaded, easily applied format.

You see a cached dictionary that maps keys to categories? I see a DSL that allows dictionary literals as part of the language, and a source file that contains a literal mapping keys to categories, with an interpreter that can apply constant propagation to dictionaries.

Another benefit is when applying changes to mission-critical software. Inserting lots of data into a web database can create a heavy load on the server and make the site unavailable to visitors. It might therefore be preferrable to pre-compile the imported data into requests through a process that keeps a light load on the server, then run the requests.

Besides, with proper nested scoping, you can slice an import into several transactions. This keeps the lock count low, allows spreading the transactions over time to reduce the load, and lets you resume the import process if, for some reason, it gets interrupted.

Oracle Nonogram

A nonogram is a number and paint puzzle. The player is given a grid of cells, some of which must be painted black. For every row and every column, the player is given the number of contiguous black segments, as well as the number of cells in each. This lets the player deduce the position of black cells.

Nonogram being solved (courtesy of Wikipedia)

Originally a pen-and-paper game, there have been several implementations as video games, notably by Nintendo. The online gaming website ArmorGames proposes a variant of the game: when you paint a tile that should not be painted, the game notifies you that the tile is actually unpainted and you lose one life—you lose after five misses.

Armor Picross 2 : an Oracle Nonogram

Armor Picross 2 : an Oracle Nonogram

The Armor Games version is an oracle nonogram: there is an all-knowing oracle that can answer truthfully a certain set of questions about the solution that cannot be determined easily (or at all).

The original version of the nonogram game can be solved by the computer in NP time (and some specific classes of nonograms can be solved in polynomial time). Can the oracle nonogram game be solved faster?

No. In fact, there is no general solution for the oracle nonogram. The wording on the oracle introduces a slight difference between the two problems: any way of painting blocks that is compatible with the constraints is a solution to the original nonogram game, but only one way of painting the blocks is a solution to the oracle nonogram (and it is against this solution that the oracle nonogram checks any paint attempts). So if there are several possible solutions to a given non-oracle nonogram, adding an oracle forces the player to ask the oracle the correct questions to determine which of the several solutions is the correct one.

In an NxN nonogram where each row and each column contains a single painted cell, there are N! different solutions: the top-left to bottom-right diagonal is an obvious solution, and any other solution can be obtained from that diagonal by shuffling the coumns. In the N=15 example above, the number of solutions:

15! = 1 307 674 368 000

You can get as many guesses as you want to find the correct solution, but you lose after five incorrect guesses. However, for any guessing strategy, I can create at least two solutions that guarantee the first four guesses will be incorrect (this is only possible if N ≥ 3, of course)…

2D Transforms

A point in a 2D space is commonly identified as an (x,y) pair, representing its distance to the origin along the x-axis and the y-axis. Objects in 2D space have two additional properties that make no sense for mere points: scale and angle (I will ignore here shearing and anisotropic scaling).

A square, scaled and rotated.

A square, scaled and rotated.

An obvious question is, how to represent such transformations in a computer program, such as a video game? The simplest way one could think of is to use a structure that contains:

struct position
{
  float x;
  float y;
  float angle;
  float scale;
};

This works in most situations. Most simple 2D video games will be able to get away with this simple version without much difficulty. But there is one thing that it cannot do easily.

Relative Transforms

Suppose for a moment that the object we’re moving around is a spaceship. That spaceship has one gun at the end of each wing, which means positions (-2,0) and (2,0) relative to the center of the spaceship. Now, the ship is placed at a certain position, rotated and scaled. Where should the bullets be fired from?

What is the absolute position and angle of the guns?

What is the absolute position and angle of the guns?

On the one hand, we have an absolute position (where the ship is, its angle and its scale) and on the other hand we have a relative position (where the gun is, relative to the center of the ship). The objective is to determine the absolute position of the gun. Mathematically, if an object is at position (a,b) relative to an object that is at position (x,y) with scale s and angle θ, then its absolute position (a,’b) is:

a’ = x + s a cos θs b sin θ
b’ = y + s b
cos θ + s a sin θ

But what happens if the ship itself is part of a greater whole? Or what if the gun itself is in fact a turret that can rotate independently around a point on the wing?

To handle this, we define a transform as “something that moves an object from a certain position, angle and scale (the source) to another position, angle and scale (the destination)”. The ship has an absolute transform, which moves it from point (0,0), scale 100%, angle 0° to its actual position, scale and angle. The gun has a relative transform, which moves it from the ship’s position, scale and angle to its actual position, scale and angle.

So, the absolute transform of a gun is obtained by applying the absolute transform (move the ship to where it should be) then the relative transform (move the gun to where it should be based on the ship’s position). And once you have the absolute transform of the gun, you can deduce its position, scale and angle.

The problem is that using the above “position, scale, angle” representation for transforms is not handy, because applying two transforms one after another gets complicated very fast. One solution is to use a matrix, which is a commonly used tool for representing transforms. Another solution is using complex numbers.

Complex Numbers

A complex number is an object of the form a + ib, where a is the real part and b is the imaginary part. Complex number multiplication uses the property that i² = -1. From there, we deduce the rules for complex number addition and multiplication:

(a + ib) + (a’ + ib’) = (a + a’) + i(b + b’)

(a + ib) (a’ + ib’) = (aa’ – bb’) + i(ab’ + a’b)

So, how does this help us? Well, consider our original position/angle/scale approach, call these x, y, θ and s and build the complex number function:

t(z) = s (cos θ + i sin θ) z + (x + iy)

This function represents a relative transform. Calling it on a complex number (a + ib) will return a complex number (a’ + ib’) such that (a’,b’) will be the absolute position of a point that was at relative position (a,b) from the center of the transformed object:

a’ = x + s a cos θs b sin θ
b’ = y + s b
cos θ + s a sin θ

These are the exact formulas we discussed above!

So now, applying two transforms in a row is easy. If gun(z) is the he transform of the gun on the plane’s wing, and plane(z) is the transform of the plane in the world, then the transform of the gun in the world is plane(gun(z)). And since the functions are actually linear complex functions, composing them is exceedingly easy:

t(z) = a z + b
t’(z) = a
z + b’
t’(t(z)) = a’
(a z + b) + b’ = aa’ z + (a’b + b’)

Or, in terms of C++:

struct transform
{
  std::complex<double> a, b;

  transform(double x, double y, double angle = 0.0, double scale = 1.0)
    : a(scale * std::cos(angle), scale * std::sin(angle)), b(x, y)  {}

  transform &operator *=(const transform& o)
  {
    b += a * o.b;
    a *= o.a;
    return *this;
  }

  double x() const { return b.real(); }
  double y() const { return b.imag(); }
  double angle() const { return std::arg(a); }
  double scale() const { return std::abs(a); }
};

transform operator*(const transform &f, const transform &g)
{
  transform result(f);
  return f *= g;
}

Here, t’ * t represents the transform t’(t(z)).

Heaps of Knowledge

A question I like to ask for evaluating programming skill is, “Implement Heap Sort” (in whatever language I need programming in).

Heap sort looks something like this (the example is in C++, but the implementation is fairly similar in just about any imperative language):

void swap(int &a, int &b) {
  int c = a;
  a = b;
  b = c;
}

void sift(int *data, int i, int size) {
  int l = i * 2 + 1, r = l + 1;
  int argmax = i;
  if (l < size && data[argmax] < data[l]) argmax = l;
  if (r < size && data[argmax] < data[r]) argmax = r;
  if (argmax != i) {
    swap(data[i], data[argmax]);
    sift(data, argmax, size);
  }
}

void heapsort(int *data, int size) {
  for (int i = (size-1)/2; i >= 0; --i) sift(data,i,size);
  for (int i = size; i >= 2; --i) {
    swap(data[i-1],data[0]);
    sift(data,0,i-1);
  }
}

This short version requires some experience to get right, as well as knowledge of the algorithm. I suspect that a large number of programmers have never heard of heap sort, and of those who have heard of it, few will know how to implement it on their own. Of course, if you ask someone who knows how to write, you will usually get a pleasant surprise. But most of the time, you’ll be facing someone who just doesn’t know the answer. That’s the entire point: what people know is important, but it’s far more important to know how people will react when they don’t know (and especially, how quickly they can learn).

If the programmer doesn’t know what heap sort is, you can nudge him in the right direction by explaining how heap sort works. Basically, heap sort is an improvement over the naive selection sort: instead of simply extracting the maximum element from the unsorted area in the array, heap sort turns the unsorted area into a heap—a binary tree where the root is the maximum element of the tree and each sub-tree is also a heap—thereby making extraction of a maximum a Θ(log n) task instead of the usual Θ(n). The trick behind heap sort is that the unsorted area of the array can be seen as a heap, by deciding that every index in the array is a node in the tree, with the left child being at index (2n+1) and the right child being at index (2n+2).

From there, it’s fairly elementary to deduce that the algorithm should first turn the entire array into a heap, then remove the first element, restore the heap, and repeat the process until the heap is empty. The sift function itself should be guessed fairly easily by anyone with any experience in recursive functions (it is, after all, merely a question of constructing a large heap from two smaller heaps).

The benefits I see to the question are:

  • See how the interviewee or student handles a situation where he has no idea about how to answer a question. Do they stare without saying a word? Do they get angry at an impossible question? Or do they admit they don’t know, and try to gather more information?
  • Determine how much general Computer Science knowledge the candidate has. As you explain Heap Sort, do they already know what heaps and selection sort are? Do they have to ask what a binary tree is? Do they have to ask what an array or a sort is, or how to swap two values?
  • Detect positive laziness and knowledge of standard libraries. Does the candidate mention they’d use the standard library sort in production code? In C++, do they already know of std::make_heap or std::swap?
  • Test basic knowledge of recursion and iteration. Does the interviewee ‘get’ the principle of sifting on the first try? Are the bounds for the two for-loops correct? Can the interviewee evaluate the complexity of the function?
  • Test basic knowledge of the language idioms. Are the functions declared correctly? Are the control structures used correctly? Is the choice of a storage type appropriate for the array?


693 feed subscribers
(readers who polled a feed this week)