P = NP for Normal People

A few days ago, an Indian computer scientist working for HP has announced that he has proven that P ≠ NP, solving the decades-long search for an answer to the P = NP question. A lot of computer scientists, me included, are quite excited over this, and waiting to see whether the proof actually holds (it’s long and complex enough that only people with a lot of spare time on their hands can actually determine if it’s correct). To us, this is the equivalent of finding life in outer space or learning who killed JFK.

Why are we so excited? Here’s a short explanation for normal people:

Linear Complexity

Suppose you’re in the outsourcing business, and all the world’s spell-checking algorithms have mysteriously disappeared overnight. Being the clever businessman that you always were, you hire a hundred workers from India, buy them a hundred English dictionaries, and start sending them pieces of text with instructions to reply with a list of words that were not in the dictionary.

Having a minimum-wage offshore worker look up a word in a dictionary costs you $0.01; this means that if your customers pay you $0.02 / word, you’re a winner regardless of how large the text is. In essence, spell-checking a text has a linear complexity, because the price as a function of the text size is a line:

linear

Polynomial Complexity

Not everything is linear. Suppose now that you have a different business: big corporations and political parties give you a list of things, and you sort them by popular preference. For instance, you could determine that Pepsi > Coke Light > Coke. You do this by polling people. A “pick one” poll costs you $100 to set up and execute.

If asked to sort two items (Pepsi and Coke), you would need only one poll – $60 per item is a good price (you earn $120 and pay $100).

If asked to sort three items (Pepsi, Coke and Coke Light), one “pick one” poll is not enough : what if everyone loves Pepsi, and there’s not enough answers to order Coke and Coke Light? You need two polls. Now $60/item runs you into the red (you earn $180 and pay $100). So, you increase your price to $100/item just to be sure you’ll never be in the negative. Right?

Well, there are exactly 24 ways of ordering a list of four items:

1 2 3 4 2 3 4 1 3 4 1 2 4 1 2 3
2 1 3 4 1 3 4 2 3 4 2 1 4 2 1 3
3 2 1 4 2 1 4 3 1 4 3 2 4 3 2 1
4 2 3 1 2 3 1 4 3 1 4 2 1 4 2 3
1 3 2 4 3 2 4 1 2 4 1 3 4 1 3 2
1 2 4 3 2 4 3 1 4 3 1 2 3 1 2 4

On the other hand, if you run four “pick A or B” polls, you only get 16 different outcomes:

A A A A A A A B A A B A A A B B
A B A A A B A B A B B A A B B B
B A A A B A A B B A B A B A B B
B B A A B B A B B B B A B B B B

So, you have a testing strategy that can give you only 16 different results, and based on those results you have to find which one of 24 situations you are in. This is impossible. You need a fifth test, which will increase the number of results to 32. Now you’re earning $400 but paying $500…

Adding an item to the list multiplies the number of permutations by the number of elements in the list: adding a tenth item multiplies it by ten. On the other hand, adding a new poll only multiplies the number of results by two. This means that the number of polls will increase faster than the number of items. No matter what per-item price you select, you will run out of money sooner or later. Keep this in mind: it’s going to be very important soon.

This plot represents cost (red) and earnings (blue) as a function of the number of items, for a per-item price of $1,000 :

nlogn

The set of all problems that can be solved for a profit, regardless of size, at a per-item price, is the set of problems with linear cost; in computer science lingo, we write this cost as O(n), meaning that you can find a constant (here, the price) such that the cost will be smaller than n times this constant for a problem of size n. Looking for words in a dictionary is a known O(n) problem. Sorting a list of items by comparing them to each other has been proven not to be an O(n) problem (and the description above is a summary of the proof).

Suppose you go for a different pricing method: to sort a list of n items, you ask for $100 × n × n. Two items is $400 (cost: $100), three items is $900 (cost: $200), four items is $1600 (cost $500). You are guaranteed to make a profit because the worst that could happen is that you compare every item to every other item, which means polls, and you would still get enough to pay for all of them.

Sorting a list of n items is hence known to be an O(n²) problem.

Remember the definition of linear cost? There exists a constant k such that the cost of working with n items is smaller than than k × n. There is a similar definition for polynomial cost: there exists a constant k such that the cost of working with n items is smaller than nkn multiplied by itself k times. All linear-cost problems are, by definition, polynomial-cost problems as well.

In the expression “P = NP”, the “P” represents all problems that can be solved at a Polynomial cost.

Non-Deterministic Polynomial Complexity

Let’s think of another business idea. Now, you’re a tour optimizer: salesmen give you a list of cities they need to travel through to sell their wares, and you provide them with a sequence of visits that has the smallest travel costs and never goes more than once through any given city. This is a common toy problem in computer science for which no polynomial-cost solution is known—The Traveling Salesman Problem.

As a quick comparison:

  • Spell-checking a 10-word text involves 10 dictionary searches – O(n)
  • A naive sort of 10 elements takes, at most, 100 comparisons – O(n²)
  • The traveling salesman problem with 10 cities involves testing 3,628,800 paths.

So, you set up a contest: you hand out the problem to a billion minimum wage workers, and you only pay the worker who found the shortest travel path. So, each of them starts computing the length of a certain path, and there are so many workers that every path will be worked on. After a while, they all submit their solutions, and you keep the shortest one.

Each worker spent a polynomial time selecting a travel path and computing its length: they only had to add the distances between consecutive cities in the travel path. So, the cost of solving the problem this way is polynomial. The important thing here is that you did not know what worker you would pay until the answers came in. In computer science, we say that your solving strategy is non-deterministic.

In the expression “P = NP”, the “NP” represents all problems that can be solved at a polynomial cost using a non-deterministic solving strategy — Non-deterministic Polynomial.

P = NP ?

Any P problem is also an NP problem : if you can solve it by paying someone, you can solve it by holding a contest and paying the winner that same amount. The question is whether any NP problem is also a P problem (hence, P = NP) : if you can hold a contest to solve the problem, can you hire someone to do the same while retaining a polynomial cost?

This question is important because our current computing technology (electricity-through-silicon-transistors) does not support non-deterministic computing. So, if we could find efficient deterministic solutions to every problem inside NP, we would be quite happy about it.

Actually… we wouldn’t. Many security features in computer science rely on problems such that 1° solving the problem is very complex but 2° checking whether a solution is correct is very simple. For instance, checking that a password is correct is a P problem (your computer does it several times a day), but finding the password is an NP problem (you have to try all passwords). If a P solution could be found to the “find the password” problem, we would be in trouble.

8 Responses to “P = NP for Normal People”


  1. Thank you! I was looking for exactly that kind of explanation.

  2. Victor Nicollet - August 12, 2010 at 4:01 pm - Reply

    I knew a money-based explanation would help people like you ;)

  3. Thanks for the explanation, it is quite clear. However, I must disagree on your base postulate that Pepsi > Coke Light > Coke. Anyone with functionning taste buds can judge for themselves that the order is wrong.

  4. @Lix: While I agree with you that Coke >>> Pepsi, turns out its not exactly the taste buds that decide this: In experiments where both drinks labels were hidden the Pepsi scored more(!) but in experiments where Coke was labeled visibly it scored more! (http://health.usnews.com/usnews/health/briefs/mentalhealth/hb041025b.htm ). So our choices which is better relies more on our brain-washed so-called knowledge than our taste buds. I guess the ultimate drink would be Pepsi labeled as Coke…

  5. @Iftah: But when in a blind test I’m given Pepsi and Coke, I still pick Coke over Pepsi. The tastes are very distinct and I just find Pepsi too “pointy”.

  6. Hey may I reference some of the information found in this post if I link back to you?

  1. 1 Tweets that mention P = NP for Normal People -- Topsy.com

Leave a Reply

Your email address will not be published. Required fields are marked *

*

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>



1208 feed subscribers
(readers who polled a feed this week)