Tag Archive for 'Learning'

Information Flow

The real world is a complex place. When writing software that has to interact with the real world, there are literally thousands of concepts you have to master and tens of thousands of details you have to be aware of, or you will paint yourself into a corner where your software clashes with reality. And reality always wins.

Understanding concepts and details is a fundamental part of a project’s time budget, whether they come from the project requirements, real-world constraints, third party code or teammates. Every time information goes around in a project, it uses up valuable time, and to keep the time budget tight it becomes necessary to decide what information should be allowed to go around, and where.

Working on concurrent systems is an enlightening experience, because of the many similarities between an array of computers and a team of information workers. Computer arrays have latency issues when one thread depends on another thread to be done…

“When do you think your settings import module will be done? I’m stuck on the payment API until I can load those settings!”

…they have bandwidth issues and manipulating some data yourself is usually faster than sending the data to another part of the cluster for treatment…

“The User object? Well, it’s a bit of a weird design, but it’s rather clever. I’ll draw you a quick UML sketch on the blackboard so you can see what the five helper classes do.”

…they have to avoid data loss if a computer or network is down…

“I have no idea how this stored procedure works, you should ask Tim, he’s the one who wrote it. He’s in southern France right now but I think he’ll be back next month.”

… and they have to handle a directory of parts and a garbage collector for data…

“Wait, nobody’s written the comment moderation back-office! Who was in charge of doing it? Who wrote the comments front-end anyway?”

There are algorithms, strategies and techniques for handling and optimizing those things. Many of these can be adapted to humans, with the added benefit that, humans being smart, they can understand the point of those algorithms and compensate for minor flaws if the plan isn’t perfect.

HADOPI

The HADOPI law was recently adopted by the French parliament. I have an issue with this law: it deals with technical things that few people are knowledgeable in (including most members of parliament) such as the Internet Protocol and the use of IP addresses.

So, I have made a small effort on this website to explain the basic technical principles underlying the HADOPI law [fr].

Inline Help

Your average user does not know what a trackback is. Yet, Wordpress must let experienced users ping trackbacks. How to include trackback information without scaring away the inexperienced users (and possibly even educating them along the way)?

The solution chosen by Wordpress is to use slightly more verbosity to describe what trackbacks are:

[Image: wordpress-back]

By contrast, Magento involves a lot more knowledge than just publishing a blog. While it’s possible to assume that Magento users earn their wages by knowing how to use it, most of the time they only manipulate the system as a side-effect of handling sales in their brick-and-mortar company, or they are new to sales altogether. So, the result is a complex product with outright terse field descriptions that have very complex effects:

[Image: magento-back]

Who, among normal Magento users, knows what a meta keyword or meta description is? Or whether meta keywords are as useful now as they were in the good old days? But that’s not very important, since you could just ignore those fields.

What’s a URL key? Now, even an experienced web developer might have trouble with this one (it’s the equivalent of a Wordpress slug, except Wordpress displays that “URL key” field as a more easily understood “Permalink: http://www.nicollet.net/2009/05/inline-help/” with the last URL segment editable).

What about the Page Title? What’s the difference between the category name and the page title? Which is displayed where?

Things get even more complex with other areas of Magento. For instance, in a lot of places a given field is disabled and a “Use Config Defaults” checkbox next to it is checked. The problem is that there is no indication about where those configuration defaults can be changed.

Last but not least, there are several classic questions to be asked, such as “Why does this product appear as out-of-stock?” or “Why doesn’t this product appear in this category?” or “How do I set a table rate?” which require careful analysis of the dozen options that might affect the actual state of the item (enabled? present in website? present in category? stock greater than zero? set as in-stock? children items enabled, present in website and set as in-stock? …)

Inline help items

Modern Javascript libraries allow altering an existing page in a non-intrusive manner by decorating the page elements once they are loaded (or when the user asks for help). This means it’s possible to add novice-specific inline help as:

  • Short text snippets under complex or technical items to explain what they are, possibly with a link to the complete explanation.
  • Explanatory tooltips on hovering.
  • See-also links (where do I change this configuration default? where do I set table rates?) placed exactly where the user might be wondering something.
  • Move-along tutorials that detect whether data was entered in certain fields and guide the user to create items, add them to categories, and so on…
  • Troubleshooting checklists (with links) to determine why a product does not appear, or why it appears out of stock.

In fact, it would even be possible for IT companies to provide their customers with customized help messages that match their specific internal processes.

JITBrain

Some of you may have wondered, what is this secret project he is working on?

The project is called JITBrain. In itself, it’s nothing quite groundbreaking, merely an issue tracking platform that services two categories of users:

  • Individuals that have a lot to do but never seem to manage it. These are helped by features such as extremely simple todo-list manipulation, a search engine for looking at previous tasks and retrieving important information, and helpful statistics.
  • Teams that must collaborate on projects. These are helped by features such as issue tracking, simple workflows, attached links and files, planning poker, reporting charts, and motivational gizmos.

The unusual thing about the project is the development method: it builds upon my earlier ramblings on snippet-oriented development (developing quality software by actively thinking of the people who will read and reuse the code) to actually become an advanced tutorial in itself. This brings yet another benefit, since the code is not only designed in short bursts of concisely explained functionality, but a complete documentation and log is being written along with it.

As such, it has the additional purpose of serving as a simple reference for “classic mistakes” both in technical and functional areas in the development of a website.

The tutorial advances along with the development of the system, which means you can see the tutorial or look at the website.

The C++ hits you! You feel confused.

If you’re reading this post, you’re probably either a beginner intending to learn C++, a veteran keen on flaming my teaching philosophy, or someone searching for the origin of the “The foobar hits you! You feel confused.” Nethack quote. Either way, welcome.

I intended this article as a short introductory meta-note about learning C++. It should give you some elements that can be helpful in understanding C++ books and tutorials. If you’re really keen on doing this, I suggest Thinking in C++ and the C++ FAQ Lite, both of which are freely available online and are quality resources.

And stay away from any tutorials that #include <iostream.h> !


Published Articles

I have recently contributed two articles to two game development books.

These are:

Both are in the gamedev.net collection.

Beginning OCaml, Part 1

Perhaps you’ve never programmed. Or perhaps you know other languages and wish to learn another: this post will discuss only pure functional programming, so I suggest you forget everything you know and start anew. Either way, this series of articles provides a quick introduction to the functional side of Objective Caml.

Expressions

The vast majority of code you will write in Objective Caml will be expressions. An expression is usually a mathematical formula or something close to one. The simplest expression is a constant, such as:

1

This is not a very useful expression. A slightly more useful example would be using Objective Caml as a pocket calculator, with expressions like:

(2 + 4) * 3

Objective Caml runs expressions by computing their value: this is known as evaluation. The above expression would result in the integer value 18. As we discover more elements of the language, the rules for evaluation will become more complex, but the fundamental principle remains almost intact: an Objective Caml program is an expression or series of expressions, and running the program means evaluating those expressions.

Variables

The first step in raising Objective Caml above the lowly pocket calculator level is the ability to give names to values. The language construct for doing so is borrowed from mathematics, where mathematicians would say “let x be the smallest integer such that …” and refer to that integer as “x” from then on. The syntax for doing so is:

let (variable) = (expression) ;;

Use the semicolons for now—we will see later on that they are optional in some circumstances. Example of such a definition:

let x = (2 + 4) * 3 ;;
x - 2

This will evaluate the expression (which yields the value 18) then bind that value to the name x. Every expression below the definition of x will know that x equals 18, so the second expression would evaluate to 16. Here, x is called a variable, which is a confusing name since it does not actually vary: once it’s defined, it stays forever. Any name can be used for a variable, as long as it starts with a lowercase letter or an underscore, contains only letters, digits, underscores and apostrophes, and is not a reserved keyword such as let or in.

It is of course possible to define more than one variable in a program:

let x = 3 + 4 ;;
let y = 2 * 5 ;;
x + y

This example evaluates to 17. A normal Objective Caml program defines hundreds and even thousands of variables in order to work.

Having a thousand variables creates a risk for collision. The good news is that collisions are handled smoothly: at any point in the program only the last definition found so far counts. So, if we were to consider an example:

let x = 3 ;;
let x = x + 1 ;;
x + x

Line 1 defines x as 3, line 2 defines x again, this time as 3 + 1 = 4, and line 3 evaluates to 4 + 4 = 8.

This doesn’t solve the problem entirely, though: what if I accidentally overwrite a previously defined value, and I need that value later on?

Local Definitions

Objective Caml solves this by providing local definitions: instead of making a variable available to all lines that appear below it, the variable is only available within an expression. The syntax is:

let (variable) = (expression) in (expression)

The variable exists only within the second expression, and uses the value of the first expression. For instance:

let x = 1 ;;
let y =
  let x = 2 in
  x + x ;;
x + y

This example defines x as 1. Then, within the definition of y, it defines x again as 2, which makes the value of y equal to 2 + 2 = 4. However, the second definition of x is only available within the “x + x” expression, so the “x + y” expression uses the original value, and the result is “1 + 4 = 5”.

Local definitions are expressions. This means that they can be used as part of other expressions, such as on either side of a mathematical operator, or within other definitions. It’s perfectly legal to write code like:

1 + (let x = 1 + 2 in x * 2)

This evaluates to 1 + (3 * 2) = 7.

In practice, most variables are defined locally, and only the most important variables are defined globally.

Functions

While useful, the above features still don’t get very far beyond basic calculation needs. The one feature that turns Objective Caml into a highly expressive tool is functions. A function follows the mathematical tradition of being a mapping: it associates every element of a set with an element from another set. The element from the first set is the argument, and the element from the other set is the return value.

To call a function, you provide it with an argument and it is automatically turned into its return value for that argument. For example, the function “string_of_int” returns a bit of text representing an integer, so you could use it like so:

string_of_int 10

This would evaluate to the text “10”.

How do we define functions? Well, we simply write an expression which uses the argument and evaluates to the return value. Of course, the function doesn’t know the value of its argument until it’s called. So, when we write the function’s code, we use a placeholder name to represent the argument: this variable is called the parameter, and it is replaced with the argument itself when the function is called. The syntax is:

fun (parameter) -> (expression)

For instance, a function that adds two to a number can be defined as:

fun x -> x + 2

It is not very useful as such, so let’s give it a name and call it:

let add = fun x -> x + 2 ;;
add 3 * add 4

This evaluates to 5 * 6 = 30, and illustrates the main point of functions: they allow you to describe an operation once, and use it in many places simply by using the function’s name.

The Objective Caml language provides two shortcuts for defining functions. The first was designed to write functions that return functions. Suppose that I write:

fun x -> (fun y -> x + y)

This can be elegantly shortened to:

fun x y -> x + y

The second shortcut was designed to make the definition of named functions easier. Suppose that I write:

let add = fun x -> x + 2

This can be elegantly shortened to:

let add x = x + 2

The two shortcuts can be combined, so that:

let add x y = x + y

Means the same as:

let add = fun x -> (fun y -> x+y)

You can see all the above (along with the general structure of the tutorial) on this page.

Heaps of Knowledge

A question I like to ask for evaluating programming skill is, “Implement Heap Sort” (in whatever language I need programming in).

Heap sort looks something like this (the example is in C++, but the implementation is fairly similar in just about any imperative language):

void swap(int &a, int &b) {
  int c = a;
  a = b;
  b = c;
}

void sift(int *data, int i, int size) {
  int l = i * 2 + 1, r = l + 1;
  int argmax = i;
  if (l < size && data[argmax] < data[l]) argmax = l;
  if (r < size && data[argmax] < data[r]) argmax = r;
  if (argmax != i) {
    swap(data[i], data[argmax]);
    sift(data, argmax, size);
  }
}

void heapsort(int *data, int size) {
  for (int i = (size-1)/2; i >= 0; --i) sift(data,i,size);
  for (int i = size; i >= 2; --i) {
    swap(data[i-1],data[0]);
    sift(data,0,i-1);
  }
}

This short version requires some experience to get right, as well as knowledge of the algorithm. I suspect that a large number of programmers have never heard of heap sort, and of those who have heard of it, few will know how to implement it on their own. Of course, if you ask someone who knows how to write it, you will usually get a pleasant surprise. But most of the time, you’ll be facing someone who just doesn’t know the answer. That’s the entire point: what people know is important, but it’s far more important to know how people will react when they don’t know (and especially, how quickly they can learn).

If the programmer doesn’t know what heap sort is, you can nudge them in the right direction by explaining how heap sort works. Basically, heap sort is an improvement over the naive selection sort: instead of simply extracting the maximum element from the unsorted area in the array, heap sort turns the unsorted area into a heap (a binary tree where the root is the maximum element of the tree and each sub-tree is also a heap), thereby making extraction of a maximum a Θ(log n) task instead of the usual Θ(n). The trick behind heap sort is that the unsorted area of the array can be seen as a heap, by deciding that every index i in the array is a node in the tree, with its left child at index 2i+1 and its right child at index 2i+2.

From there, it’s fairly elementary to deduce that the algorithm should first turn the entire array into a heap, then remove the first element, restore the heap, and repeat the process until the heap is empty. The sift function itself should be guessed fairly easily by anyone with any experience in recursive functions (it is, after all, merely a question of constructing a large heap from two smaller heaps).
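The implicit tree layout is easier to trust once it is written down as a check. The sketch below is only an illustration: is_max_heap is a hypothetical helper name, not part of the original post, and it assumes the same max-heap convention as the sift function above (the standard library offers std::is_heap for the same purpose).

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: returns true when `data` satisfies the max-heap
// property under the implicit layout, where the children of the node at
// index i are stored at indices 2*i + 1 and 2*i + 2.
bool is_max_heap(const std::vector<int>& data) {
  for (std::size_t i = 0; i < data.size(); ++i) {
    const std::size_t left = 2 * i + 1;
    const std::size_t right = 2 * i + 2;
    // A child that exists must not be greater than its parent.
    if (left < data.size() && data[i] < data[left]) return false;
    if (right < data.size() && data[i] < data[right]) return false;
  }
  return true;
}
```

After the first loop of heapsort has run, the whole array should pass this check; after each iteration of the second loop, the shrinking unsorted prefix should still pass it.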

The benefits I see to the question are:

  • See how the interviewee or student handles a situation where they have no idea about how to answer a question. Do they stare without saying a word? Do they get angry at an impossible question? Or do they admit they don’t know, and try to gather more information?
  • Determine how much general Computer Science knowledge the candidate has. As you explain Heap Sort, do they already know what heaps and selection sort are? Do they have to ask what a binary tree is? Do they have to ask what an array or a sort is, or how to swap two values?
  • Detect positive laziness and knowledge of standard libraries. Does the candidate mention they’d use the standard library sort in production code? In C++, do they already know of std::make_heap or std::swap?
  • Test basic knowledge of recursion and iteration. Does the interviewee ‘get’ the principle of sifting on the first try? Are the bounds for the two for-loops correct? Can the interviewee evaluate the complexity of the function?
  • Test basic knowledge of the language idioms. Are the functions declared correctly? Are the control structures used correctly? Is the choice of a storage type appropriate for the array?
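For reference, this is roughly what the “positive laziness” answer could look like in C++. It is a sketch, not a recommendation over std::sort: heap_sort_std is a hypothetical name, and the choice of std::vector<int> as storage is an assumption made to keep the example short.

```cpp
#include <algorithm>
#include <vector>

// Heap sort expressed with the standard heap algorithms.
// heap_sort_std is a hypothetical name chosen for this sketch.
void heap_sort_std(std::vector<int>& data) {
  // Rearrange the whole range into a max-heap, in place.
  std::make_heap(data.begin(), data.end());
  // Repeatedly move the current maximum to the end of the shrinking heap,
  // leaving the range sorted in ascending order.
  std::sort_heap(data.begin(), data.end());
}
```

The two calls correspond to the two loops of the hand-written heapsort function: the first builds the heap, the second extracts the maximum until the heap is empty.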

An Object Philosophy : The Tools

There comes a time in every programmer’s career when the question of Object-Oriented Programming crops up. Various sources attempt to explain Object-Oriented Programming, though subtle nuances exist that often make such definitions incompatible with each other. Even worse, over the years every programmer usually builds their own version of what OOP is really about, and severe clashes can happen as a consequence.

Wikipedia’s approach is quite a pragmatic one: it accepts the existence of multiple definitions, and merely attempts to explain those concepts that are shared by most. Some define the core of OOP as the principles behind the acronyms ISP, LSP, OCP, DIP and SRP. Others take a build-your-own approach from a set of basic premises. It only seems fair that I could take my own stab at it as well.

Object-Oriented Programming is usually used to refer to two distinct yet related concepts:

  • A set of tools that can be used for design and for programming: classes, objects, messages, inheritance, code metrics, design patterns…
  • The ways in which that set of tools may be used to construct software that guarantees certain benefits, such as low-cost maintenance or reuse of existing code.

It is impossible, and foolish, to separate the two. Using classes, objects and the other tools without proper discipline or skill does not bring any of the benefits that are associated with OOP—even worse, programmers who come expecting proper OOP use of these tools will be surprised and will often have a lot to do to correct the situation. No single Object-Oriented concept is a magic bullet capable of working wonders on its own.

Today, I’ll discuss the tools. The next article in the series will deal with the mindset.

Object-Oriented tools are language-independent and implementation-independent. Most languages have their own ways of providing support for these tools, and they can sometimes even be used in those languages that don’t provide any kind of helpful support. Considering Object-Oriented Programming as “What happens in Language X” often leads to trying to replicate language X features in language Y, even though language Y also provides support for OOP in a different way. Besides, since programming languages seldom provide the entire OOP toolset out of the box, restricting oneself to a single language also reduces the horizon of potential tools.

Objects

One thing everyone manages to agree on is that Object-Oriented Programming is about objects—what objects are, or how they should be used, is already a matter of disagreement in certain places. My own take on objects is that you don’t need to know what they are, you merely need to know how to manipulate them. This means that an OOP object can be implemented as a Java object, an Objective Caml abstract type or a C structure, as long as it behaves like OOP objects should.

First, where do objects come from? Before you can manipulate them, you must create them. The easiest way of getting a brand new object in your program is to create it yourself from scratch. Writing the string “Hello” in your source code will automatically create a string object representing “Hello” when the program is run. Most programming languages provide means of creating certain categories of objects directly in the code: strings, integers, floating-point numbers, booleans, functions and classes are typical examples. Yes, functions and classes are, from an Object-Oriented Programming perspective, objects. Some programming languages also allow the creation of arbitrarily complex objects on-the-fly as a literal (the Objective part of Objective Caml allows this, for instance), but most languages only allow the creation of complex objects from classes. I’ll get back to this class business later.

Messages

The sole purpose of objects is to receive messages. A message is a one-shot two-way communication channel between a piece of code and an object. The code sends a request to the object, and the object sends a response to the code. What the response looks like is up to the object: two distinct objects may respond differently to the same request, and even a single object may respond differently to the same request at two distinct points in time. Both the request and the response carry some data, usually a set of objects. A request also has an identifier that helps determine what the request is about: a “getAccountBalance” identifier means that the request is about getting the balance of an account, while a “setDuration” means that the request is about changing the duration of something.

Not all objects can process all messages. For instance, one cannot expect a rubber duck to respond to a “getAccountBalance” message. So, not being able to process a message is a possibility to be considered—one of the main advantages of having language support for OOP is that statically typed languages can check at compile-time whether all messages are going to be processed successfully. But, again, more on that later.

At this point, most OOP manuals summon forth a class definition of some form and happily declare “this class describes sets of objects, its methods define messages that the object can receive”. I find that approach quite annoying, because it forever equates, in the minds of the readers, object with instance of a class and message with class method. The consequence is that readers then find themselves at a loss when they encounter languages with class-less objects (such as Objective Caml) or class-independent implementations of Object-Oriented Programming (such as Service-Oriented Architectures).

Everything is an object

So, my example is going to be one simple piece of C pseudocode instead:

int squared(int x)
{
  return x * x;
}

printf("%d\n", squared(10));

In C terms, this pseudocode defines a function that returns the square of its argument, then outputs the square of 10 on the standard output.

In Object-Oriented terms, this creates a new object and gives it the name squared. It does so by using a “function literal” which creates a brand new object by specifying what the object does when it receives a “function call” message: in this situation, the behavior would be to extract an integer object from the request, send that integer a “multiply by yourself” message, then respond with the result. Our example then sends the “function call” message associated with the integer literal 10 to the squared object, then binds the response to another “function call” message sent to the printf object.

In other words, a plain old piece of C code without a single occurrence of the class keyword can be interpreted as an object-oriented program. This extends further: any piece of imperative code can be interpreted as sending messages to objects. Pure functional code without a hint of imperative design can be expressed the same way: every function is an object that can process the “function call” message.

So, in languages without classes, you can usually find:

  • Primitive objects : integers, booleans, floating-point numbers and similar kinds of values are objects which are created from literals. They accept a wide range of messages which correspond to their operators (add, multiply, subtract, boolean-and, boolean-or, and so on).
  • Aggregate objects : C structures, OCaml record types, Javascript objects (ignoring the this keyword) that are a collection of sub-objects that can be accessed by name. They support “set value X” and “get value X” messages for every member X. They are created as literals by assigning the appropriate names with the appropriate initial values. They may or may not require preliminary definition of their type (for checking purposes in strongly typed languages), and they may or may not allow extension of their contents by sending a “set value Y” message for a yet-undefined member Y.
  • Functions : these objects accept only one message (“call function”). They are created as a literal which describes how the data is extracted from the request, what operations are performed on it, and what data is sent back as part of the response.
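The “function as an object with a single message” reading can be made quite literal in C++, even though C++ is not a class-less language. The following is a sketch under that interpretation only; the struct name Squared is hypothetical and was picked for the illustration.

```cpp
// A lambda is a "function literal": it creates an object whose only
// message is the function call.
const auto squared = [](int x) { return x * x; };

// Roughly the same object written out by hand (hypothetical name Squared):
// a value whose single "message" is operator().
struct Squared {
  int operator()(int x) const { return x * x; }
};
```

Writing squared(10) sends the “function call” message with the integer 10 as its data, and the object responds with 100. An instance of Squared responds to exactly the same message in exactly the same way, which is the point: the code sending the message cannot tell the two apart.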

Semantic shifting

If Object-Oriented Programming was all about looking at previously written programs and noticing how they are Object-Oriented, there wouldn’t really be much of a point to doing it. The good news is that OOP is not only about observing objects within existing code: it can also be used to introduce brand new behaviors into existing objects. This happens by performing a semantic shift of what a message and an object is: you decide that a certain construct you just invented is an object, and that another construct is a message.

Let’s look at our C pseudocode example again:

int squared(int x)
{
  return x * x;
}

This can be interpreted as creating a new object, “squared”, which can handle the “function call” message. But it could also be interpreted as defining a new message, “squared”, which can be sent to integers. After all, since this is all a matter of interpretation, both interpretations can be correct at the same time: as long as squared(10) is a one-shot two-way communication channel between some code and the integer 10, it’s a message.

This semantic shift can be applied in almost all situations: a function with at least one argument can be seen as both an object that accepts a “call function” message, and as a message that can be sent to one of its arguments (with the other arguments as its associated data). So, now, you can go around and extend any of the objects that the programming language provides you with by creating the appropriate functions.

Polymorphism

Back to our “squared” message. That message can certainly be applied to integers, but it’s equally valid to apply it to floating-point numbers. Besides, if we (by some miracle, or perhaps some feature of the language) managed to create a new kind of object which also supported multiplication by itself (such as a complex number) then the “squared” message could also be sent to that object. This is because that message merely requires that the object it’s applied to supports multiplication by itself and so it can be applied to any such object. This is called polymorphism: when a message can be sent to any object, regardless of its nature, as long as it supports a certain set of operations. This provides the benefit of avoiding code repetition: you only define your “squared” message once, and then you can apply it everywhere. In a theoretical, ideal object-oriented programming language (the part is played here by Smalltalk, a very nice OOP language), you would just write this line once:

squared := [:x | x * x ]
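For comparison, a C++ function template gets close to “write it once”. This is a sketch rather than a claim that C++ solves the problem: the compiler generates a separate function for each type the template is used with, and the requirement that the type supports multiplication by itself stays implicit instead of being stated in the signature.

```cpp
#include <complex>

// Written once; instantiated by the compiler for every type T it is
// called with. Nothing in the signature says that T must support
// operator*; a type that lacks it is rejected only when used.
template <typename T>
T squared(T x) {
  return x * x;
}
```

With this definition, squared(10), squared(1.5) and squared applied to a std::complex<double> all compile, each producing a result of the same type as its argument.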

Interfaces

The problem is that statically typed languages don’t play nice with the concept of polymorphism. To determine before runtime whether a program will behave correctly, such a language uses a type system. This type system allows expressing certain constraints over the values, and some rules to determine which constraints imply that the program works.

The C type system is a fairly crude one. Every value has a type, with types being mutually distinct. So, you have the type of all integers, and the type of all floating point numbers. And a function argument has only one type. So, our polymorphism example, applied to two types, becomes two functions (shown here with the same name for clarity; real C would require two distinct names, while C++ allows the overload):

int squared(int x)
{
  return x * x;
}

float squared(float x)
{
  return x * x;
}

The problem, of course, is that there is no way in the C type system to express the constraint “supports multiplication by itself”. So, although the semantics of “multiply the argument by itself” are clear, the C compiler cannot understand that it’s safe to define one such function. For that matter, the Java, C++ and Objective Caml type systems won’t let us define a single ordinary function like this either (C++ templates work around the problem by generating one function per type, without ever stating the constraint). This is one of the reasons I chuckle when I hear Java advocates tout Java as a “pure object-oriented programming language”: despite all the effort poured into creating a clean object model, a lot of elementary Java operations (including multiplication) are not messages, and several elementary Java types (including integers) are not objects. If you’re looking for a pure OOP language (you don’t need one to do proper OOP, of course), I strongly suggest Smalltalk.

Back to the point: languages that support OOP actively, yet have a static type system, provide the notion of interface to help with type-checking. An interface is a set of constraints and requirements, stating what messages can be received by the object, what data has to be sent as part of the request, and what data will be returned as part of the response. So, one can express an interface with a “multiply by” message, and then define the “squared” function to work on all objects with that interface: the language will then see that the interface provides the “squared” function with all it needs (namely, the “multiply by” message) and consider the types to be correct.

In Java, the interface and function would look like this:

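interface Multipliable
{
  // Accepts "multiplyBy" message
  Multipliable multiplyBy(Multipliable other);
}

// Java requires every function to live in a class;
// the name Squaring is arbitrary.
class Squaring
{
  // Send "multiplyBy" message
  static Multipliable squared(Multipliable x)
  {
    return x.multiplyBy(x);
  }
}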

Of course, we had to use explicit message names here, because the multiplication operator in Java is not a message supported by the type system. On the other hand, Objective Caml is a tad smarter, because it can guess the interface based on the messages being sent:

let squared x =
  x # multiplyBy x

(* squared : (< multiplyBy : 'a -> 'b ; .. > as 'a) -> 'b *)

Another consequence of Objective Caml’s more expressive type system is that it allows retrieving the correct type. For instance, if you pass an integer into Java’s “squared” function, you would get back a Multipliable object, but you wouldn’t know if it was an integer or something else. This can be annoying, because you, as a programmer, know that it’s an integer, but the language won’t let you use it as such until you remind the compiler that it should trust you (and it will still check it at runtime). By contrast, Objective Caml types are parametric. An integer would probably have a “multiplyBy : integer -> integer” function, so the squared function would correctly return an integer.
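The “guess the interface from the messages being sent, and keep the precise result type” behavior can be imitated in C++ with a template. This is only an analogy and a sketch: Integer is a hypothetical wrapper type invented for the example, and unlike Objective Caml, C++ does not record the requirement in the type of squared; it merely checks it each time the template is used.

```cpp
// Hypothetical type that answers the "multiplyBy" message.
struct Integer {
  int value;
  Integer multiplyBy(const Integer& other) const {
    return Integer{value * other.value};
  }
};

// No interface is declared anywhere: any T with a suitable multiplyBy
// member is accepted, and the result type is whatever multiplyBy
// returns for that particular T.
template <typename T>
auto squared(const T& x) -> decltype(x.multiplyBy(x)) {
  return x.multiplyBy(x);
}
```

Calling squared on an Integer yields an Integer, not some generic Multipliable, so no cast is needed to keep using the result as an integer.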

But all of that is idle bickering about type systems that deviates from the purpose of the article.

Classes

A very important element of object-oriented programming is the notion of class. Earlier on, I said that objects could only be constructed at runtime by literals. That is, you create a new object and set its various properties right away. This, of course, makes it difficult to define categories of similar objects, because you have to define them one at a time and also make sure they share similar properties.

Another way of creating objects has been proposed in several programming languages, to the point of becoming (mistakenly) a synonym for OOP. A class is an object—even though some impure programming languages do not treat it as such—which defines a single message called a constructor. Upon receiving a constructor message, the class creates (by some arcane magic implemented by the compiler) a brand new object with a predetermined set of properties that are a combination of the data sent with the constructor message and the data provided when the class itself was created.

The exact syntax for sending the constructor message varies from language to language, but usually involves something along the lines of “new Classname(arguments)”.

In practice, a class defines two things:

  • The data to be carried by the object. This data is either created from scratch by the constructor, gathered from the global scope, or extracted from the constructor message itself.
  • The list of messages that the object can process, along with the code to perform said processing.

Again, the exact syntax for defining a class varies. In Objective Caml, the constructor message is described as part of the class name and used to initialize the values in the object, while the messages to be processed are called “methods”:

class vector2d (x,y) =
object
  val x = x
  val y = y
  method length = sqrt(x *. x +. y *. y)
end

let v = new vector2d (1.0,2.0) in
  v # length

[Edit : corrected small typo] In Java (as well as C++ and C#) the constructor is described as if it were a method, which can sometimes be confusing. Also, the language does not automatically guess which interfaces are implemented by a certain object, so they have to be specified as part of the class definition.

class Vector2d
{
  private float x;
  private float y;
  public Vector2d(float x, float y)
  {
    this.x = x;
    this.y = y;
  }
  public float Length()
  {
    // Math.sqrt returns a double, so the result is converted back to float
    return (float) Math.sqrt(x * x + y * y);
  }
}

Vector2d v = new Vector2d(1.0f, 2.0f);
v.Length();

In Javascript, classes are constructors, with methods being defined as part of that constructor’s prototype:

function vector2d(x,y)
{
  this.x = x;
  this.y = y;
}

vector2d.prototype.length = function()
{
  return Math.sqrt(this.x * this.x + this.y * this.y);
}

var v = new vector2d(1.0,2.0);
v.length();

Class extensions

In fact, classes support another message besides constructors: inheritance, or extension. When extending a class or inheriting from a class (two ways of expressing the same concept), one creates a new class which imports all the contents of another class, then adds its own set of contents. Most type systems apply a subtype relationship to inheritance, meaning that a function which can operate on an instance of the original class can also operate on an instance of the inheriting class.

Inheritance is a strange beast: on the one hand, it behaves a little like interfaces (and is in fact the only way of replacing interfaces in languages that don’t have them, such as C++) to provide polymorphism but on the other hand it is a non-polymorphic way of extending the functionality of a class. It happens quite often that entire programs avoid using inheritance at all, because interfaces solve all their polymorphism needs and they do not need the extension features of inheritance.
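To make the subtype relationship concrete, here is a rough sketch in Javascript, which has no interfaces and expresses inheritance through prototypes. The “vector2d” class repeats the earlier example; “vector3d”, “depth” and “describe” are hypothetical names, and Object.create assumes a reasonably recent Javascript engine.

```javascript
function vector2d(x, y)
{
  this.x = x;
  this.y = y;
}

vector2d.prototype.length = function()
{
  return Math.sqrt(this.x * this.x + this.y * this.y);
};

// vector3d extends vector2d: it imports its contents, then adds its own.
function vector3d(x, y, z)
{
  vector2d.call(this, x, y); // reuse the parent constructor
  this.z = z;
}

vector3d.prototype = Object.create(vector2d.prototype);
vector3d.prototype.constructor = vector3d;

// New content added by the inheriting class.
vector3d.prototype.depth = function()
{
  return Math.abs(this.z);
};

// A function written for the original class...
function describe(v)
{
  return "length = " + v.length();
}

// ...also operates on an instance of the inheriting class.
console.log(describe(new vector2d(3, 4)));     // length = 5
console.log(describe(new vector3d(3, 4, 12))); // length = 5 (inherited, ignores z)
```

Here “vector3d” does not redefine “length”: it only adds “depth”, which is the non-polymorphic kind of extension described above.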

A summary

Object-oriented programming introduces the following set of concepts:

  • An object is a value manipulated by the program. Integers, booleans, strings, functions and classes are examples of objects. Objects are usually created using literals provided by the programming language.
  • A program works by sending messages to objects. A message transmits a request (with arguments) to an object and the object responds with some data. Calling a function on an argument can be seen as sending a message, and calling a member function or method can also be interpreted as sending a message.
  • An interface, in a statically typed programming language, represents a set of messages that an object must be able to process. This is used to ensure that a function argument can be used without causing a runtime error.
  • A class is a special object which can receive constructor messages. Every constructor message constructs a brand new object that is returned as part of the response. Usually, an object created from a class is said to be an instance of that class.
  • Classes can inherit from one another, which can be useful in some cases to extend their functionality in a non-polymorphic manner.

The next article in the series is here (starting on January 9, 2009).

Game Development Tutorial

Creating video games is not easy, but few things are easy: cooking for ten, writing a book or running ten miles are quite difficult as well. The difference is that with games, people don’t know how to start. To cook for ten people, you know it’s going to involve shopping for ingredients, using a kitchen to boil, simmer, roast and prepare them, and serve them. Writing a book involves typing on your keyboard until you’ve written all the hundreds of pages you wanted to write. Running ten miles involves getting your shoes on and going outside.

But how does one start creating a video game? How does the initial idea of a video game translate to an actual program you can run? This step isn’t the hardest of all steps in game development (far from it) but it deters a lot of otherwise well-motivated wannabes.

As a former game developer (I was the lead programmer for Darklaga : Cannonball Symphony) I will try to answer a few of these questions on the technical side. To this end, I have written a short video game which you can find here, which happens to be a reimplementation of Pong (one of the earliest video games). In this article, I will discuss video game development using that game as an illustration.

How Game Development Works

So, how does a video game get created? It usually follows the same set of steps:

  • A team of designers has a basic idea for a game. It designs most of the high-level details of the game: the look and feel, the various elements and their behavior, win and loss conditions, player controls, and so on.
  • A team of artists creates the graphics and sounds for the game, using their favorite editors and tools and recording studios, according to the designer’s vision.
  • A team of programmers writes the program itself, describing how, where and when the graphics should appear and the sounds should be played, and how the player’s input affects this, according to the designer’s vision.
  • Testers play the game and provide input to the designers, who alter the design to please the testers. The cycle repeats until the testers are happy or money runs out.
  • A publisher takes the game, packages it, sends it to retail stores and advertises it.

Of course, all of this could be done by a single person (Pong), or it could be done by an independent team of five (Darklaga), or it could be done by a professional experienced studio backed by a strong publisher (Red Alert 3).

In an ideal world, you would be able to pitch an idea, and people would take that idea, create a game from it, publish it and give you a share of the money. This rarely happens in the real world, because games take time to develop and there’s limited manpower to go around, so the few idea-pitcher positions are already taken by people with vast amounts of experience and connections in the business. Tom Sloper has a few quite interesting articles about getting your game done, which you can read here.

So, if you have a great game idea, but neither connections nor experience in the business, you’re basically stuck with either giving up or doing it yourself. This will involve either convincing a programmer to work for you, or doing the programming yourself. If you want to try programming, read on.

Programming Games

Video games are computer programs: contraptions designed and written by programmers to make the computer behave in a certain way. Since programmers tend not to be happy with available tools, they often invent new ways of programming computers, which means there are many ways of creating a video game. For instance, the Pong game I wrote for this article is written using a programming language called Javascript (or ECMAScript, depending on whom you ask). The part of the program responsible for making the ball bounce on the screen edges looks like this:

if (this.ball.y + ball.h > field.h)
  { this.ball.vy = - ball.s } 

if (this.ball.y - ball.h < - field.h)
  { this.ball.vy = ball.s }

Different programming languages look different. Sometimes, differences are small, so an equivalent program in the C++ programming language would look like:

if (this -> ball.y + ball::h > field::h)
  { this -> ball.vy = - ball::s; }

if (this -> ball.y - ball::h < - field::h)
  { this -> ball.vy = ball::s; }

These two are fairly similar, because both C++ and Javascript belong to the same family of languages (called the C family) and therefore share a lot of features and constructs. By contrast, Objective Caml is a programming language of the ML family, and has a quite different take on things:

method verticalBounce =
  if y -. Const.ball_h < -. Const.field_h then
    {< vy = Const.ball_s >}
  else if y +. Const.ball_h > Const.field_h then
    {< vy = -. Const.ball_s >}
  else
    self

Different languages have different capabilities. I chose Javascript because it allows running the game in a browser with no downloading or applets. One could choose the ActionScript language to run a video game in a Flash or Flex applet, or the Java language for running a video game in a Java applet, both within a browser. Developing for Windows or the Xbox could involve using the C#, F# or Visual Basic .NET languages. Developing for the PC and most consoles can also happen in the C++ and C languages, as well as one of the many BASIC language variants and many others. Writing the central server program for multiplayer games could involve the Erlang or Stackless Python programming languages for performance reasons. Writing a video game for a pocket calculator would involve using the assembly language for that calculator. Every programmer knows several languages, and chooses whichever best fits the problem he is solving.

Code is written for humans to read and only incidentally for computers to execute.
- Donald Knuth

All languages have a thing in common: they’re human-readable text which the computer does not understand, and as such they necessitate a translation phase where the program, written in a programming language, is transformed into a sequence of machine instructions that the computer can execute. This translation sometimes happens before running the program (done by another program called a compiler), and sometimes it happens while running the program (done by another program called an interpreter). Some languages, such as C++, tend to be always compiled, while other languages, such as JavaScript, tend to be always interpreted, and many programs are half-compiled and half-interpreted.

Almost everyone has a Javascript interpreter on their computer: Internet Explorer, Firefox and Safari all bundle one, because web pages tend to use Javascript to make their content dynamic and interact with the user. Because of this, all you need to start programming is a text editor (such as Notepad) to edit Javascript files and a web browser (such as Internet Explorer) to view the result. I prefer to use improved tools (the Crimson Editor, Firefox and the Firebug plug-in) because they help me write programs faster.

If I have seen further it is only by standing on the shoulders of Giants.
- Isaac Newton

A very important aspect of programming (and computer usage in general) is the use of existing tools for accomplishing common tasks. Programmers who regularly solve the same problems eventually write down code to solve these problems, once and for all, and then store that code in code libraries. Sometimes, a library is useful enough that it deserves to be published and used by many other programmers worldwide. For instance, my Pong game needs to move green things around on the screen, which is a fairly common task that is already done by the jQuery Javascript library. In general, when a programmer has a difficult task to solve, he usually searches for an existing solution on the web before rolling out his own.

A Programming Primer

This article is not going to teach you how to program. That would require more space and time than I have here. What it can do, however, is teach a few basic elements of programming in Javascript, which you may also find useful in other languages.

The simplest way of executing Javascript is to write it as part of a web page. For instance:

<html>
  <head>
    <title></title>
    <script type="text/javascript">
//<![CDATA[
   // write your Javascript code here
//]]>
    </script>
  </head>
  <body></body>
</html>

Save this text with the “html” file extension, and start writing Javascript. Open the file in a browser to test it (depending on your security settings, you might have to enable scripts for this to work). Again, I strongly advise you to use the CrimsonEditor/Firefox/Firebug triad for working with Javascript.

Statements, Variables, Values

Javascript is an imperative language: it describes sequences of operations to be performed by the computer. Individual operations are called statements. For instance, the statement $("body").text("Hello") (which relies on the jQuery library) is an operation which replaces the content of the page body with the text “Hello”. You can chain several statements together by separating them with semicolons: the statements are then executed in order, left-to-right and top-to-bottom. One very useful statement is the “alert” statement, which creates a message box containing some text:

alert("This is my message")

If you’re using Firebug, you have access to the superior “console.debug” option, which is not as nasty as a message box and allows you to output things more complex than text:

console.debug("This is my message")

Programs manipulate values: these can be numbers, lists of other values, pieces of text, parts of a web page, or even parts of a program. The most elementary operation to be performed with a value is to store it in a variable: this is useful so that values are kept around when you’re not using them. A variable is just a name which is associated with a value: you can retrieve that value at any time by writing the variable’s name, and you can change the value of a variable any time you wish.

Creating a variable uses the “var” keyword, which is reserved for that purpose:

var zero = 0

This statement creates a variable, called zero, and associates it with the number 0. You can use a variable anywhere you would use its value:

var message = "This is my message" ;
alert(message)

This displays “This is my message” in a message box. You can also change the value of a variable after you’ve created it. That operation is called an assignment:

var message = "This is my message" ;
alert(message) ;
message = "On second thought, it isn't" ;
alert(message)

This displays “This is my message” followed by “On second thought, it isn’t”. The first value assigned to the variable is lost, replaced by the second value assigned to the variable.

You can manipulate values in many fashions. For instance, you can have arithmetic operations (+a, -a, a + b, a - b, a / b are fairly obvious, a * b is multiplication, and a % b is the remainder of dividing a by b), comparisons (a == b for equality, a != b for inequality, a < b, a > b, a <= b and a >= b), and logical connectors (a && b : and, a || b : or, !a : not).
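As a quick illustration, here are these operators applied to two sample values (7 and 2 are arbitrary choices). The results are printed with console.log rather than message boxes, so this sketch also runs outside a browser.

```javascript
var a = 7;
var b = 2;

console.log(a + b);   // 9
console.log(a - b);   // 5
console.log(a * b);   // 14
console.log(a / b);   // 3.5
console.log(a % b);   // 1 (7 = 3 * 2 + 1)
console.log(-a);      // -7

console.log(a == b);  // false
console.log(a != b);  // true
console.log(a >= b);  // true

console.log(a > 0 && b > 0);  // true: both conditions hold
console.log(a < 0 || b > 0);  // true: at least one condition holds
console.log(!(a > b));        // false
```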

Blocks, Functions

I have mentioned earlier that you can manipulate parts of a program. The easiest way of doing so (although somewhat limited) is to use blocks and control structures. A block is zero, one or more statements between curly braces, for instance { a ; b }. A control structure is a special statement which is followed by a block, and executes that block under special circumstances. For example, a conditional statement only executes its associated block if a certain condition is true:

if (age < 13)
{ alert("You are not allowed to view this website")  }

Control statements give you control over which statements are executed in certain situations. They are invaluable for expressing complex behavior in your programs. A typical extension of the conditional statement is to also specify something to be done when the condition is not verified:

if (name == "John") { alert("Hello, John!") }
else                { alert("Who are you?") }

By now, you have probably noticed that I position my blocks and statements around randomly. This is indeed the case: Javascript doesn’t care about the position of your statements and blocks. Just like in English, you can write your sentences every way you wish as long as you don’t split or swap your words.

Another typical control statement is the loop. It executes its associated block repeatedly as long as its condition is true. So, displaying a countdown from 10 to 1 would look like this:

var i = 10 ;
while (i != 0)
{ alert(i) ; i = i - 1 }

The loop displays the number, then subtracts one from it, until the number is no longer different from zero.
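Note that the loop above stops as soon as i reaches zero, so zero itself is never displayed. A small variant, sketched below, includes it by changing the condition. It collects the numbers in a list (the “shown” variable is only there for illustration) and prints them with console.log instead of opening eleven message boxes.

```javascript
var shown = [];   // list of the numbers the loop goes through
var i = 10;
while (i >= 0)
{ shown.push(i) ; i = i - 1 }

console.log(shown.join(" ")); // 10 9 8 7 6 5 4 3 2 1 0
```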

The improved version of a block is a function: a function is, for all purposes, a block that can be manipulated as a value. This means that you can assign it to variables, keep it around, and execute it when you want to. You create a function from a block by using the “function()” keyword in front of that block. You then call functions (which executes the corresponding block) by appending two parentheses after the variable name:

var scare = function() { alert("Boo!") }; 

if (surprise)
{ scare() ; alert("Sorry for scaring you!") }
else
{ alert("I'm going to scare you!") ; scare() }

Functions serve several purposes in a program. Their primary purpose is to eliminate repeated code: if you have code which is repeated in several places, turn that code into a function and call the function wherever you need without having to rewrite the code. This even works if you have code that is not identical, but still similar enough:

var i = 1 ;
var total = 0 ;
while (i <= 10) { total = total + i ; i = i + 1 }
alert(total) ;

i = 1 ;
total = 0;
while (i <= 20) { total = total + i ; i = i + 1 }
alert(total)

These two pieces of code compute the sum of numbers between 1 and 10, and between 1 and 20. The only difference here is the number 10 (or 20). It is then possible to make this number a parameter of a function: when a function has a parameter, a value for that parameter has to be provided when the function is called. The parameter then becomes a variable associated to that value. In this case:

var sum = function(max)
{ var i = 1 ;
  var total = 0 ;
  while (i <= max) { total = total + i ; i = i + 1 }
  alert(total) }; 

sum(10) ;
sum(20)

A function may have several parameters. If that is the case, then the parameter values must be provided in the same order. Parameters enhance the ability of functions to eliminate repetitive code.
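As a sketch of a function with two parameters, the following hypothetical “sumBetween” adds up the numbers from min to max. It uses console.log rather than a message box, and stores its result in a variable named “lastTotal” (also made up for this example) so that it can be inspected afterwards.

```javascript
var lastTotal = 0;

var sumBetween = function(min, max)
{ var i = min ;
  var total = 0 ;
  while (i <= max) { total = total + i ; i = i + 1 }
  lastTotal = total ;
  console.log(total) };

sumBetween(1, 10) ;  // 55
sumBetween(5, 7) ;   // 18 (5 + 6 + 7)
```

The order matters: sumBetween(10, 1) starts above max, so the loop never runs and the total is 0.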

The secondary use of functions is to serve in special situations where one needs to represent a block. For instance, the setTimeout() operation registers a function to be executed after a specified duration. So, for instance:

alert("First message") ;
setTimeout(function(){ alert("Third message") }, 3000) ;
alert("Second message")

This displays “First message”. When that message box is closed, it registers “Third message” to be displayed after 3 seconds (3000 milliseconds) and immediately displays “Second message”. Another example is jQuery’s “when document is loaded” operation, $():

$(function(){ alert("Document has finished loading") });

Note that you can define functions and assign them to a new variable in a single action using an alternative simplified syntax:

function sum(max)
{ var i = 1 ;
  var total = 0 ;
  while (i <= max) { total = total + i ; i = i + 1 }
  alert(total) }

Objects, Classes

One other feature of Javascript is the ability to combine several values into one. For example, it’s interesting to store together in a single variable the horizontal and vertical positions of the ball in a Pong game, yet still be able to access them independently. In Javascript, an object is an aggregation of several values: these values are members of the object, and they are given names which allows the program to access them:

var obj = { x : 10, y : 20 } ;
alert (obj.x) ;
alert (obj.y) ;
obj.x = 30 ;
alert (obj.x)

This code creates an object with members x (equals 10) and y (equals 20). It then displays the value of member x, then the value of member y. Then, it changes the value of member x, and displays it again. The result is 10, 20, 30.

When it becomes useful to create many objects along the same pattern, as well as define functions which can operate on these objects, Javascript allows you to define classes of objects. A class is a template used for the creation of objects which have function members that operate on the object. For instance:

function number(x)
{ this.x = x } 

number.prototype.increase = function() { this.x = this.x + 1 } ; 

number.prototype.show = function() { alert(this.x) } ; 

var n = new number(10) ;
n.show() ;
n.increase() ;
n.show()

This example introduces several new concepts, which are all necessary to understand classes:

  • The ‘prototype’ property (it is a property of the function, not a keyword) is what lets the ‘number’ function act as a class: the two functions stored in it, ‘increase’ and ‘show’, become members of all objects of class ‘number’.
  • The ‘new’ keyword creates a new object of the ‘number’ class. That is, it creates an empty object, then links that object to the class prototype so that every function inside the prototype can be called on it. Then, it calls the function ‘number’ itself, with the provided parameter values.
  • The ‘this’ keyword acts as a variable. Whenever a member of an object is called, ‘this’ becomes equal to that object, so that any operations applied to ‘this’ will be applied to the object. When the ‘number’ function is called because of the ‘new’ keyword, ‘this’ becomes equal to the newly created object.

In detail, what the example above does is:

  • Define a function called ‘number’.
  • Decide that ‘number’ is in fact a class, and define member functions ‘increase’ and ‘show’.
  • Create a new instance of ‘number’: this creates an object, makes the ‘increase’ and ‘show’ functions available on it, sets ‘this’ to that object, and calls ‘number’.
  • When ‘number’ is called, it creates a new member ‘x’ and sets it to 10. So, the new object has a member ‘x’ equal to 10.
  • The program calls the ‘show’ function. This sets ‘this’ to the object, and the function reads the ‘x’ member of that object (which equals 10) and displays it.
  • The program calls the ‘increase’ function. This sets ‘this’ to the object, and the function adds one to the ‘x’ member of that object (so it now equals 11).
  • The program calls the ‘show’ function. This sets ‘this’ to the object, and the function reads the ‘x’ member of that object (which is now 11) and displays it.

Note that the value of ‘this’ is restored after a member function returns.
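One way to check the description above is to create two objects of the same class and compare them. In the sketch below, the class is the one defined earlier, minus the message box: ‘show’ is replaced by a hypothetical ‘get’ that hands the value back using the ‘return’ keyword, which this primer does not otherwise cover.

```javascript
function number(x)
{ this.x = x }

number.prototype.increase = function() { this.x = this.x + 1 } ;
number.prototype.get = function() { return this.x } ;

var a = new number(10) ;
var b = new number(20) ;

a.increase() ;   // 'this' is a during this call, so only a.x changes

console.log(a.get()) ;  // 11
console.log(b.get()) ;  // 20

// Both objects use the very same function, found through the prototype.
console.log(a.increase === b.increase) ;  // true
```

Each object carries its own ‘x’, while the member functions exist only once, in the prototype.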

But I Can’t Remember All This!

Of course, you can’t. This kind of stuff takes time and patience to remember. So, take your time and read it a few times. If you don’t understand something, try it out to see for yourself what happens. If you really don’t get it, ask around. I’m willing to answer questions if you ask them in comments to this entry, and you can always ask the good folks at gamedev.net. After a few weeks of practice, you’ll be able to do most of this on your own.

Game Design

Before you jump right into coding (and drawing your assets, if applicable) you need to have a fairly complete and detailed description of the game you intend to develop. Otherwise, like a mason without a plan, the house you build is unlikely to ever be finished, and will crash to the ground in strong wind if it is.

The basic requirements for a Pong game are as follows:

  • The Pong game contains a square ball and two rectangular paddles.
  • The ball moves at a certain speed along both horizontal and vertical axes.
  • When it hits the top or bottom of the screen, the vertical velocity is reversed.
  • When it hits the left or right edges of the screen, the horizontal velocity is reversed and the ball is moved to the center (horizontally). The player on the other side gains a point.
  • When it hits a paddle, the horizontal velocity is reversed.
  • Paddles are vertical, and there is one on each side of the screen.
  • They move up and down at a fixed speed.
  • When they hit the top or bottom edges of the screen, they stop moving.
  • They move slightly slower than the ball (so that “follow the ball” is not a winning strategy).
  • The player controls the left paddle, an AI controls the right paddle.

Writing the program

The game is a large piece of functionality. However, programming only allows us to express small bits of functionality. So, we will have to split up the program into smaller pieces before we can create it.

One common way of splitting things is the MVC architecture. MVC stands for Model/View/Controller, which can be explained like this:

  • The Model describes everything that happens behind the scenes. It doesn’t care about how the game will be displayed on the screen, or how the input from the player will be received. What it cares about is where the ball is, where the paddles are, how the ball should bounce when it hits something, and when scores should increase. This will be an entire object containing all the data required to describe the game, along with functions that compute the movement of the paddles and ball over time.
  • The View describes how the game should be displayed. It doesn’t really care why the ball and paddles move a certain way: all it cares about is where the ball and paddles are, and which way they are moving. It gets this information from the model, and displays it in one way or another.
  • The Controller is what makes the game interactive: it reads user input in one way or another, generates AI strategies, and enters all that data into the model. Once the model has computed the new positions and scores, it asks the view to draw it, then starts again.

Applying MVC splits the program into three parts that are much cleaner and, therefore, potentially easier to create. It’s usually considered a good first step when designing a program.
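As a rough illustration of how the three parts talk to each other, here is a toy skeleton. It is not the Pong code: the names “toyModel”, “toyView” and “toyController” are hypothetical, the “game” is a single number moving at a fixed speed, and the view just records lines of text instead of drawing.

```javascript
// Model: the state of the game and the rules that change it.
function toyModel()
{ this.pos = 0;
  this.vel = 0; }

toyModel.prototype.move = function(d) { this.vel = d * 10 };
toyModel.prototype.update = function() { this.pos += this.vel };

// View: reads the model and displays it (here, as lines of text).
function toyView()
{ this.lines = [] }

toyView.prototype.draw = function(model)
{ this.lines.push("pos = " + model.pos) };

// Controller: feeds input into the model, updates it, asks the view to draw.
function toyController(model, view)
{ this.model = model;
  this.view = view; }

toyController.prototype.step = function(input)
{ this.model.move(input);
  this.model.update();
  this.view.draw(this.model); };

var m = new toyModel();
var v = new toyView();
var c = new toyController(m, v);

c.step(1);   // input: move forward
c.step(1);
c.step(-1);  // input: move back

console.log(v.lines.join(", ")); // pos = 10, pos = 20, pos = 10
```

Notice that the model never mentions the view or the controller, and the view only reads from the model: this is the dependency order described above.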

The Model

Since the view depends on the model, and the controller depends on both the view and the model, the first thing to be implemented is usually the model: it’s independent of anything else, and so it can be written on its own.

First, there is data to be stored that will change a lot over time:

  • The position of the ball, as well as the ball’s velocity.
  • The vertical position of each paddle, and its vertical velocity.
  • The score of each player.

Then, there’s also data that will not change:

  • The default horizontal and vertical speed of the ball.
  • The horizontal position and default vertical speed of the paddles.
  • The dimensions of the field, ball and paddles.

We can store these constants in objects that will never be changed:

var field    = { w: 320, h: 240 };
var paddle   = { x: 300, w:   2, h:  30, s: 76 };
var ball     = { w:   5, h:   5, s: 120 };

The changing data itself should be stored in a model object, which should store the different categories of model data in sub-objects:

function model()
{ var s = ball.s;
  this.left   = {  pos: 0,   vel: 0 };
  this.right  = {  pos: 0,   vel: 0 };
  this.ball   = {    x: 0,     y: 0,
                    vx: s,    vy: s };
  this.scores = { left: 0, right: 0 }
}

Before going on, let’s discuss these definitions a bit. Obviously, this code creates a model class with data for left and right paddles, a ball, and scores. Constants are also defined to describe the dimensions of the field, paddles and ball, as well as the default speeds of the ball and paddles. This can be inferred fairly easily from the code if you know Javascript. The unanswered question is, what do those numbers represent?

  • Since the play field is symmetric, I have chosen coordinates (0,0) to represent the center of the field. This way, the top and bottom edges are at -field.h and field.h respectively (which means the total height of the field is, in fact, 480 pixels and not 240 pixels), the left and right edges are at -field.w and field.w (total width of 640 pixels). In general, the width and height are actually half-widths and half-heights, and the position of an object is in fact the position of its center.
  • Speeds are expressed in pixels per second. Because the dimensions are half-sizes, this means that the ball goes from the center of the field to the top or bottom edge in 2 seconds and to the left or right edge in about 2.7 seconds, and a paddle goes from the center to the top or bottom edge in a bit more than 3 seconds (all of this is computed without taking the dimensions of the elements into account).
  • The scores are points. You add one point to the winner’s score on each round.
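These conventions are easy to check with a bit of arithmetic. The sketch below repeats the three constant objects and derives the edges, the full size of the field, and how long the ball and a paddle need to go from the center to an edge (ignoring their own dimensions). The “edges” and “seconds” names are introduced only for this example.

```javascript
var field  = { w: 320, h: 240 };
var paddle = { x: 300, w: 2, h: 30, s: 76 };
var ball   = { w: 5, h: 5, s: 120 };

// (0,0) is the center, so each edge sits one half-size away from it.
var edges = { top: -field.h, bottom: field.h,
              left: -field.w, right: field.w };

console.log(edges.bottom - edges.top);   // 480: total height in pixels
console.log(edges.right - edges.left);   // 640: total width in pixels

// Time = distance / speed, with speeds in pixels per second.
var seconds = { ballVertical:   field.h / ball.s,
                ballHorizontal: field.w / ball.s,
                paddleVertical: field.h / paddle.s };

console.log(seconds.ballVertical);                // 2
console.log(seconds.ballHorizontal.toFixed(2));   // 2.67
console.log(seconds.paddleVertical.toFixed(2));   // 3.16
```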

Initially, everything is centered, and the ball moves to the bottom-right (and since it moves faster than the AI’s paddle, the AI always loses the first round, but that’s not really a problem because it lets the player get his bearings for nearly two seconds).

The most elementary thing that can happen to the model is to be controlled: this makes the paddles move by changing their velocity. So, when a player presses a key, the velocity changes, and when the key is released the velocity resets to zero. The model doesn’t care about keys (that is the job of the controller), only about directions, so the movement functions have a direction parameter:

model.prototype.moveLeft = function(d)
{ this.left.vel  = d * paddle.s }; 

model.prototype.moveRight = function(d)
{ this.right.vel = d * paddle.s };

These functions are added to the model class and move the left and right paddle, at the appropriate speed, depending on the direction. Here, we assume that the direction parameter d will equal -1 for “up”, 0 for “don’t move” and 1 for “down”, and we’ll have to take care in the controller in order to guarantee that.
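Here is a short usage sketch, repeating only the pieces of the model needed for it. The “direction” helper is hypothetical: it shows one way a controller could turn two “key is down” flags into the -1, 0 or 1 that the model expects, and is not taken from the actual game.

```javascript
var paddle = { x: 300, w: 2, h: 30, s: 76 };

function model()
{ this.left  = { pos: 0, vel: 0 };
  this.right = { pos: 0, vel: 0 }; }

model.prototype.moveLeft = function(d)
{ this.left.vel  = d * paddle.s };

model.prototype.moveRight = function(d)
{ this.right.vel = d * paddle.s };

// Hypothetical helper: -1 for "up", 1 for "down", 0 otherwise
// (including when both keys are held at once).
function direction(upPressed, downPressed)
{ if (upPressed && !downPressed) { return -1 }
  if (downPressed && !upPressed) { return 1 }
  return 0 }

var m = new model();

m.moveLeft(direction(true, false));   // "up" key held
console.log(m.left.vel);              // -76

m.moveLeft(direction(false, false));  // key released
console.log(m.left.vel);              // 0

m.moveRight(direction(false, true));  // "down" for the right paddle
console.log(m.right.vel);             // 76
```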

The other thing that the model does is make things move. In the computer world, movement is a quick succession of different images which give the illusion of movement (as with flipbooks). So, instead of working in a continuous motion, the model should skip ahead in time by a certain duration (known as the time step). Here, I’m going to choose a reasonable time step: 10 milliseconds (that’s 100 steps per second):

var timestep = 0.01;
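To see what a time step means in practice, the sketch below advances a single position by repeated small steps. The “pos” and “vel” variables are stand-ins for any moving element of the game. With a speed of 120 pixels per second, 100 steps of 10 milliseconds should cover about 120 pixels.

```javascript
var timestep = 0.01;   // seconds per step
var vel = 120;         // pixels per second
var pos = 0;           // pixels

var steps = 0;
while (steps < 100)
{ pos += timestep * vel;   // distance = time * speed
  steps = steps + 1 }

// Rounded, because adding fractions a hundred times can leave a tiny error.
console.log(Math.round(pos)); // 120
```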

The update function is pretty heavy: it has to move the objects around (to match those 10 milliseconds) then determine if the ball should bounce, if the paddles should stop moving, and if the scores should increase. The entire function is:

model.prototype.update = function()
{
  var t = timestep; 

  this.left.pos  += t * this.left.vel;
  this.right.pos += t * this.right.vel;
  this.ball.x    += t * this.ball.vx;
  this.ball.y    += t * this.ball.vy; 

  var clip = function(pad)
  { var pos = pad.pos;
    pos = Math.min(pos,field.h - paddle.h);
    pos = Math.max(pos,paddle.h - field.h);
    if (pos != pad.pos) pad.vel = 0;
    pad.pos = pos;
  }; 

  clip(this.left);
  clip(this.right); 

  if (this.ball.y + ball.h > field.h)
  { this.ball.vy = - ball.s } 

  if (this.ball.y - ball.h < - field.h)
  { this.ball.vy = ball.s } 

  if (this.ball.x - ball.h < - paddle.x + paddle.w &&
      this.ball.x + ball.h > - paddle.x - paddle.w &&
      this.ball.y + ball.h > this.left.pos - paddle.h &&
      this.ball.y - ball.h < this.left.pos + paddle.h )
  { this.ball.vx = ball.s } 

  if (this.ball.x - ball.h < paddle.x + paddle.w &&
      this.ball.x + ball.h > paddle.x - paddle.w &&
      this.ball.y + ball.h > this.right.pos - paddle.h &&
      this.ball.y - ball.h < this.right.pos + paddle.h)
  { this.ball.vx = - ball.s }  

  if (this.ball.x - ball.h < -field.w)
  { this.scores.right++;
    this.ball.x = 0;
    this.ball.vx = ball.s } 

  if (this.ball.x + ball.h > field.w)
  { this.scores.left++;
    this.ball.x = 0;
    this.ball.vx = - ball.s } 

};

Let’s go through it step-by-step:

  • First, it moves objects around. This is done by adding (+=) the moved distance (t * speed) to the position of objects. This computes the final position of objects after that step. The next steps then handle collisions.
  • The first collision test is handled by the “clip” function, which prevents the paddles from leaving the screen (it’s a function because there are two paddles, so I created it once and used it for both paddles). The basic rule is: clip the paddle to the screen and, if it went outside, set its speed to zero. Since the position of the paddle is the position of its center, I have to take the height of the paddle into account in order to avoid leaving the field.
  • The next two collision tests happen between the ball and the top and bottom edges of the field: if the ball went beyond these, the vertical speed is set so that the ball moves back into the field.
  • The next two collision tests happen between the ball and the paddles. Each one tests whether the ball-rectangle intersects the paddle-rectangle (collision detection is a topic in itself) and, if it does, sets the horizontal speed so that the ball moves back into the field.
  • The last two collision tests check if the ball leaves the field on either side. When this happens, the score of the corresponding player is increased (++), the ball is moved to the center (x=0) and its speed is reversed.
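The ball-against-paddle test in the steps above can be isolated as a small helper. This is only a sketch: the function name and the sample rectangles are assumptions, and each rectangle is described by its center and its half-width and half-height, as in the model. The update function uses ball.h for both axes, which gives the same result if the ball is square.

```javascript
// Two axis-aligned rectangles, each given by its center (x, y),
// half-width w and half-height h, intersect when they overlap
// on both the horizontal and the vertical axis.
function overlaps(a, b)
{ return a.x - a.w < b.x + b.w &&
         a.x + a.w > b.x - b.w &&
         a.y - a.h < b.y + b.h &&
         a.y + a.h > b.y - b.h;
}

// Hypothetical rectangles, for illustration only.
var padRect  = { x: 300, y: 0,  w: 5, h: 40 };
var ballNear = { x: 295, y: 10, w: 5, h: 5 };
var ballFar  = { x: 0,   y: 0,  w: 5, h: 5 };

var hit  = overlaps(ballNear, padRect);   // true
var miss = overlaps(ballFar, padRect);    // false
```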

So, if we call the “update” function, it computes the state of the game after 10 milliseconds. If we do it often enough, we can get the game state in real time. Now, we have to display it.

The View

The view is the part of the program responsible for displaying things to the player. It reads data from the model, and somehow maps it to whatever display techniques the program has.

In this example, I’m using the jQuery library. It allows the program to resize and move around DIV elements of a web page (by default, a DIV element is a rectangle with a set position). So, the first step I will take is create the HTML page which will be manipulated by the view:

<html>
  <head><title>Pong</title>
  <script type="text/javascript" src="jquery.js"></script>
  <script type="text/javascript" src="pong.js"></script>
  <style>
#playfield {
  color           : black;
  background-color: black;
  border          : 0px none black;
  margin          : 0px;
  padding         : 0px;
  position        : absolute;
  top             : 0px;
  left            : 0px }
div {
  font-size       : 1px;
  color           : white;
  background-color: #33FF33;
  margin          : 0px;
  padding         : 0px;
  position        : absolute;
  top             : 0px;
  left            : 0px }
div.score {
  color           : #33FF33;
  background-color: transparent;
  width           : 640px;
  font-size       : 20px;
  font-family     : courier new }
  </style>
  </head>
  <body>
    <input id="playfield"/>
    <div><div id="paddle_left"></div></div>
    <div><div id="paddle_right"></div></div>
    <div><div class="score" style="text-align:left" id="score_left">0</div></div>
    <div><div class="score" style="text-align:right" id="score_right">0</div></div>
    <div><div id="ball"></div></div>
  </body>
</html>

This creates several rectangles, with identifiers, and also determines their color and margin. Note that the playfield is an “input” (because we need to receive input from the player, so we use this object as a trick). Also, all “div” elements are within “div” elements, to prevent their relative positions from influencing each other (because several elements with an absolute position are within the same body).

Most of the above should be fairly obvious to you (if it isn’t you can peek at a few online courses: html and css).

A point of interest here is the fact that the HTML file includes our script (pong.js) as well as the jQuery script (jquery.js). You can download jQuery here.

What is interesting about jQuery is that it allows selecting the objects quite easily, and then telling them to do things such as moving. An extremely important function in the view is this:

var move = function(avatar,pos,spd,size)
{ var x = pos.x - size.w + field.w;
  var y = pos.y - size.h + field.h;
  avatar.stop()
    .css({left:x,top:y})
    .animate({left:x+10*spd.vx, top:y+10*spd.vy},10000,"linear");
};

This function has four parameters: the ‘avatar’ is a jQuery object representing one of the “div” elements above (the ball, a paddle etc), the ‘pos’ is the position (with x and y coordinates) where that object should be, ‘spd’ is the velocity (with vx and vy coordinates) with which the object moves, and ‘size’ is the dimensions (with w and h coordinates) of that object. The dimensions are required because the position of a jQuery object (with ‘absolute’ positioning, as we are using here) is expressed as an offset between the top-left corner of the screen and the top-left corner of the object itself, whereas the position in the model is the offset between the center of the playing field and the center of the object. So, variables ‘x’ and ‘y’ are computed to represent the object’s position on the HTML page.

This function stops the current movement of the avatar with ‘stop()’ then sets the left and top coordinates with ‘css()’, and finally asks the object to move with ‘animate()’ by specifying the destination (which we compute by adding the distance traversed in ten seconds to the current position), the duration (10 seconds = 10000 milliseconds) and the animation style (“linear”, which means there are no accelerations and no brakes).
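The arithmetic behind this function can be tried without jQuery. The sketch below assumes a playing field with a half-width of 320 and a half-height of 240, and the two helper names are made up for illustration; only the formulas come from the function above.

```javascript
// Hypothetical half-dimensions of the playing field.
var field = { w: 320, h: 240 };

// Convert a center-based model position into the top-left
// page offset used by absolute CSS positioning.
function toPage(pos, size)
{ return { left: pos.x - size.w + field.w,
           top:  pos.y - size.h + field.h };
}

// Where the object would be after ten seconds at velocity spd.
function destination(page, spd)
{ return { left: page.left + 10 * spd.vx,
           top:  page.top  + 10 * spd.vy };
}

// An object with half-size 5 at the center of the field...
var start = toPage({ x: 0, y: 0 }, { w: 5, h: 5 });
// start is { left: 315, top: 235 }

// ...moving right and up.
var end = destination(start, { vx: 100, vy: -50 });
// end is { left: 1315, top: -265 }
```

The destination may well be outside the page: that is fine, because the animation is expected to be replaced by a new one long before the ten seconds are over.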

In short, this function should be called whenever the movement of an object changes: jQuery then takes care of animating the object until its movement changes again. This way, we don’t have to change the movement of objects every ten milliseconds.

However, this requires the view to store the previous value of movement, in order to determine when movement changes by comparing the new value and the old value (if they’re the same, no changes happened). So, the view object would look like this:

function view()
{ this.ball   = {   vx: null,    vy: null };
  this.scores = { left:    0, right:    0 };
  this.paddle = { left: null, right: null };
  this.avatar = { ball: $('div#ball'),
                  sc_l: $('div#score_left'),
                  sc_r: $('div#score_right'),
                  pa_l: $('div#paddle_left'),
                  pa_r: $('div#paddle_right'),
                  play: $('input#playfield') }; 

  this.avatar.ball.width(1.9*ball.w)  .height(1.9*ball.h);
  this.avatar.pa_l.width(1.9*paddle.w).height(1.9*paddle.h);
  this.avatar.pa_r.width(1.9*paddle.w).height(1.9*paddle.h);
  this.avatar.play.width(2*field.w)   .height(2*field.h);
}

This class does the following things:

  • It defines members “ball” and “paddle”, which contain the velocities (but not positions) of the paddles and the ball. So, when velocities change, the view notices and asks jQuery to change the trajectories. It also stores the currently displayed scores (so that, when these change, they are updated).
  • It defines the avatars: this is done by using the jQuery selector: $('div#paddle_left') selects the DIV element with the identifier ‘paddle_left’ that we defined in the HTML page earlier.
  • It resizes the avatars based on the constants, with the functions ‘width()’ and ‘height()’. Note that the width and height of the ball and paddles are smaller than they should be (1.9 times the constant instead of 2). This is a common trick for games: players are bound to notice some close misses and interpret them as “this shouldn’t have missed”, so making the image of the paddles and ball smaller reduces near misses and keeps the players happier.

Next, the view object uses a ‘render()’ function to display things. This function will use the ‘move()’ function we defined earlier:

view.prototype.render = function(model)
{ var move = function(avatar,pos,spd,size)
  { var x = pos.x - size.w + field.w;
    var y = pos.y - size.h + field.h;
    avatar.stop()
      .css({left:x,top:y})
      .animate({left:x+10*spd.vx, top:y+10*spd.vy},10000,"linear");
  }; 

  if (this.ball.vx != model.ball.vx || this.ball.vy != model.ball.vy)
  { this.ball.vx = model.ball.vx;
    this.ball.vy = model.ball.vy;
    move(this.avatar.ball,model.ball,this.ball,ball) } 

  if (this.paddle.left != model.left.vel)
  { this.paddle.left = model.left.vel;
    move(this.avatar.pa_l,
         {x: -paddle.x, y:model.left.pos},
         {vx: 0, vy: this.paddle.left}, paddle) } 

  if (this.paddle.right != model.right.vel)
  { this.paddle.right = model.right.vel;
    move(this.avatar.pa_r,
         {x: paddle.x, y: model.right.pos},
         {vx: 0, vy: this.paddle.right}, paddle) } 

  if (this.scores.left != model.scores.left)
  { this.scores.left = model.scores.left;
    this.avatar.sc_l.html(model.scores.left) } 

  if (this.scores.right != model.scores.right)
  { this.scores.right = model.scores.right;
    this.avatar.sc_r.html(model.scores.right) }
}

The render function always checks whether the current value inside the model is different (!=) from the value stored in the view (this). When this is the case, the ‘move’ function is called to move the appropriate avatar in the appropriate fashion (or, in the case of scores, the inner text of the score avatar is changed using the ‘html()’ function from jQuery).
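The compare-then-store pattern used by the render function can be sketched on its own. In this illustration, the ‘tracker’ helper and its callback are hypothetical stand-ins for the view’s stored values and for the ‘move’ function.

```javascript
// Remember the last value seen and call onChange only
// when a different value shows up.
function tracker(onChange)
{ var lastSeen = null;
  return function(value)
  { if (lastSeen != value)
    { lastSeen = value;
      onChange(value) }
  };
}

var calls = 0;
var track = tracker(function(v) { calls++ });

track(200);   // changed (null -> 200): callback runs
track(200);   // unchanged: nothing happens
track(200);   // unchanged: nothing happens
track(0);     // changed (200 -> 0): callback runs
// calls is now 2
```

Starting from null guarantees that the very first value always counts as a change, which is why the view initializes its stored velocities to null.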

The Controller

The last piece of the game is the controller. It should gather information from the user, compute the AI’s response to ball movement, measure how much time has elapsed since the last time the view was refreshed, and update the model accordingly.

This time, the controller has no reason to be an object, because nobody will manipulate it (except, of course, itself). Instead, we can just define some values inside a function and use them to do the job:

var gameModel = new model();
var gameView  = new view();

As a first step in implementing the controller, we need to extract the keypresses from the user and write a function that applies them. This is done as follows:

  var keys  = { up: false, down: false }; 

  gameView.avatar.play.keydown(function(event){
    if (event.which == 38) keys.up = true;
    if (event.which == 40) keys.down = true;
    return false
  }); 

  gameView.avatar.play.keyup(function(event){
    if (event.which == 38) keys.up = false;
    if (event.which == 40) keys.down = false;
    return false
  });   

  var playerMoveLeft = function(model)
  { model.moveLeft((keys.up ? -1 : 0) + (keys.down ? 1 : 0)); }

The ‘keydown’ function provided by jQuery has a single parameter, which is a function. That parameter will be called whenever an unpressed key is pressed when the playing field is selected. The same happens for ‘keyup’, which reacts when a pressed key is released. The functions check what the key was by looking at ‘event.which’: a value of 38 is the “up” arrow key while a value of 40 is the “down” arrow key (37 and 39 are the left and right arrow keys, respectively). The functions also use the statement ‘return false’: this notifies the playing field (which is originally a text field from a form) that the key press should be ignored, instead of adding the letter to the field.

What the functions do is set the value of keys.up and keys.down: the “playerMoveLeft” function then reads the value of the keys to determine which keys are currently pressed. The “a ? b : c” construct uses “b” if “a” is true, and “c” otherwise. So:

  • If the up key is pressed but the down key is not, the direction is -1 + 0 = -1
  • If the down key is pressed but the up key is not, the direction is 0 + 1 = 1
  • If neither key is pressed, the direction is 0 + 0 = 0
  • If both keys are pressed, the direction is -1 + 1 = 0

So, the code above correctly responds to player input whenever the ‘playerMoveLeft’ function is called.
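The four cases can be checked directly by pulling the expression out into a small function. The name ‘direction’ is made up for this sketch; the expression itself is the one used in ‘playerMoveLeft’.

```javascript
// Direction derived from the state of the two arrow keys.
function direction(keys)
{ return (keys.up ? -1 : 0) + (keys.down ? 1 : 0) }

var cases = [
  direction({ up: true,  down: false }),   // -1
  direction({ up: false, down: true  }),   //  1
  direction({ up: false, down: false }),   //  0
  direction({ up: true,  down: true  })    //  0
];
```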

Another step is to define a similar aiMoveRight function for computing the AI response. We want the AI to do something smart enough to be challenging. However, since the ball moves slowly from left to right, a perfect AI could simply compute where the ball will land and wait there: it would therefore never lose. This is a good thing for computer science, but a bad thing in a game: a good game AI is not an AI that always wins, but an AI that loses in a challenging and fun manner.

The following code accomplishes this:

  var aiMoveRight = function(model)
  { var dir = 0;
    if (model.ball.vx > 0)
    { var dist = (paddle.x - model.ball.x);
      var target = model.ball.y + model.ball.vy * (dist / model.ball.vx);
      while (target > field.h-ball.h || target < ball.h-field.h)
      { if (target > 0) { target = 2*(field.h-ball.h) - target }
        else            { target = 2*(ball.h-field.h) - target }
      }            

      dir =
        (model.right.pos + paddle.h < target ?  1 : 0) +
        (model.right.pos - paddle.h > target ? -1 : 0)  

      var accuracy =
        field.h * Math.exp((model.scores.left - model.scores.right)/3); 

      if (dist > accuracy)
      { dir = -dir }
    } 

    model.moveRight(dir);
  }

The general structure of the code is reasonably easy to get: the code computes the direction in which the AI (right) paddle should move, then sets that direction on the last line. If the ball is not moving right (that is, unless vx > 0) the AI paddle does not move (dir = 0). However, if the ball is indeed moving right, then the computation starts: the algorithm computes the distance between the ball and the paddle. From that, it computes the point where the ball should meet the paddle (and stores the vertical coordinate in the ‘target’ variable). Once the target is known, the direction is computed: if the paddle is below the target, it moves up, and if it’s above, it moves down. This is a perfect AI: it never loses because it always knows where the ball hits and moves there immediately. So, we dumb it down by adding accuracy: if the ball is further away from the paddle than a certain distance threshold (stored in the ‘accuracy’ variable) then the paddle does the opposite of what it should do (dir = -dir).

The two important questions here are: how is the target determined, and how is the accuracy determined?

The target is computed by determining where the ball would end up if there were no top or bottom edges: one simply calculates the horizontal distance and deduces from the horizontal and vertical speed ratio the vertical distance that it will go from its current position. Once this position is known, the code takes into account bounces: if the target is outside the top or bottom edges, then it will bounce, which means that the target position should be reflected around that edge. For the bottom edge, reflection is done with ‘target = 2*(field.h-ball.h) - target’ and for the top edge it’s done with ‘target = 2*(ball.h-field.h) - target’. The AI repeats this reflection process until the target is within the playing field.
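To make the reflection easier to follow, here is the same computation as a standalone function with worked numbers. The function name, the ‘limit’ parameter (which stands for field.h - ball.h) and the sample values are assumptions for illustration; the sketch also assumes vx is greater than zero, as in the AI code.

```javascript
// `limit` is the largest distance from the center line that the
// ball's center can reach, i.e. field.h - ball.h in the AI code.
function predictTarget(ballY, vy, vx, dist, limit)
{ var target = ballY + vy * (dist / vx);
  while (target > limit || target < -limit)
  { if (target > 0) { target =  2 * limit - target }
    else            { target = -2 * limit - target }
  }
  return target;
}

// Hypothetical numbers: limit of 230, ball starting at y = 0.
// One bounce on the bottom edge: 400 is reflected to 60.
var oneBounce = predictTarget(0, 200, 100, 200, 230);

// Two bounces: -700 is reflected to 240, then to 220.
var twoBounces = predictTarget(0, -350, 100, 200, 230);
```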

The accuracy is computed by taking the exponential of the score difference (with a few modifiers, ‘field.h’ and ‘3’, obtained through playtesting). The point of using an exponential is that, as the score difference grows in favor of the player, the exponential increases substantially and so the accuracy also increases, whereas when the score difference grows in favor of the AI, the exponential decreases substantially without ever going below zero. So, the exponential yields an accuracy that increases when the player is doing well and shrinks, while staying positive, when the AI is ahead.

As a result, the AI starts playing perfectly when the player has three more points than the AI.
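A few sample values give a feel for how quickly the threshold moves. The sketch below assumes a half-height of 240 for the playing field, and the function name is made up; only the formula comes from the AI code.

```javascript
var field = { h: 240 };   // hypothetical half-height

// Distance threshold below which the AI moves correctly.
function accuracy(playerScore, aiScore)
{ return field.h * Math.exp((playerScore - aiScore) / 3) }

var even   = accuracy(0, 0);   // 240
var ahead  = accuracy(3, 0);   // about 652: player leads by three
var behind = accuracy(0, 3);   // about 88: AI leads by three
```

With these assumed dimensions, a three-point lead for the player pushes the threshold to more than two and a half times the half-height of the field, so the AI reacts correctly from much further away.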

The last thing the controller does is keep the model and view up to date. To do this, it computes the current time (in seconds, with decimals) and also remembers the last time it was updated. Then, if the last update is more than one timestep ago, it updates the model and increases the last-update-time by one timestep, and repeats this process until the model is up-to-date. Then, it renders the model.

  var time = function() { return +(new Date) / 1000 };
  var last = time(); 

  var act  = function()
  {
    playerMoveLeft(gameModel);
    aiMoveRight(gameModel); 

    var now = time();
    while (now > last)
    { gameModel.update();
      last += timestep } 

    gameView.render(gameModel);
    setTimeout(act,50);
  }

The time function uses a common trick: it creates a date object (which by default contains the current time), converts it to the number of milliseconds since a fixed point in time, and divides it by 1000. The ‘return’ statement means that when the function is called, it will be replaced by the returned value (var last = time();).

The ‘act()’ function applies input to the game model, then performs as many updates as necessary and finally renders the model. Then, it asks Javascript to execute ‘act()’ again in 50 milliseconds (that’s 20 times per second).
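The catch-up loop at the heart of ‘act()’ can be tried in isolation by passing the times in as parameters instead of reading the clock. The helper name and the sample values are assumptions; the values were picked so that the arithmetic is exact, whereas the game itself uses a step of 0.01 seconds.

```javascript
// Run `update` once per elapsed step, and return the new
// value of `last` once the model has caught up with `now`.
function catchUp(now, last, step, update)
{ while (now > last)
  { update();
    last += step }
  return last;
}

var updates = 0;

// Hypothetical values: one second elapsed, quarter-second steps.
var last = catchUp(1, 0, 0.25, function() { updates++ });
// updates is 4 and last is 1
```

Because the number of updates depends only on elapsed time, the game should run at the same speed whether ‘act()’ is called every 50 milliseconds or less regularly.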

The final step is to wrap all of this in a single ‘controller()’ function and ask jQuery to execute the entire thing when the document has finished loading:

function controller()
{ var gameModel = new model();
  var gameView  = new view();
  var keys  = { up: false, down: false }; 

  gameView.avatar.play.keydown(function(event){
    if (event.which == 38) keys.up = true;
    if (event.which == 40) keys.down = true;
    return false
  }); 

  gameView.avatar.play.keyup(function(event){
    if (event.which == 38) keys.up = false;
    if (event.which == 40) keys.down = false;
    return false
  });   

  var playerMoveLeft = function(model)
  { model.moveLeft((keys.up ? -1 : 0) + (keys.down ? 1 : 0)); } 

  var aiMoveRight = function(model)
  { var dir = 0;
    if (model.ball.vx > 0)
    { var dist = (paddle.x - model.ball.x);
      var target = model.ball.y + model.ball.vy * (dist / model.ball.vx);
      while (target > field.h-ball.h || target < ball.h-field.h)
      { if (target > 0) { target = 2*(field.h-ball.h) - target }
        else            { target = 2*(ball.h-field.h) - target }
      }            

      dir =
        (model.right.pos + paddle.h < target ?  1 : 0) +
        (model.right.pos - paddle.h > target ? -1 : 0)  

      var accuracy =
        field.h * Math.exp((model.scores.left - model.scores.right)/3); 

      if (dist > accuracy)
      { dir = -dir }
    } 

    model.moveRight(dir);
  } 

  var time = function() { return +(new Date) / 1000 };
  var last = time(); 

  var act  = function()
  {
    playerMoveLeft(gameModel);
    aiMoveRight(gameModel); 

    var now = time();
    while (now > last)
    { gameModel.update();
      last += timestep } 

    gameView.render(gameModel);
    setTimeout(act,50);
  } 

  act();
} 

$(function(){new controller});

The Article

This game uses 200 lines of Javascript and 50 lines of quick-and-dirty HTML. It took me about three hours to develop. This article is the longest article so far on Nicollet.Net and took me an entire day to write. You can expect other video games and maybe even other tutorials in the future, in the same vein as this one. If you have questions, or if you have enjoyed this, make sure to add a comment below. Links to this tutorial from places where beginners ask a lot of questions are also most welcome.