Monthly Archive for January, 2009

Blog Refactoring

The old categories of “Functional Tuesdays”, “Dynamic Wednesdays” and “Imperative Fridays” are getting increasingly restrictive, mostly because they were thought out quite quickly when I first started my blog five months ago. The main problem is that, while I do have things to talk about, quite often it’s a matter of one category being full of ideas and the others empty (with the inspirational category changing over time). This has obvious effects:

  • I don’t have an excellent idea for a column, so I’ll write a short article on an idea that’s good, but not awesome.
  • I can’t stay passionate about a topic for weeks, nor can I write everything in advance and still manage to fill the less passionate columns in time, which means a lot of ideas are just thrown away because they don’t fit my publishing schedule.

Another reason why I’m giving up on these is that WordPress seems to have wiped away all my categories. There are no more categories either on the public website or in the back-office, and I won’t be hunting for them in the code or database anytime soon.

Also, I won’t be changing the regular schedule: Tuesday, Wednesday and Friday remain fixed publishing days, although if I have too many things to say, I will add them anywhere during the week. This way I still have a quantity constraint (or else I won’t get anything done) and the fixed days will prevent me from writing the articles in a single Sunday evening rush.

So, What’s Your Name?

I’ve been having a lot of name-related problems, lately. Not my own name, but the names of various things I have to work with. Nothing unexpected: in the computer world, we tend to use names everywhere.

Every time there are objects to be manipulated by programmers, there’s a way to give names to these objects. It could be the “id” attribute in XML, it could be the name of a file in a filesystem, it could be the name of a variable or member in a program, or it could be the hostname of a web site… A name is, in theory, a set of characters from a predetermined alphabet (that varies based on the application) which is bound to an entity of some sort. Once you hold the name, you can get the associated entity directly.

This theoretical definition leads to two distinct issues:

  • There’s only a small number of acceptable names around. Of course, generic names like “clxbpf8990” can be used (and if you look at the ids generated by generate-id() in XSLT, this is exactly what happens) but the entire point of a name is to allow people to retrieve content based on the name. As far as the internet goes, there’s only a finite number of names any user can remember, and web users have become increasingly reliant on Google and the recently added Firefox 3.0 address bar to find sites with non-obvious names. How do we handle the inevitable collisions?
  • A programmer in Australia has a global variable named Foo in a C program. A programmer in Sweden, working on a different C program, also has a variable named Foo. Should the two collide? What about machine names on local networks? User names on the same machine? Not all names are global, which means that the scope of names has to be determined, and ways of handling collisions between local scopes must be invented.

Yet, it’s not all about collision. The most important element of naming is the ability to find the data bound to a name. This is what directories are for.

The simplest form of directory is the phone directory: you have a name on the left, and a phone number on the right. The directory also happens to be sorted by ascending key order so that binary (dichotomic) search can be used to find a given name in logarithmic time (how clever: in the age of search engines, we tend to forget that logarithmic search times have existed since the days of dictionaries and encyclopedias). Such directories exist in the computer world. The simplest would be the classic user directory, accessed through LDAP (Lightweight Directory Access Protocol) or perhaps Active Directory, but these are not the most frequently encountered by the common user. Another simple example is /etc/passwd:

username:!:100:100:office:/home/username:/usr/bin/sh

User name is on the left, various user-related information is to the right.
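As a minimal sketch of the phone-directory idea, here is a binary search over entries sorted by name, written in JavaScript. The contents of `directory` and the `lookup` function are made up for illustration.

```javascript
// Hypothetical directory: [name, value] pairs, sorted by ascending name.
const directory = [
  ["alice", "555-0101"],
  ["bob", "555-0102"],
  ["carol", "555-0103"],
  ["dave", "555-0104"],
];

// Binary (dichotomic) search: halve the candidate range at every step,
// so a lookup takes a logarithmic number of comparisons.
function lookup(entries, name) {
  let low = 0;
  let high = entries.length - 1;
  while (low <= high) {
    const middle = Math.floor((low + high) / 2);
    const [key, value] = entries[middle];
    if (key === name) return value;
    if (key < name) low = middle + 1;
    else high = middle - 1;
  }
  return undefined; // the name is not bound to anything
}
```

Calling `lookup(directory, "carol")` walks at most three entries here, and the search only works because the entries are kept sorted.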

Domain Name System

Let’s crank the complexity lever up a little bit. What classic directory format do we use several times a day? The Domain Name System (which most of you know as DNS). Everyone sends out queries to a DNS server, asking for the IP address associated with a name. That is, when you type “www.nicollet.net” in your browser, the browser needs to know what server to connect to. The problem is that looking for a server is done by routers, which are usually not very smart: they have routing tables based on masking the IP address (whether that address is IPv4 or IPv6) so they need that address to begin with. So, the browser first resolves the domain name to an address. This involves looking at the local “hosts” file to see if a definition exists (for instance, “localhost” tends to be bound to the loopback address “127.0.0.1”), then querying any local name services the operating system may provide (this allows you to connect to an HTTP server running on another machine on your network by using that machine’s name, without having to set up a local DNS server or register with a public one), and finally sending out a query to a DNS server somewhere on the internet (the address of the DNS server is either provided by the ISP itself, or manually entered by the user in a configuration wizard or file). The DNS server then returns the address for the domain.
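The lookup order described above can be sketched in a few lines of JavaScript. This is only a simulation: the three maps stand in for the hosts file, the local name services and a remote DNS server, and every name and address in them is made up.

```javascript
// Hypothetical data sources, standing in for the real ones.
const hostsFile = new Map([["localhost", "127.0.0.1"]]);
const localNames = new Map([["printer", "192.168.0.20"]]);
const dnsServer = new Map([["www.example.org", "203.0.113.7"]]);

// Try each source in order and return the first address found.
function resolveName(name) {
  for (const source of [hostsFile, localNames, dnsServer]) {
    if (source.has(name)) return source.get(name);
  }
  return null; // nobody knows this name
}
```

Because the sources are tried in order, an entry added to `hostsFile` would shadow whatever the DNS server says about the same name.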

This is where two of my recent name problems came from. As you might remember, one week ago I had to move my blog from one server to another, and in the process, I had to change the DNS entry so that readers who typed in “www.nicollet.net” connected to the new server instead of the old one.

The first issue was that DNS propagation is not instantaneous. Back in the days when the intertubes were often clogged, caching played a big role in avoiding too many DNS queries moving up to the reference DNS server for a given top-level domain. The downside is that when you type in “www.nicollet.net”, you’re not really asking the reference DNS server for domains ending in “.net”, you’re asking the DNS server provided by your ISP, which may decide that its copy of the “www.nicollet.net” address binding is correct, without even asking the reference server (this is what caching is for). So, it took about five hours for all visitors to be correctly redirected to the new website. If any of you posted any comments, they went to the old website and were lost along with it—sorry. Of course, as you can expect, this can get a lot more problematic once you have a highly interactive website. So, you tend to choose an IP address and stick with it (or, if you have to change it, then you have it forward everything to the new one until nobody uses it anymore).
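Here is a small JavaScript sketch of why a cached binding keeps pointing at the old server for a while. The addresses, the one-hour lifetime and the `makeCachingResolver` helper are all assumptions made for the example, and time is passed in explicitly (in seconds) to keep the sketch deterministic.

```javascript
// Hypothetical reference data; the address changes when the site moves.
const reference = new Map([["www.example.org", "203.0.113.7"]]);

// A caching resolver such as an ISP might run. `ttl` is in seconds.
function makeCachingResolver(ttl) {
  const cache = new Map(); // name -> { address, expires }
  return function cachedResolve(name, now) {
    const cached = cache.get(name);
    // A fresh cache entry is trusted without asking the reference server.
    if (cached && now < cached.expires) return cached.address;
    const address = reference.get(name);
    cache.set(name, { address, expires: now + ttl });
    return address;
  };
}

const cachedResolve = makeCachingResolver(3600);
const before = cachedResolve("www.example.org", 0);    // fetched, then cached
reference.set("www.example.org", "198.51.100.9");      // the site moves
const stale = cachedResolve("www.example.org", 1800);  // still the old address
const fresh = cachedResolve("www.example.org", 3600);  // cache expired: new address
```

Between the move and the expiry of the cache entry, visitors behind this resolver still reach the old address, which is the window in which comments got lost.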

This also meant that any e-mails sent to foobar@nicollet.net were routed to the old MX binding, “mx1.ovh.net” (for those who wonder, a DNS entry contains several bindings: a web browser would look at the main binding for that domain, while a mail delivery program would look for the Mail eXchange binding) even though I was expecting them on the new MX binding, “nicollet.net”. This took a short while to be sorted out.

The second issue was with the name of the server. See, being listed in a directory doesn’t change your name: it merely gives you a name by which you can be found. So, the name of the famous emperor of Wei is still Cao Cao, despite being posthumously named Wu. And the name of my new server still was r17474.ovh.net despite being now referred to as nicollet.net by the DNS. Then, of course, when some mail arrived for the user foobar@nicollet.net, my qmail-powered server quite naturally answered “there’s no user named foobar@nicollet.net here, please go away” in perhaps not so polite terms. So, I still had to convince my server that it was now known as “nicollet.net” so that the mail delivery could work. Nothing that the “hostname” UNIX command couldn’t solve.

What about local area networks? How come that you can access another machine on your network by using its name? There usually isn’t a central naming service which can be queried by computers on a LAN, and it wouldn’t be practical because it would have to be manually configured on every new computer. Instead, computers on a LAN use NetBIOS to set up local directories: whenever a new computer is hooked up to the network, it broadcasts its hostname and address to everyone else, and everyone else writes down the association in a local directory. Conflicts are resolved quite simply: if two computers have the same name, the last to broadcast is the one that everyone else remembers (this is quite useful if you have to move from an office to another and get a new IP as a consequence).
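The “last to broadcast wins” rule can be sketched as follows in JavaScript. This is a toy model of the behavior described above, not of the actual NetBIOS protocol; the machine name, the addresses and the `broadcast` helper are made up.

```javascript
// Hypothetical LAN: each machine keeps its own local directory (name -> address).
const lan = [new Map(), new Map(), new Map()];

// Broadcasting means every machine writes down the announced association.
function broadcast(network, name, address) {
  for (const localDirectory of network) {
    localDirectory.set(name, address); // a later announcement overwrites an earlier one
  }
}

broadcast(lan, "laptop", "192.168.0.5");
broadcast(lan, "laptop", "192.168.1.17"); // same name, new office, new address
```

After the second broadcast, every local directory maps “laptop” to the new address, without any central server having been involved.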

Names in Languages

Programming languages all provide the user with ways of naming entities that are significant, both for documentation purposes (it’s easier to understand what a value is when it has a reasonable name) and for cross-referencing data defined in one place and used in another.

On the one hand, you have the static compile-time approach. This is the easiest to work with by far, because all the nitty-gritty details are written down in the documentation of the compiler and the build scripts of your application. For instance, most compilers allow the definition of “include paths”, a list of locations on the filesystem where the definitions of objects can be found. When the compiler needs something (an included file in C or C++, a class in Java or ActionScript, a module in Objective Caml), it will look for an appropriately named file in all these locations.
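The search itself is simple enough to sketch in JavaScript. The file names, the include paths and the `findInPaths` function below are assumptions for illustration, and a set of strings stands in for the real filesystem.

```javascript
// Hypothetical filesystem contents and include paths.
const existingFiles = new Set([
  "/usr/lib/ocaml/list.cmi",
  "/home/me/project/lib/parser.cmi",
]);
const includePaths = [".", "/home/me/project/lib", "/usr/lib/ocaml"];

// Return the first location where the named file exists, or null.
function findInPaths(fileName, paths, files) {
  for (const directory of paths) {
    const candidate = directory + "/" + fileName;
    if (files.has(candidate)) return candidate;
  }
  return null;
}
```

The order of `includePaths` matters: if two locations contain a file with the same name, the one listed first hides the other.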

Let’s consider the Objective Caml compiler, “ocamlc”. Whenever a source file contains a reference to another module (by using a symbol in a module context, such as “ModuleName.member” or “Functor(ModuleName)” or “open ModuleName” or something like that), the compiler looks for the compiled interface of that module. This is done by looking for the file “moduleName.cmi” (which is generated manually by running “ocamlc moduleName.mli”) in all the locations configured with the compiler: first, the current directory, then paths specified through the environment variables, then include paths specified with the -I command-line argument. If you’re only compiling the module (flag -c), the result is a cmo file (or cmx file with “ocamlopt”). At link-time, you must then specify all the cmo files required by the program, and the compiler resolves all links based on the name of the cmo file (for simplicity, it’s possible to group together several cmo files as a single cma file : they are simply concatenated together, so using the cma library is equivalent to adding all the cmo files it contains manually).

Java goes a short step beyond this by eliminating the need for a link step. When you import a Java class, using the “import abc.def.Foobar;” statement, Java understands that by looking at the relative path “abc/def/” it will find either a “Foobar.class” or a newer “Foobar.java” that it can compile to “Foobar.class”. Here, the path is specified relative to one of the include paths (which Java calls classpaths) and can be a normal filesystem path or a path within a JAR archive.

C and C++ are both the simplest and the most complex. In these languages, every reference is explicit. The first step is compilation, where files are included in other files by specifying the relative path. The second step is linking, where libraries and objects are included by specifying the relative path. At each step, additional locations can be specified to look for headers (when compiling) or libraries (when linking).

PHP has an interesting dual approach to things. On the one hand, its normal cross-file interpretation system consists of include() and require() statements which look for the named file in any of the specified include paths for PHP, which makes it look like sh, C and C++. On the other hand, PHP has also introduced functionality for resolving class names: when a class is used (“ClassName::Member”, “Foobar extends ClassName” or “new ClassName”) but the class is not defined, a special function is called. This lets the user specify an alternative loading scheme, such as looking for a file named ClassName.php and including it. The Zend Framework makes heavy use of this, meaning that the inclusion approach is specified once in index.php (or a common configuration file) and then no other inclusion is used for class files (arguably, Zend_View still includes phtml files explicitly).
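To illustrate the idea of resolving class names on demand, here is a rough JavaScript analogy. It is not PHP's actual mechanism: the `files` table, the `autoload` hook and `resolveClass` are all hypothetical names, and “including a file” is simulated by calling a function that returns the class.

```javascript
// Hypothetical "files": each one, when included, defines a class.
const files = {
  "Greeter.php": () => function Greeter() { this.greet = () => "hello"; },
};

const loadedClasses = new Map();

// The user-supplied hook: given a class name, decide where to load it from.
function autoload(className) {
  const file = files[className + ".php"];
  return file ? file() : undefined;
}

// Look a class up by name, calling the hook only when it is not defined yet.
function resolveClass(className) {
  if (!loadedClasses.has(className)) {
    const loaded = autoload(className);
    if (loaded === undefined) throw new Error("Class not found: " + className);
    loadedClasses.set(className, loaded);
  }
  return loadedClasses.get(className);
}

const Greeter = resolveClass("Greeter");
const greeting = new Greeter().greet();
```

The naming convention (class name plus “.php”) lives in one place, the hook, instead of being repeated as an explicit inclusion wherever a class is used.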

On the other hand, there’s runtime access. This is harder to work with, because there’s less documentation available. Sure, some concepts, such as dynamic linking, are fairly well-documented mechanisms that look for the named file in the current directory first, then in other directories specified as environment variables or system-wide configuration elements (in the case of Java, in the classpath, for example). However, even then, some programs insist on working on their own.

I recently had an issue with Alfresco, namely alfresco-mmt.jar, which terminated with a NoClassDefFoundError. Looking for solutions on the internet, I found out that the class it was looking for was defined in a JAR nearby, so I adapted the classpath when running the jar so that the class could be found. Except it couldn’t. The problem with this was that the error did not specify where the class loader had been looking for the class (or perhaps it did internally, but as an end user with no access to the source code or, probably, a debugger, I couldn’t see it), so I had no way of understanding where the JAR with the class should have been placed. It turns out the application loaded the JAR from inside its own JAR, so I had to place it there.

An Object Philosophy : The Tools

There comes a time in every programmer’s career when the question of Object-Oriented Programming crops up. Various sources attempt to explain Object-Oriented Programming, though subtle nuances exist that often make such definitions incompatible with each other. Even worse, every programmer usually builds, over the years, their own version of what OOP is really about, and severe clashes can happen as a consequence.

Wikipedia’s approach is quite a pragmatic one: it accepts the existence of multiple definitions, and merely attempts to explain those concepts that are shared by most. Some define the core of OOP as the principles behind the ISP, LSP, OCP, DIP and SRP acronyms. Others take a build-your-own approach from a set of basic premises. It only seems fair that I take my own stab at it as well.

Object-Oriented Programming is usually used to refer to two distinct yet related concepts:

  • A set of tools that can be used for design and for programming: classes, objects, messages, inheritance, code metrics, design patterns…
  • The ways in which that set of tools may be used to construct software that guarantees certain benefits, such as low-cost maintenance or reuse of existing code.

It is impossible, and foolish, to separate the two. Using classes, objects and the other tools without proper discipline or skill does not bring any of the benefits that are associated with OOP—even worse, programmers who come expecting proper OOP use of these tools will be surprised and will often have a lot to do to correct the situation. No single Object-Oriented concept is a magic bullet capable of working wonders on its own.

Today, I’ll discuss the tools. The next article in the series will deal with the mindset.

Object-Oriented tools are language-independent and implementation-independent. Most languages have their own ways of providing support for these tools, and they can sometimes even be used in those languages that don’t provide any kind of helpful support. Considering Object-Oriented Programming as “What happens in Language X” often leads to trying to replicate language X features in language Y, even though language Y also provides support for OOP in a different way. Besides, since programming languages seldom provide the entire OOP toolset out of the box, restricting oneself to a single language also reduces the horizon of potential tools.

Objects

One thing everyone manages to agree on is that Object-Oriented Programming is about objects—what objects are, or how they should be used, is already a matter of disagreement in certain places. My own take on objects is that you don’t need to know what they are, you merely need to know how to manipulate them. This means that an OOP object can be implemented as a Java object, an Objective Caml abstract type or a C structure, as long as it behaves like OOP objects should.

First, where do objects come from? Before you can manipulate them, you must create them. The easiest way of getting a brand new object in your program is to create it yourself from scratch. Writing the string “Hello” in your source code will automatically create a string object representing “Hello” when the program is run. Most programming languages provide means of creating certain categories of objects directly in the code: strings, integers, floating-point numbers, booleans, functions and classes are typical examples. Yes, functions and classes are, from an Object-Oriented Programming perspective, objects. Some programming languages also allow the creation of arbitrarily complex objects on-the-fly as a literal (the Objective part of Objective Caml allows this, for instance), but most languages only allow the creation of complex objects from classes. I’ll get back to this class business later.

Messages

The sole purpose of objects is to receive messages. A message is a one-shot two-way communication channel between a piece of code and an object. The code sends a request to the object, and the object sends a response to the code. What the response looks like is up to the object: two distinct objects may respond differently to the same request, and even a single object may respond differently to the same request at two distinct points in time. Both the request and the response carry some data, usually a set of objects. A request also has an identifier that helps determine what the request is about: a “getAccountBalance” identifier means that the request is about getting the balance of an account, while a “setDuration” identifier means that the request is about changing the duration of something.

Not all objects can process all messages. For instance, one cannot expect a rubber duck to respond to a “getAccountBalance” message. So, not being able to process a message is a possibility to be considered—one of the main advantages of having language support for OOP is that statically typed languages can check at compile-time whether all messages are going to be processed successfully. But, again, more on that later.
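To make the vocabulary concrete, here is a minimal JavaScript sketch of requests, responses and unprocessed messages. The `send` helper and the two objects are hypothetical; JavaScript's own method call syntax would do the same job, but spelling it out shows the identifier and the data separately.

```javascript
// A request carries an identifier and some data; the response is whatever
// the object decides to send back.
function send(object, identifier, ...data) {
  const handler = object[identifier];
  if (typeof handler !== "function") {
    // The object cannot process this message.
    throw new Error("Message not understood: " + identifier);
  }
  return handler.apply(object, data);
}

// Hypothetical objects for illustration.
const account = {
  balance: 100,
  getAccountBalance() { return this.balance; },
  deposit(amount) { this.balance += amount; return this.balance; },
};
const rubberDuck = {
  squeak() { return "Squeak!"; },
};
```

Sending “getAccountBalance” to `account` before and after a “deposit” yields two different responses to the same request, while sending it to `rubberDuck` fails at runtime, which is exactly the failure a static type system tries to rule out at compile time.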

At this point, most OOP manuals summon forth a class definition of some form and happily declare “this class describes sets of objects, its methods define messages that the object can receive”. I find that approach quite annoying, because it forever equates, in the minds of the readers, “object” with “instance of a class” and “message” with “class method”. The consequence is that readers then find themselves at a loss when they encounter languages with class-less objects (such as Objective Caml) or class-independent implementations of Object-Oriented Programming (such as Service-Oriented Architectures).

Everything is an object

So, my example is going to be one simple piece of C pseudocode instead:

int squared(int x)
{
  return x * x;
}

printf("%d\n", squared(10));

In C terms, this pseudocode defines a function that returns its squared argument, then outputs the square of 10 on the standard output.

In Object-Oriented terms, this creates a new object and gives it the name squared. It does so by using a “function literal” which creates a brand new object by specifying what the object does when it receives a “function call” message: in this situation, the behavior would be to extract an integer object from the request, send that integer a “multiply by yourself” message, then respond with the result. Our example then sends the “function call” message associated with the integer literal 10 to the squared object, then binds the response to another “function call” message sent to the printf object.

In other words, a plain old piece of C code without a single occurrence of the class keyword can be interpreted as an object-oriented program. This extends even further: any piece of imperative code can be interpreted as sending messages to objects. And pure functional code, without a hint of imperative design, can also be expressed as sending messages to objects: every function is an object that can process the “function call” message.

So, in languages without classes, you can usually find:

  • Primitive objects : integers, booleans, floating-point numbers and similar kinds of values are objects which are created from literals. They accept a wide range of messages which correspond to their operators (add, multiply, subtract, boolean-and, boolean-or, and so on).
  • Aggregate objects : C structures, OCaml record types, JavaScript objects (ignoring the this keyword) that are a collection of sub-objects that can be accessed by name. They support “set value X” and “get value X” messages for every member X. They are created as literals by assigning the appropriate names with the appropriate initial values. They may or may not require preliminary definition of their type (for checking purposes in strongly typed languages), and they may or may not allow extension of their contents by sending a “set value Y” message for an as-yet-undefined member Y.
  • Functions : these objects accept only one message (“call function”). They are created as a literal which describes how the data is extracted from the request, what operations are performed on it, and what data is sent back as part of the response.
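The three kinds of objects listed above can all be found in a few lines of class-free JavaScript. The variable names are made up, and the comments give the message-based reading of each line.

```javascript
// Primitive objects, created from literals; operators play the role of messages.
const sum = 2 + 3;           // "add" sent to 2, with 3 as data
const both = true && false;  // "boolean-and" sent to true

// Aggregate object, created as a literal; members are accessed by name.
const point = { x: 1, y: 2 };
point.x = 10;                // "set value x"
point.z = 3;                 // extension with a not-yet-defined member
const y = point.y;           // "get value y"

// Function object: it accepts only the "call function" message.
const double = function (n) { return n * 2; };
const result = double(21);
```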

Semantic shifting

If Object-Oriented Programming were all about looking at previously written programs and noticing how they are Object-Oriented, there wouldn’t really be much of a point to doing it. The good news is that OOP is not only about observing objects within existing code: it can also be used to introduce brand new behaviors into existing objects. This happens by performing a semantic shift of what a message and an object are: you decide that a certain construct you just invented is an object, and that another construct is a message.

Let’s look at our C pseudocode example again:

int squared(int x)
{
  return x * x;
}

This can be interpreted as creating a new object, “squared”, which can handle the “function call” message. But it could also be interpreted as defining a new message, “squared”, which can be sent to integers. After all, since this is all a matter of interpretation, both interpretations can be correct at the same time: as long as squared(10) is a one-shot two-way communication channel between some code and the integer 10, it’s a message.

This semantic shift can be applied in almost all situations: a function with at least one argument can be seen as both an object that accepts a “call function” message, and as a message that can be sent to one of its arguments (with the other arguments as its associated data). So, now, you can go around and extend any of the objects that the programming language provides you with by creating the appropriate functions.
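Both readings of the same function can be written down side by side. In this JavaScript sketch, `sendTo` is a hypothetical helper (not a language feature) that treats a function as a message sent to its first argument, with the remaining arguments as associated data; `raisedTo` is likewise made up.

```javascript
function squared(x) {
  return x * x;
}
function raisedTo(x, exponent) {
  return Math.pow(x, exponent);
}

// Reading 1: "squared" is an object that receives a "call function" message.
const asObject = squared.call(undefined, 10);

// Reading 2: "squared" is a message sent to its first argument.
function sendTo(receiver, message, ...data) {
  return message(receiver, ...data);
}
const asMessage = sendTo(10, squared);
const withData = sendTo(2, raisedTo, 10); // "raisedTo" sent to 2, with 10 as data
```

Nothing about the function changes between the two readings; only the way of looking at the call does.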

Polymorphism

Back to our “squared” message. That message can certainly be applied to integers, but it’s equally valid to apply it to floating-point numbers. Besides, if we (by some miracle, or perhaps some feature of the language) managed to create a new kind of object which also supported multiplication by itself (such as a complex number) then the “squared” message could also be sent to that object. This is because that message merely requires that the object it’s applied to supports multiplication by itself and so it can be applied to any such object. This is called polymorphism: when a message can be sent to any object, regardless of its nature, as long as it supports a certain set of operations. This provides the benefit of avoiding code repetition: you only define your “squared” message once, and then you can apply it everywhere. In a theoretical, ideal object-oriented programming language (the part is played here by Smalltalk, a very nice OOP language), you would just write this line once:

squared := [:x | x * x ]
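A comparable sketch in JavaScript, for readers who don't speak Smalltalk. JavaScript does not let new kinds of objects respond to the `*` operator, so this version assumes a named `multiplyBy` message instead; the `real` and `complex` constructors are made up for the example.

```javascript
// One definition of "squared" for every object that understands "multiplyBy".
function squared(x) {
  return x.multiplyBy(x);
}

// Two hypothetical kinds of objects that support multiplication by themselves.
function real(value) {
  return {
    value,
    multiplyBy(other) { return real(value * other.value); },
  };
}
function complex(re, im) {
  return {
    re,
    im,
    multiplyBy(other) {
      return complex(re * other.re - im * other.im, re * other.im + im * other.re);
    },
  };
}

const nine = squared(real(3));
const minusOne = squared(complex(0, 1)); // i * i
```

The `squared` function never asks what kind of object it received: it only relies on the object being able to process “multiplyBy”.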

Interfaces

The problem is that languages don’t play nice with the concept of polymorphism. To determine whether a program behaves correctly at runtime, the language uses a type system. This type system allows expressing certain constraints over the values, and some rules to determine which constraints imply that the program works.

The C type system is a fairly crude one. Every value has a type, with types being mutually distinct. So, you have the type of all integers, and the type of all floating point numbers. And a function argument has only one type. So, our polymorphism example, applied to two types, becomes:

/* C has no overloading: each type needs its own function name. */
int squared_int(int x)
{
  return x * x;
}

float squared_float(float x)
{
  return x * x;
}

The problem, of course, is that there is no way in the C type system to express the constraint “supports multiplication by itself”. So, although the semantics of “multiply the argument by itself” are clear, the C compiler cannot understand that it’s safe to define one such function. For that matter, the Java, C++ and Objective Caml type systems won’t let us define such a function either (C++ templates come close, but they work by generating one function per type rather than by expressing the constraint). This is one of the reasons I chuckle when I hear Java advocates tout Java as a “pure object-oriented programming language”: despite all the effort poured into creating a clean object model, a lot of elementary Java operations (including multiplication) are not messages, and several elementary Java types (including integers) are not objects. If you’re looking for a pure OOP language (you don’t need one to do proper OOP, of course), I strongly suggest Smalltalk.

Back to the point: languages that support OOP actively, yet have a static type system, provide the notion of interface to help with type-checking. An interface is a set of constraints and requirements, stating what messages can be received by the object, what data has to be sent as part of the request, and what data will be returned as part of the response. So, one can express an interface with a “multiply by” message, and then define the “squared” function to work on all objects with that interface: the language will then see that the interface provides the “squared” function with all it needs (namely, the “multiply by” message) and consider the types to be correct.

In Java, the interface and function would look like this:

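interface Multipliable
{
  // Accepts "multiplyBy" message
  Multipliable multiplyBy(Multipliable other);
}

class Messages
{
  // Java has no free functions, so "squared" lives in a class as a static method
  static Multipliable squared(Multipliable x)
  {
    // Send "multiplyBy" message
    return x.multiplyBy(x);
  }
}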

Of course, we had to use explicit message names here, because the multiplication operator in Java is not a message supported by the type system. On the other hand, Objective Caml is a tad smarter, because it can guess the interface based on the messages being sent:

let squared x =
  x # multiplyBy x

(* squared : (< multiplyBy : 'a -> 'b ; .. > as 'a) -> 'b *)

Another consequence of Objective Caml’s more expressive type system is that it allows retrieving the correct type. For instance, if you pass an integer into Java’s “squared” function, you would get back a Multipliable object, but you wouldn’t know if it was an integer or something else. This can be annoying, because you, as a programmer, know that it’s an integer, but the language won’t let you use it as such until you remind the compiler that it should trust you (and it will still check it at runtime). By contrast, Objective Caml types are parametric. An integer would probably have a “multiplyBy : integer -> integer” function, so the squared function would correctly return an integer.

But all of that is idle bickering about type systems that deviates from the purpose of the article.

Classes

A very important element of object-oriented programming is the notion of class. Earlier on, I said that objects could only be constructed at runtime by literals. That is, you create a new object and set its various properties right away. This, of course, makes it difficult to define categories of similar objects, because you have to define them one at a time and also make sure they share similar properties.

Another way of creating objects has been proposed in several programming languages, to the point of becoming (mistakenly) a synonym for OOP. A class is an object—even though some impure programming languages do not treat it as such—which defines a single message called a constructor. Upon receiving a constructor message, the class creates (by some arcane magic implemented by the compiler) a brand new object with a predetermined set of properties that are a combination of the data sent with the constructor message and the data provided when the class itself was created.

The exact syntax for sending the constructor message varies from language to language, but usually involves something along the lines of “new Classname(arguments)”.

In practice, a class defines two things:

  • The data to be carried by the object. This data is either created from scratch by the constructor, gathered from the global scope, or extracted from the constructor message itself.
  • The list of messages that the object can process, along with the code to perform said processing.

Again, the exact syntax for defining a class varies. In Objective Caml, the constructor message is described as part of the class name and used to initialize the values in the object, while the messages to be processed are called “methods”:

class vector2d (x,y) =
object
  val x = x
  val y = y
  method length = sqrt(x *. x +. y *. y)
end

let v = new vector2d (1.0,2.0) in
  v # length

[Edit : corrected small typo] In Java (as well as C++ and C#) the constructor is described as if it were a method, which can sometimes be confusing. Also, the language does not automatically guess which interfaces are implemented by a certain object, so they have to be specified as part of the class definition.

class Vector2d
{
  private float x;
  private float y;
  public Vector2d(float x, float y)
  {
    this.x = x;
    this.y = y;
  }
  public float length()
  {
    // Math.sqrt works on doubles, so cast the result back to float
    return (float) Math.sqrt(x * x + y * y);
  }
}

Vector2d v = new Vector2d(1.0f, 2.0f);
v.length();

In JavaScript, classes are constructors, with methods being defined as part of that constructor’s prototype:

function vector2d(x,y)
{
  this.x = x;
  this.y = y;
}

vector2d.prototype.length = function()
{
  return Math.sqrt(this.x * this.x + this.y * this.y);
}

var v = new vector2d(1.0,2.0);
v.length();

Class extensions

In fact, classes support another message besides constructors: inheritance, or extension. When extending a class or inheriting from a class (two ways of expressing the same concept), one creates a new class which imports all the contents of another class, then adds its own set of contents. Most type systems apply a subtype relationship to inheritance, meaning that a function which can operate on an instance of the original class can also operate on an instance of the inheriting class.

Inheritance is a strange beast: on the one hand, it behaves a little like interfaces (and is in fact the only way of replacing interfaces in languages that don’t have them, such as C++) to provide polymorphism but on the other hand it is a non-polymorphic way of extending the functionality of a class. It happens quite often that entire programs avoid using inheritance at all, because interfaces solve all their polymorphism needs and they do not need the extension features of inheritance.
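As a sketch of extension, here is the prototype-based vector class extended in JavaScript. The `labeledVector` class, its `describe` method and the `isUnit` function are made up for the example; the extension mechanism shown (chaining prototypes with `Object.create`) is one common way of doing it, not the only one.

```javascript
function vector2d(x, y) {
  this.x = x;
  this.y = y;
}
vector2d.prototype.length = function () {
  return Math.sqrt(this.x * this.x + this.y * this.y);
};

// Hypothetical extension: a vector that also carries a label.
function labeledVector(label, x, y) {
  vector2d.call(this, x, y); // import the contents of the original class
  this.label = label;        // then add our own
}
labeledVector.prototype = Object.create(vector2d.prototype);
labeledVector.prototype.constructor = labeledVector;
labeledVector.prototype.describe = function () {
  return this.label + " has length " + this.length();
};

// A function written for the original class also accepts the extension.
function isUnit(v) {
  return v.length() === 1;
}

const east = new labeledVector("east", 1, 0);
```

Here `isUnit(east)` works even though `isUnit` knows nothing about labels, which is the subtype relationship at work, while `describe` is the extension part that plain vectors do not have.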

A summary

Object-oriented programming introduces the following set of concepts:

  • An object is a value manipulated by the program. Integers, booleans, strings, functions and classes are examples of objects. Objects are usually created using literals provided by the programming language.
  • A program works by sending messages to objects. A message transmits a request (with arguments) to an object and the object responds with some data. Calling a function on an argument can be seen as sending a message, and calling a member function or method can also be interpreted as sending a message.
  • An interface, in a statically typed programming language, represents a set of messages that an object must be able to process. This is used to ensure that a function argument can be used without causing a runtime error.
  • A class is a special object which can receive constructor messages. Every constructor message constructs a brand new object that is returned as part of the response. Usually, an object created from a class is said to be an instance of that class.
  • Classes can inherit from one another, which can be useful in some cases to extend their functionality in a non-polymorphic manner.

The next article in the series is here (starting on January 9, 2009).


