Monthly Archive for February, 2009

Documentation!

First, let’s see a short example. Consider Magento’s addImageToMediaGallery function, part of the product model class. Here is the complete documentation for that function:

Add image to media gallery

  • access: public
void addImageToMediaGallery (string $file, [string|array $mediaAttribute = null], [boolean $move = false], [boolean $exclude = true])
  • string $file: file path of image in file system
  • string|array $mediaAttribute: code of attribute with type ‘media_image’, leave blank if image should be only in gallery
  • boolean $move: if true, it will move source file
  • boolean $exclude: mark image as disabled in product page view

It’s better than most documentation in Magento (and by that, I mean that it actually exists). It’s still not very good. A few shortcomings are:

  • I don’t need a sentence to tell me that a function named addImageToMediaGallery adds an image to the media gallery of the current product. What I need to know is what the media gallery is (a link to a documentation page about the media gallery could be useful), how I can list all images or remove an image, and so on. Some of these can be guessed (such as getMediaGalleryImages, which I suspect lists all images added to the media gallery so far, or does it?) but others cannot.
  • Telling me that the first argument is a file path is not very helpful, because it does not tell me what a relative path will be relative to. My initial guess would tell me that Magento will use the current file path to resolve relative file paths (just like any other file-using function does), but I would be wrong: developer forum posts tend to indicate that the path is relative to the media/import path within the Magento install instead.
  • Besides, this also does not tell me what happens when I name a file that does not exist. I can guess that an exception is thrown (but I don’t know which exception, so I cannot programmatically handle it) or that the script just dies.
  • Now, media attributes? I don’t know what a media attribute is (or, more precisely, what is the list of usable media attributes). Can I use any value returned by getMediaAttributes and expect it to work? How do I add or remove media attributes, access them from a template, or simply know what media attributes are associated with a default product? Again, a quick link to a concept page, such as “What are media attributes?” would be helpful.
  • Even assuming that I provide a media attribute argument, what will happen? I can guess the media attribute will be bound to the selected image, but is that all or will something else happen? Can I bind several images to the same attribute? What happens if I unbind an image, does it get deleted? How can I unbind images anyway?
  • Moving the image. What happens if it cannot be moved (for instance, no write-access on the original file)? Where is it moved to? Is the operation failsafe (so that if an exception is thrown after the image is added but before the product is saved, the image is not moved) or does it happen right away and lets me deal with keeping things transactional?
  • Why is the image only disabled in the product page view? Does that mean in other places, such as the catalog or search result views, the image is enabled?

When I’m working on code that I want to be safe and robust, these are only a few of the questions that I ask myself. And this happens with just about every single function I work with. If the documentation isn’t up to speed on those questions, I’m going to either lose time tracking down the answer on forums and/or in the source code, or press on without an answer and possibly deliver flawed code that doesn’t handle exceptional situations correctly.

As a contrast, consider .NET’s MSDN page for HttpWebRequest.GetResponse. First, it’s too long for me to post it here. Its first section is the equivalent of the entire above Magento documentation: method syntax, parameter types, return type, description of parameter functionality and return value. Then, there’s the section on exceptions, which describes everything that could possibly happen while executing that function and the reason for it happening. Then, there’s the section on remarks explaining various important details such as references to other parts of the library that could be helpful (such as cookie containers), other methods of the object that alter or depend on the calls to the method, words of caution about the use, and so on. And then, there’s a section with typical use examples explaining quickly how the method is expected to be used.

In a similar vein, the PHP documentation itself is also quite detailed. Consider, for instance, array_key_exists. Like MSDN, it contains syntax and basic description, failure case descriptions, general remarks and examples. It also includes user-provided comments about the function that may be useful to developers.

Writing Documentation

It’s not easy. It requires three skills that developers are not necessarily familiar with. First, good documentation is an act of communication: it must be concise yet explicit and detailed, complete and technical yet simple and clear. It must be written properly, in complete sentences that read easily, using simple words where it can and the precise technical term where it must. It must avoid redundant information (such as repeating the name of a function as-is).

Second, good documentation must deal with all the issues at hand. This is difficult because writing the documentation involves deep knowledge about the general design of the software and, in many cases, of the underlying implementation, so that the technical writer may not be aware that the reader will not know what a media gallery is or where more information can be found about it. Adding a search engine to the documentation can be helpful, but it will be of no use if nobody thought about writing a page on media galleries and their basic principles. Documenting code is explaining what is obvious, but thinking about what is obvious is difficult since so much of it seems just natural. In a similar vein, context-dependent behavior (such as relative paths depending on a certain absolute path) should be identified and be made explicit. On the other hand, there are also the parts that nobody thinks about, such as failures. One should always specify in the documentation when a function may fail, and what happens when it fails.

Third, good documentation should serve three different kinds of requests.

  1. I just saw this function in the code. How does it work? Why doesn’t it work? This is the kind of question that a maintenance programmer will ask himself, and information about behavior, invariants and failure cases is always useful.
  2. I need to do this. Will this function help me? This is what the developer wonders: he has a task, and is looking for a solution. Either the function does what he wants and includes a quick example (and tells him about all the pitfalls of doing it this way) or it references another page that does it.
  3. So, how does this entire thing work anyway? This is what the architect wonders. The detailed information, he needs not. The bigger picture, such as the main concepts, the general properties of a module or class, he craves.

Whenever you find yourself writing some documentation, always determine whether all three requests are satisfied accordingly. Include troubleshooting and debugging information, include how-to and tutorial information, and include bigger picture and general design information.
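As a sketch of what this can look like in practice, here is a small Objective Caml function whose comment states what relative paths are relative to and how the function fails. The function resolve_media_path and its base parameter are hypothetical names made up for the illustration; they are not part of Magento or of any library.

```ocaml
(** [resolve_media_path ~base path] returns the location that [path] names.

    - A relative [path] is resolved against [base], never against the
      current working directory of the process.
    - An absolute [path] is returned unchanged.
    - The file system is not consulted: the file may or may not exist.

    @raise Invalid_argument if [path] is the empty string.

    Example: with [~base:"/srv/media/import"], the path ["shirt.jpg"]
    resolves to ["/srv/media/import/shirt.jpg"] on Unix-like systems. *)
let resolve_media_path ~base path =
  if path = "" then invalid_arg "resolve_media_path: empty path"
  else if Filename.is_relative path then Filename.concat base path
  else path
```

The comment answers the maintenance question (how does it fail?), the developer question (will it do what I need?) and gives a quick example, which is about as much as a single function comment can carry.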

Quick MySQL Identifier Escaping

A quick code snippet, in Objective Caml, that illustrates escaping an identifier in MySQL. Identifiers are quoted between backticks (`) and backticks within identifiers are escaped by writing two adjacent backticks (``). Escaping an identifier means turning “Hello`World” into “`Hello``World`”.

This function turns an identifier into a quoted identifier, with a single memory allocation. It uses stack space that is linear in the number of backticks to escape within the identifier (which is at most 256, given traditional MySQL limits).

The code is in the public domain.

let string_of_name name =
  let rec escape c b i =
    try
      let next = String.index_from name i '`' in
      let str = escape (c+1) next (next+1) in
        String.blit name b str (b+c+1) (next-b+1) ;
        str
    with Not_found ->
      let size = String.length name + c + 2 in
      let str = String.create size in
        str.[0] <- '`' ;
        str.[size-1] <- '`' ;
        String.blit name b str (b+c+1) (String.length name - b) ;
        str
  in escape 0 0 0
;;
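
The function above relies on String.create and in-place string mutation, which current Objective Caml releases no longer offer. As a hedged alternative, here is a minimal sketch of the same escaping rule built on the standard Buffer module. The name quote_identifier is mine, and this version gives up the single-allocation property in exchange for simplicity.

```ocaml
(* Quote a MySQL identifier: wrap it in backticks and double every
   backtick found inside it. *)
let quote_identifier name =
  let buf = Buffer.create (String.length name + 2) in
  Buffer.add_char buf '`';
  String.iter
    (fun ch ->
      (* A backtick inside the identifier is written twice. *)
      if ch = '`' then Buffer.add_char buf '`';
      Buffer.add_char buf ch)
    name;
  Buffer.add_char buf '`';
  Buffer.contents buf
```

The buffer may grow once if the identifier contains backticks, and Buffer.contents copies the result, so this is the version to use when clarity matters more than allocation count.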

Lazy and Threads

A nice thing about pure functional programming is that it’s inherently thread-safe. Since your code never actually modifies anything, atomicity of operations ceases to be an issue. At the hardware level, every modification operation is performed by a single thread (since it happens while a value is created, but before the high-level semantics can manipulate it) so no error can arise from multiple threads. Every value is basically written by one thread then read by everyone else until it dies.

Except that isn’t always the case. I’ve recently stumbled upon a nasty interaction between Objective Caml threads and pure functional lazy evaluation.

Lazy evaluation is the process of delaying the evaluation of an expression until it’s needed (so that, if it’s never needed, no time is wasted evaluating it). This can be done in Objective Caml through the use of the “lazy” keyword:

let text = lazy (String.concat "," values)

This example delays a costly operation (concatenating a list of strings) until it’s required. Evaluating it requires applying a specific evaluation function:

print_string (Lazy.force text)

The first time a lazy value is forced, it is evaluated. Subsequent reads merely return the already evaluated value. While the semantics of this operation are fairly simple and involve no side-effects, its implementation does involve side-effects, which is why I was wary of whether such a mechanism could be used in a thread-safe manner.

Imagine that two threads are provided with an un-evaluated lazy value. Both threads evaluate it at the same time (either on different processors, or on the same processor with time-sharing). Can the evaluation process result in anything other than the evaluation returning the correct value in both threads? It can. There’s a situation where the evaluation can raise an “Undefined” exception (which is quite likely if the evaluation takes a long time) and a less likely situation where the evaluation can cause a segmentation fault or even constitute a security flaw by executing arbitrary code.

The Implementation

The code that handles lazy evaluation looks like this (this is a direct excerpt from the codebase of Objective Caml 3.11.0, any copyrights apply):

let force_lazy_block (blk : 'arg lazy_t) =
  let closure = (Obj.obj (Obj.field (Obj.repr blk) 0) : unit -> 'arg) in
  Obj.set_field (Obj.repr blk) 0 raise_undefined;
  try
    let result = closure () in
    Obj.set_field (Obj.repr blk) 0 (Obj.repr result);  (* do set_field BEFORE set_tag *)
    Obj.set_tag (Obj.repr blk) Obj.forward_tag;
    result
  with e ->
    Obj.set_field (Obj.repr blk) 0 (Obj.repr (fun () -> raise e));
    raise e
;;

Lazy.force inlines a piece of machine code that tests whether the lazy object is already evaluated (its tag is Obj.forward_tag) and, if it isn’t, calls force_lazy_block. The latter (shown above) extracts the evaluation closure, calls it to get the result, and turns the lazy expression into an evaluated object (setting its zeroth field to the result and its tag to Obj.forward_tag). If the expression raises an exception, it remains un-evaluated but turns into a function that throws the exception, so that side-effects are not repeated.

Two race conditions appear here.

First Race Condition

The first is caused when a thread forces the evaluation of a lazy value while another thread is already evaluating it. In order:

  • Thread A forces the evaluation.
  • Thread A sets field 0 of the lazy block to raise_undefined.
  • Thread A starts running closure ().
  • Context switch!
  • Thread B forces the evaluation.
  • Thread B runs closure (), which is now raise_undefined.
  • Thread B gets Lazy.Undefined.

This behavior can be readily reproduced with a short test:

let expr = lazy (Thread.delay 1.0)

let _ =
  let thread = Thread.create (fun () -> Lazy.force expr) () in
  Lazy.force expr; Thread.join thread

The evaluation of the lazy expression explicitly forces a context switch, which causes the second thread to die with the Undefined exception. The same would happen if the runtime forces a context switch during the evaluation of the closure (which is usually a long computation that might warrant it).
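
The same mechanism can be observed without any thread at all. In the sketch below (the names v and outcome are mine), a suspension forces itself, so the inner force runs while the outer evaluation is still in progress, which is the position thread B is in when it arrives in the middle of thread A’s evaluation.

```ocaml
(* A suspension that forces itself: the inner force happens while the
   outer evaluation is still running. *)
let rec v : int Lazy.t = lazy (Lazy.force v + 1)

let outcome =
  match Lazy.force v with
  | n -> Printf.sprintf "value %d" n
  | exception Lazy.Undefined -> "Lazy.Undefined"

let () = print_endline outcome
```

The standard library documents Lazy.Undefined as the exception raised when a suspension is forced while it is already being forced, so the program reports the exception rather than a value.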

Second Race Condition

This one is harder to illustrate, because it requires a context switch in the middle of a standard library function. It happens if a context switch occurs between the instruction that sets the zeroth field to the evaluated result and the instruction that sets the tag to Obj.forward_tag. Such a switch leaves behind a block that still looks like a normal un-evaluated lazy block, but has an arbitrary value as its zeroth field. If another thread attempts to force the value, the runtime will treat that arbitrary value as a closure and execute it. In most situations this means the death of the program, or perhaps the execution of arbitrary intruder-chosen code if the lazy value is chosen appropriately.
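
Since the real window cannot be hit on demand, here is a toy model of it. This is only a sketch: the types tag, field and block are simplified stand-ins I made up, not the runtime’s actual representation, and the "reader" is an ordinary function rather than another thread.

```ocaml
(* A toy model of a lazy block: a tag plus one field. *)
type tag = Lazy_tag | Forward_tag
type 'a field = Closure of (unit -> 'a) | Value of 'a
type 'a block = { mutable tag : tag; mutable field : 'a field }

(* A reader trusts the tag to tell it what the field contains. *)
let read b =
  match b.tag, b.field with
  | Forward_tag, Value v -> Ok v
  | Lazy_tag, Closure f -> Ok (f ())
  | Lazy_tag, Value _ -> Error "tag says closure, field holds a value"
  | Forward_tag, Closure _ -> Error "tag says value, field holds a closure"

let b = { tag = Lazy_tag; field = Closure (fun () -> 42) }

(* Writer, step 1 of 2: store the result in the field. *)
let () = b.field <- Value 42

(* A reader scheduled at this point sees the inconsistent state. *)
let during = read b

(* Writer, step 2 of 2: update the tag. *)
let () = b.tag <- Forward_tag
let after = read b

let show = function Ok v -> string_of_int v | Error msg -> "error: " ^ msg
let () = print_endline (show during); print_endline (show after)
```

In the model the mismatch is caught by the type system and reported as an error. The real runtime has no such check between the two writes, which is why the consequence there is a jump to whatever the field happens to contain.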

Google Disambiguation

Right now, every name on the web can be owned by only a handful of entities. Sure, you can play around with registering only one of “.com”, “.net”, “.org” (and “.fr”, and “.biz”, and so on) and letting other entities have the other domain names. But seriously, what large company today would not consider registering all of the top-level domains that it can get away with, along with national top-level domains?

So, what will happen when there won’t be enough names anymore? The good news is that Wikipedia solved that problem already: when several things have the same name (as often happens in real life), Wikipedia provides a disambiguation page that allows choosing between the various options in a short and concise manner. How long until the leading web authorities incorporate a disambiguation scheme to let users choose between several options that all have the same domain name? It would certainly take some effort to get right (the entire DNS would probably have to be overhauled) but with Facebook getting control of the mail system and Stack Overflow eating up Usenet simultaneously it shouldn’t really be a problem. Would it?

Just Kidding

Ha ha, only serious.

HTTP-friendliness

When I use a website, I expect it to support the basic ideas of the HTTP protocol. Otherwise, I will not be able to use my browser to the full extent of its “back”, “forward”, “refresh” and “bookmark” capabilities. This means:

  • GET requests return data without any important side-effects, so that I can refresh or bookmark that page and still get the same results without any risk of repeating an operation or getting some entirely different piece of data.

A GET request is named GET for a reason. If it was meant to post data to a website, it would have been called POST. No, wait, there’s also a POST method for that. Once you see a GET request as a read-only access to a named resource (with the URI being the identifier) you get the full power of HTTP support from browsers. When I visit a certain URL as GET, I expect it to show me the resource corresponding to that URL or a newer version of that resource or a redirection to a page explaining why I cannot access it and how I can (if I can) get to access it.

  • POST requests return a page that’s a summary of what they did, and should make it clear that refreshing the page will repeat that operation.

No displaying of a GET-like page that gives the illusion of being a read-only request. In fact, don’t even give your users the impression that a POST request is somehow intended for reading data from your server. A POST should be dedicated entirely to posting data, with the courtesy response intended to provide concise information about how it went and what to do next. If you don’t have anything interesting to display that is directly related to the operation you just did, redirect to a GET.

  • HTTP has no state. Sure, you can get some state by using cookies (client-side) combined with sessions (server-side). But keep in mind that a user may have several windows opened on your website, and there may be long durations (or even computer changes) in-between visits.

A very common (and annoying) thing to do is use sessions to add state where state should not be present, especially when AJAX can take care of most persistence needs you could come up with in a client-side fashion. What I usually accept as state is authentication information (because this is the elementary point of having a session in the first place) and information about what the user has recently done (such as displaying back error messages on a form, or a success message, on the GET page you end up on after visiting a POST). The good news is that most developers have been bitten often enough to avoid placing important information in the session, precisely because they’ve been hit hard when the session went away.

My ideal website provides me with a list of resources I can read, and each resource is identified by a URL that I can send a GET to. A session mechanism may alter what I see when I visit a resource, for instance if I’m not authenticated or if I don’t have the necessary permissions. Then, if I am expected to interact with the contents of the website, I am provided with POST URLs that redirect me to a GET URL that makes sense. A login form would be a nice place for a POST (since it registers a session) while a search form would be a nice place for a GET (since a search page is a read-only resource that needs no server-side modifications). If I need some smart state-based behavior to occur, it should be provided to me as AJAX, with the implication that it won’t persist if I leave the page (the “you cannot save web pages” principle has been fairly well assimilated by users). Informational state, such as error messages, can be provided to pages by my session data.
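
The login and search cases can be sketched as a tiny routing function. Everything here is hypothetical (the routes, the response record, the handle function), and answering the POST with a 303 redirect is one common way of sending the browser on to a GET, not the only one.

```ocaml
type meth = GET | POST

type response = { status : int; location : string option; body : string }

(* Hypothetical routes: a read-only search page, an account page and a
   login form. *)
let handle meth path =
  match meth, path with
  | GET, "/search" ->
      (* Read-only: safe to refresh and to bookmark. *)
      { status = 200; location = None; body = "search results" }
  | GET, "/account" ->
      { status = 200; location = None; body = "your account" }
  | POST, "/login" ->
      (* The side effect (opening a session) would happen here. The reply
         is a redirect, so that "refresh" repeats a GET and not the POST. *)
      { status = 303; location = Some "/account"; body = "" }
  | POST, _ -> { status = 405; location = None; body = "method not allowed" }
  | GET, _ -> { status = 404; location = None; body = "not found" }
```

A real handler would also read the session to decide what the GET pages show, but the shape stays the same: GET branches only read, and the POST branch ends in a redirect.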

Cooking!

This post has absolutely nothing to do with computers. Besides, my cooking English certainly isn’t as well-trained as my technical English, so you might want to steer clear.

Either way, I’ve uploaded a quick chicken-and-cheese recipe I invented. You can find it here: Mozzarella Schnitzel.

Hacking Magento

My evil hacker side is rampaging through the virtual countryside again. This time, I’m scanning Magento for exploits and vulnerabilities.

If you like what you see here, or if you’re interested in more details about Magento, the web or the business of earning money online, make sure you subscribe to my RSS feed to keep up with the latest articles on the topic.

Anyway, let’s start with the easy stuff.

Eval

Once I download the code, the first step is to look for classic bug-prone functions. One example is the ‘eval’ function, which executes an arbitrary string as PHP code. Were such a function present in the codebase, I could look for ways of subverting the input string so that I can insert my own code in there and take control of the server.

A quick search of the code yields only two uses of ‘eval’, both of them in a google cart function that was deprecated because it was using ‘eval’:

if($value == "true" || $value == "false")
  eval('$this->'.$string.'="'.$value.'";');
else
  eval('$this->'.$string.'="'.$default.'";');

I scan for uses of that function (just in case someone ignored the deprecation) and get no results. Well, that particular exploit won’t be available here.

Exec

Another way is the classic family of shell execution functions: ‘exec’, ‘shell_exec’ and ‘passthru’, as well as the backtick operator that I’ve never actually seen used anywhere. These functions take a string argument and run it as a command on the server. Of course, this requires that the server is not secure and allows arbitrary execution of commands, but at least one server on the internet is bound to have this safety issue and run Magento.

So, if I could then corrupt the arguments to that call, I could have the server run what I want (usually, it would be downloading a PHP file from my own evil server and running that file with a direct query).

The basic ‘exec’ comes up as part of PEAR, mostly with constant string arguments, so no cookie there.

As for ‘shell_exec’, it comes up in Zend for the console adapter (that no sane person would use on the web), also with constant string arguments.

Finally, ‘passthru’ does not come up anywhere.

So, there’s nothing this way either.

SQL Injection

If I can’t take control of the server directly, I could at least get into the site admin, for instance by extracting the admin password from the database (or inserting my own in its place, if it’s encrypted). With access to the back-end, I could upload evil PHP files and get control anyway. So, I could try hammering the database with injection requests.

A quick search for “SELECT … FROM” yields no interesting results (all of them are within Zend, and I’m not going to look for exploits within Zend today). This means that Magento is using Zend for handling requests (by use of Zend_Db_Table and the related functions) in order to reduce the probability of SQL injection. So far so good, but even Zend doesn’t eliminate the risk of SQL injection completely.

For instance, Zend relies on providing variables as arguments to its functions so that it can escape them itself. So, one would do (to build the ‘where’ part of a query):

$select -> where('parent_id > 0 AND user_id = ?', $userId);

But looking at a Magento file (one that’s part of the external API, and handles the users to the API) I find instead:

$select -> where("parent_id > 0 AND user_id = {$userId}");

This code inserts the text value of $userId directly into the request without any escaping or even checking, which makes it a possible vulnerability against SQL injection. This is getting exciting: can I alter $userId to get the request to do nasty things? Nope. Even though the SQL statement itself is risky, the variable is protected:

if (is_numeric($user)) {
  $userId = $user;
} else if ($user instanceof Mage_Core_Model_Abstract) {
  $userId = $user->getUserId();
} else {
  return null;
}

There are around 90 occurrences of a “where” clause that contains an interpolated string within Magento. Every one of them is a potential security issue. All of them seem to be secured by argument verification, though.
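
The "verify the argument, then interpolate" pattern can be sketched in Objective Caml as follows. This is an illustration of the idea only, not Magento’s or Zend’s code; the function where_user_id is a name I made up.

```ocaml
(* Returns [None] unless [raw] parses as an integer. The clause is built
   from the parsed integer, never from the raw text. *)
let where_user_id raw =
  match int_of_string_opt raw with
  | Some id -> Some (Printf.sprintf "parent_id > 0 AND user_id = %d" id)
  | None -> None
```

The point of printing the parsed integer rather than the original string is that whatever the attacker typed never reaches the SQL text: either it was a number and only the number is written, or no clause is produced at all.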

Password Retrieval

Another way of gaining access to the administration panel is simply by getting the password. Joomla! had a vulnerability in this area not so long ago, for instance. Magento uses a fairly straightforward controller dispatch scheme, meaning that the “/admin/index/forgotpassword/” URL maps to function “forgotpasswordAction()” in the file “AdminHtml/controllers/IndexController.php”.

Peeking at the code for that function, I soon notice there’s no way I can get through. Unlike in Joomla!, the password is not set by the user, but rather re-generated by the server and sent back to the user. I can’t even insert my own email to receive the password: sending happens using a specific function that uses the user’s mail.

$user->sendNewPasswordEmail();

Another technique would be to somehow predict what password was generated by the server and plug it back in to connect. The password is generated as such:

$pass = substr(md5(uniqid(rand(), true)), 0, 6);

Now, that’s quite interesting. The server first generates an md5 hash: the characters inside the hash are fully random and unpredictable (unless I can somehow identify the initial state of uniqid and rand when I performed the re-generation, but that was designed to be impossible). Then, it selects the first 6 characters of the hash and uses them as the password. This means that the password contains six hexadecimal digits: there are roughly 16 million possible passwords there (16^6), which is far weaker than a 6-character alphanumeric password (62^6, about 57 billion possible passwords) and ridiculous when compared with an 8-character password containing letters, digits and punctuation (95^8, about 6.6 million billion possible passwords).

Of course, this is nothing groundbreaking: 16 million possible passwords is plenty to be safe, especially since they’re randomly chosen and therefore impossible to guess without full brute-force. Besides, to do it, you would need to know the administrator username and email (which can be obtained through a minimal amount of social engineering).

Either way, an improved password-generation method would be to use base64_encode to draw passwords from the 64-symbol base64 alphabet (letters, digits, “+” and “/”) instead of just hex digits like the above:

$pass = substr(base64_encode(md5(uniqid(rand(), true), true)), 0, 6);

This raises the number of possible passwords to about 68.7 billion (64^6), which is far harder to brute-force.
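
The sizes of these password spaces are plain powers (alphabet size raised to the password length), so they are easy to check. A small sketch, where the function count and the labels are mine, and where native integers are assumed to be 63 bits wide (a 64-bit platform):

```ocaml
(* Number of strings of [length] symbols over an alphabet of [alphabet]
   symbols. Assumes 63-bit native integers (a 64-bit platform). *)
let rec count ~alphabet ~length =
  if length = 0 then 1 else alphabet * count ~alphabet ~length:(length - 1)

let () =
  List.iter
    (fun (label, alphabet, length) ->
      Printf.printf "%-32s %d\n" label (count ~alphabet ~length))
    [ ("6 hex digits", 16, 6);
      ("6 letters or digits", 62, 6);
      ("6 base64 symbols", 64, 6);
      ("8 printable ASCII characters", 95, 8) ]
```

Going from hex to base64 multiplies the search space by 4096 for the same six characters, which is the whole benefit of the change suggested above.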

This doesn’t eliminate the annoyance of changing the password without a confirmation e-mail: as soon as you know the administrator’s mail, you can generate a new password as often as you want, and you can even do it fast enough to make the “read password from mail and write password in box” process too slow to use the latest password, or even have the mail-sending script burst (because it’s blacklisted for flooding, for instance) and leave the user with no password.

Persistent Data, Again

As a follow-up on yesterday’s discussion of Objective Caml as a web language, and to recall an earlier post about persistent data, I will discuss today an example of an Objective Caml interface for representing persistent data.

First, what’s persistent data? Its name implies that it persists (for instance, between program runs). In the web domain, persistent data is data that persists on the server in-between requests. This is equivalent to the former definition if every request runs a program, but distinct if a single program serves all requests. Even then, however, it’s interesting to have the data actually persist in order to resist crashes and maintenance operations that would take the server down, or simply to allow transferring the server from one host to another.

Almost all dynamic web sites use some form of persistent data. Most of them rely on a database (such as the MySQL part of LAMP) and the file system. The three most important concepts related to persistent data are:

  1. How robust it is. Caches tend to be persistent, but data inside can be destroyed at any moment because it’s known to be recomputable. Volatile information, such as session data, needs to survive in-between requests, but is only as safe as the user’s browser cookies, so it is acceptable to destroy it if the server shuts down. Persistent information is expected to stay around once it’s written, because it cannot be recomputed, but extreme situations wiping out some of this data can still be acceptable. Critical information, especially if it has legal importance, should not be lost regardless of what happens.
  2. How fresh it is. Sometimes, it’s extremely important for data to be fresh: a financial transaction shouldn’t be based on data from two transactions ago. Other times, freshness is less essential: a thirty-second or one-minute latency on friend status updates is less critical. This affects whether the system can use read-only mirrors instead of reading from the core storage of the system, whether locks have to be placed on certain tables and whether transactional isolation levels are high or low.
  3. Who can access it. Database systems provide mechanisms for restricting access to various parts of a data structure, but they are often not granular enough, nor are they actually used to implement permissions, for practical reasons (users are per database system, not per database, and you don’t want a program running on a single database to pollute the user space of the entire system). Some data can be read by anyone, some data can only be written by its owner or a super-administrator, some data can only be read by its creator and their friends. Most systems implement an authorization system on top of the database to handle this.

In addition to this, there’s a variety of optimization questions such as storing files on the filesystem or the database, what columns to index, and so on. These don’t affect the use of the data: the behavior is the same and only the performance changes.

How I want data to appear and be used in a program is the following:

Declaring some data structure, such as a read-only persistent string (the write operations are assumed to be performed inside the module that declares the structure) that is always fresh. I also assume that the module Persist was created from a functor that allows specifying the permission system to be used.

val admin_email : string Persist.value_readonly_fresh

Defining the data structure. This defines the actual type of the object (for instance, a read-write object instead of a read-only object), a unique identifier that helps match the persistent data with a value in the database, a serialization object that explains how the data should be serialized and also helps the type inference algorithm deduce the full type of the data, and authentication information for reads and writes. The creation is not a side-effect: it merely creates a new database from an old one, including the new piece of data.

let database, admin_email =
  Persist.value_readwrite_fresh
    ~where:  database
    ~what:   Serialize.string
    ~name:   "config/admin-email"
    ~read:   (fun me -> true) (* Anyone can read the admin email. *)
    ~write:  (fun me -> me = Auth.Admin) (* Only admin can change it. *)
    ()

Using the data for reading and writing the admin email. Accesses never have side-effects: instead, they manipulate a context object that represents the state of the database. When modifications are done, the context can be committed to the database.

let ensure_admin_mail me context mail =
  try
    (* Try to see if a value is present. *)
    let _ = admin_email # get me context in context
  with Persist.ValueNotSet ->
    (* Change returns the new context or raises access exception *)
    admin_email # set me context mail

Once the context has been modified, it is committed. The expected behavior is that all changes are applied to the database and become visible to any new contexts (the old contexts still see the old value). If any of the data that was read from the context as ‘fresh’ has changed since the context was created, then it means the data was not actually fresh, which results in an exception being thrown (the expected behavior, then, is to repeat the request). In practice, if locking rules are correct, freshness errors should never occur.

let () = Persist.commit database context
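
The freshness check at commit time can be sketched with a version counter. This is a deliberately coarse toy, not the Persist module: all the names below are hypothetical, and freshness is tracked for the whole database rather than per value read.

```ocaml
exception Stale_context

(* Toy stand-ins for the database and the context. *)
type database = { mutable version : int; mutable data : (string * string) list }
type context = { based_on : int; writes : (string * string) list }

(* A context remembers which version of the database it was created from. *)
let start db = { based_on = db.version; writes = [] }

(* Writing returns a new context and leaves the database untouched. *)
let set ctx key value = { ctx with writes = (key, value) :: ctx.writes }

(* Apply the writes only if nothing was committed since [ctx] was created. *)
let commit db ctx =
  if ctx.based_on <> db.version then raise Stale_context;
  db.data <- ctx.writes @ db.data;
  db.version <- db.version + 1
```

If two contexts are started from the same version, the first commit succeeds and the second one raises Stale_context, at which point the request would be repeated against a new context.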

Alternative persistent types would exist: sets (unordered lists of values with “insert”, “delete”, “exists”, “count” and “select all” queries) and maps (unordered key-value associations with “get”, “set”, “remove”, “count”, “find”, “find all” and “select all”).

All of this raises the question of how the database and context are transmitted. The database has to be transmitted at module initialization time so that the persistent values can appear as part of the module interface. This means that the module is in fact a functor. Something like this:

module type DATABASE =
sig
  val structure : Persist.database ref
end

module Admin :
  functor (Database : DATABASE) ->
sig
  val email : string Persist.value_readonly_fresh
  val ensure_mail : Auth.me -> Persist.context -> string -> Persist.context
end

Then, the context itself is passed around as argument-and-return-value, as usual in functional programming.
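
Here is a sketch of how instantiating such a functor and threading the context could look. The Auth and Persist modules below are tiny stand-ins written for this example (a context is assumed to be a plain list of pending writes), so the code only illustrates the plumbing, not a real persistence layer.

```ocaml
(* Stand-ins for the Persist and Auth modules imagined in this post. *)
module Auth = struct
  type me = Admin | Guest
end

module Persist = struct
  type database = (string, string) Hashtbl.t
  type context = (string * string) list   (* pending writes, newest first *)
  let commit (db : database) (ctx : context) =
    List.iter (fun (k, v) -> Hashtbl.replace db k v) (List.rev ctx)
end

module type DATABASE = sig
  val structure : Persist.database ref
end

module Admin (Database : DATABASE) = struct
  let key = "admin_email"

  (* Returns the context unchanged when a value is already present. *)
  let ensure_mail (me : Auth.me) (ctx : Persist.context) mail : Persist.context =
    if me <> Auth.Admin then failwith "access denied"
    else if Hashtbl.mem !Database.structure key || List.mem_assoc key ctx then ctx
    else (key, mail) :: ctx
end

(* Instantiate the functor once the database is known... *)
module Db = struct
  let structure : Persist.database ref = ref (Hashtbl.create 8)
end

module A = Admin (Db)

(* ...then thread the context through each call and commit at the end. *)
let () =
  let ctx = [] in
  let ctx = A.ensure_mail Auth.Admin ctx "admin@example.com" in
  let ctx = A.ensure_mail Auth.Admin ctx "other@example.com" in
  Persist.commit !Db.structure ctx
```

The second call leaves the context untouched because the first one already queued a value, so only the first address reaches the database.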

OCaml Web Sites

Objective Caml’s use as a web programming language is very limited. It’s orders of magnitude smaller than that of mastodons such as PHP, Java and ASP.NET, and also considerably smaller than that of Perl, Ruby and classic CGI solutions.

Why?

Hosting Availability

I’m only discussing the cheap hosting options provided to the vulgum pecus here. Since people at hosting companies don’t see Objective Caml as a credible web language, they are not going to spend precious time providing support for it on anything but user-maintained dedicated servers. Arguably, this affects other languages as well. Looking at my host’s prices:

  • PHP is available around €2/month
  • Perl/CGI/Python start around €4/month
  • Anything else needs at least €12/month

Sure, I can look for .NET hosting elsewhere (starts at €4/month) or a Java-Tomcat combination. But as of now there are no “Objective Caml is installed and you can just upload your software” services. The only commonly available option is CGI, but it requires compiling the program to native code (since no runtime is installed on the host), which takes a lot of knowledge that’s just not needed for PHP.

This means your average web beginner who wants a website will not have access to much beyond PHP.

Beginner-Friendliness

In this regard, PHP is the easiest by far (along with Perl or Python, but these two have fewer highly targeted tutorials available online). You don’t have to launch an IDE to manipulate your code, you don’t have to run a compiler, you just open a file in your text editor, upload it to the server, and visit the URL again.

This method has massive downsides. First, it’s not practical for any kind of industrial development (you would have to move to an IDE, use source control, and set up a deployment strategy beyond just uploading the files). Second, the absence of a compilation step means a lot of errors can slip through and only manifest when a certain set of conditions occurs, even when it’s a stupid error that a classic type system would have caught.

But it’s practical. The cost of starting to work with PHP, or of hacking through a small bit of existing PHP code, is orders of magnitude smaller than with other languages, including Objective Caml.

Web Frameworks

Ruby has Rails, PHP has the Zend Framework and CakePHP, C# comes with ASP.NET, and Java has (among others) JSP and JSF. What does Objective Caml come with?

  • There’s no universal mapping system that would map Objective Caml values to a database. You have to work with database-specific tools: PG’OCaml (PostgreSQL), OCaml-Mysql, and so on. Arguably, that’s not so bad.
  • There’s no universal HTTP handling. On the one hand, there’s OCaml HTTP, which runs a daemon (can be nasty on non-dedicated hosting) and is GPL software. On the other hand, there’s mod_caml for Apache, which benefits from existing Apache installs but runs every program as a CGI script with no persistence outside of the database. And then, there’s godi-ocaml-http.
  • A lot of the additional functionality provided by typical web frameworks is missing. Take a look at the Zend Framework or Rails to see the amount of web-oriented functionality some web frameworks manage to cram in.

In short, there’s not a lot to see here.

What If?

What would happen if a compact web framework were proposed? One that, in addition to borrowing useful concepts from other languages, also added some OCaml-specific features to the mix. Functional modules (functors) would be an interesting addition, as would the type system and pure functional programming applied to transactions. Monadic optimization at initialization time would also be quite interesting.

The problem of convincing young web beginners to use the language would still exist, but at least professionals could look upon it with a little bit more seriousness. Especially if some standalone high-quality pieces of web software written in Objective Caml hit the market.

So, where do we begin?

PHP Dynamic Method Definitions

I recently stumbled across a fairly old post about using mixins in PHP. It observes that a member function can statically call a member function of another class, and that the called function then runs with the $this variable of the caller. Like so:

class Alpha {
  function called() { echo $this -> _beta; }
}

class Beta {
  var $_beta = "Hello!";
  function caller() { Alpha::called(); }
}

$beta = new Beta();
$beta -> caller();

This example displays “Hello!” by using the $beta object as the $this variable in Alpha::called. The author of the post then wants to achieve something similar to this:

class Beta {
  var $_beta;
}

$beta = new Beta();
mixin($beta, 'Alpha');
$beta -> called();

The mixin call would insert all functions from class Alpha into class Beta, thereby allowing a direct call to called despite it never being defined in Beta. If you will, the mixin behaves like importing all the functions from a class into another. And unlike inheritance, it works dynamically.

Except that there’s no way in PHP (as of 5.2, at least) to add a function to an object. The details of why it’s impossible are complex (and can be summarized as “whatever your approach, there will be a problem that cannot be worked around”). Therefore, adding functions to objects is mostly a matter of choosing the approach that has the most acceptable problem.

My choice is to have classes extend a certain base class before they can receive functions dynamically. This class looks mostly like this:

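class Dynamic
{
  // Maps a lowercase function name to the object or class name it comes from.
  var $funcs = array();

  function __call($func, $args)
  {
    $func = strtolower($func);
    $assoc = isset($this -> funcs[$func]) ? $this -> funcs[$func] : null;
    // Imported from an object: run the function on that object.
    if (is_object($assoc))
      return call_user_func_array(array($assoc,$func), $args);
    // Imported from a class (or unknown): call it statically, which in
    // PHP 5 runs the function with our own $this.
    if (!isset($assoc)) $assoc = get_class($this);
    $argarr = array();
    $keys = array_keys($args);
    foreach ($keys as $id => $key)
      $argarr[] = '$args[$keys['.$id.']]';
    $argstr = implode(",", $argarr);
    return eval("return $assoc::$func($argstr);");
  }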

  function import($arg1, $arg2=null)
  {
    assert (is_object($arg1) || class_exists($arg1));
    if (isset($arg2))
      $this -> funcs[strtolower($arg2)] = $arg1;
    else
      foreach (get_class_methods($arg1) as $method)
        $this -> funcs[strtolower($method)] = $arg1;
  }
}

Any errors are left to the reader as an exercise ;) This code allows you to add functions from other objects, and functions from other classes, to your class:

class Alpha
{
  var $_value = "Alpha";
  function show() { echo $this -> _value; }
}

class Beta extends Dynamic
{
  var $_value = "Beta";
}

$alpha = new Alpha();
$beta = new Beta();

$beta -> import('Alpha'); // or ('Alpha', 'show')
$beta -> show();          // "Beta"

$beta -> import($alpha); // or ($alpha, 'show')
$beta -> show();         // "Alpha"

If you import an object, then all its functions are installed into the dynamic object, and calling these functions actually calls the functions on the imported object. Similarly, if you import a class, then all its functions are installed into the dynamic object, but calling them will run them on the dynamic object. And you may add a second argument to only load a single function from the object or class, instead of loading them all.

Two important things to notice here:

  • You can’t override __call in classes that extend Dynamic. Unless, of course, you forward the call to the parent class if you cannot handle it.
  • Functions defined directly in the class, as well as those handled by its __call function (if any), are never overridden by importing. Importing can only add functions that don’t exist yet, or replace functions that were themselves imported from somewhere else.

Enjoy! I hereby place all this code in the public domain.


