Tag Archive for 'PHP'

Frameworks, Libraries, Conventions

Funkatron came up with the MicroPHP Manifesto :

I am a PHP developer

  • I am not a Zend Framework or Symfony or CakePHP developer
  • I think PHP is complicated enough

I like building small things

  • I like building small things with simple purposes
  • I like to make things that solve problems
  • I like building small things that work together to solve larger problems

I want less code, not more

  • I want to write less code, not more
  • I want to manage less code, not more
  • I want to support less code, not more
  • I need to justify every piece of code I add to a project

I like simple, readable code

  • I want to write code that is easily understood
  • I want code that is easily verifiable

Unsurprisingly, a large swath of the community did not take it well, for reasons similar to the reaction to my earlier piece against Zend Framework: deviation from the commonly accepted norm.

I have come a long way since I wrote that article, and I must have been walking in circles, because I actually ended up where I originally began : why do we call these things frameworks ?

Zend, Symfony, CakePHP — as well as Node.js, Rails, Django, Ocsigen … — actually contribute three different things to projects that use them.

Libraries

A library provides functionality used for solving general problems in a flexible, standalone manner. Zend_Mail is a classic example of the library aspect of Zend Framework: you can plug it into your application and start sending e-mail. The interface you would use is uncluttered by details that are not directly related to sending e-mail.

The core qualities of a library are its power (how many different aspects of a problem does it let me solve — attachments, rich text, bouncing, MIME handling…) and the clarity of its interface. What problems can you solve, and how fast can you solve them?

Conventions

When you hear «conventions» you immediately think of opening brace positions and variable naming rules. It’s about more than that.

The Model-View-Controller separation is an example of convention: it has been decided that under no circumstances should HTML rendering occur in Model code, no HTTP or session handling should happen in View code, and no SQL queries happen in Controller code.

Good conventions are designed to let the developers assume interesting properties about the code without having to actually read it. A convention like «no global variables» means I never have to care about global state in my code, ever. A convention like «view code must respect the law of Demeter» means all the data used by the view is right where it is being initialized.

They are also designed to make reuse and interoperability easier by reducing the number of ways in which a possible interface can be implemented. A convention could say the values are passed by assigning them to members post-construction and not as constructor arguments, so you have one less point of contention between the object that is initialized and the object that does the initialization.

Last but not least, conventions are usually based on experience of things that could go wrong if certain behavior is allowed. A typical example is the requirement to escape all strings as they are being output — eliminating any ambiguities as to whether the string has already been escaped elsewhere and should be output as-is: it has not.

Zend comes with a variety of useful conventions enforced through the interface of its tools: this is how you use a view, this is how you define a view helper that should be available from within any view, this is how you bind a piece of code to a URL, and so on. I happen to disagree with many of those conventions myself, because I believe they solve the wrong problems, but they are certainly better than a project with no conventions.

For reference, my PHP conventions are described in the user manual for Ohm.

Framework

A framework goes a step further than mere conventions: it provides super-conventions designed to be respected by plugin authors. The point is that if plugin A and plugin B respect the set of conventions provided by the framework, then they can be used together in the same application.

Consider a practical example : a plugin that implements a CAPTCHA field in a form, and a plugin that displays and submits a form through AJAX. On a bad day, it goes like this :

  1. When an error occurs, the server-side AJAX-form plugin sends out a small piece of JSON containing the fields that have errors, along with the error messages. A small client-side script applies these.
  2. However, the CAPTCHA plugin expected the image to be reloaded when an error occurs.  It may either keep the same image and target word — defeating the purpose of a CAPTCHA — or change the target word without knowing that the image could not be changed.
  3. You then need to post on StackOverflow hoping for a solution, search online for a patch to either plugin that could make it work as expected, or read the code of either plugin in order to create the patch yourself.

Had the framework provided a clean notion of « this field must be refreshed on every attempt » as part of its form interface, both plugins would have used it: the CAPTCHA plugin would have marked its field as such, and the AJAX plugin would have implemented a special case for such fields.
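As a rough sketch of what such a shared notion could look like, consider the following. The `FormField` interface, the `mustRefreshOnEveryAttempt()` method and both field classes are hypothetical names invented for illustration; they do not come from Zend, Symfony or any other real framework.

```php
<?php
// Hypothetical framework vocabulary: every form field can say whether
// it must be refreshed after each submission attempt.
interface FormField
{
    public function name();
    public function mustRefreshOnEveryAttempt();
}

// An ordinary field keeps its state between attempts.
class TextField implements FormField
{
    private $name;
    public function __construct($name) { $this->name = $name; }
    public function name() { return $this->name; }
    public function mustRefreshOnEveryAttempt() { return false; }
}

// The CAPTCHA plugin only has to mark its field as "refresh me".
class CaptchaField implements FormField
{
    public function name() { return 'captcha'; }
    public function mustRefreshOnEveryAttempt() { return true; }
}

// The AJAX plugin knows nothing about CAPTCHAs: it simply lists every
// field that asked to be refreshed, next to the error messages.
function ajaxErrorResponse(array $fields, array $errors)
{
    $refresh = array();
    foreach ($fields as $field) {
        if ($field->mustRefreshOnEveryAttempt()) {
            $refresh[] = $field->name();
        }
    }
    return json_encode(array('errors' => $errors, 'refresh' => $refresh));
}

$fields = array(new TextField('email'), new CaptchaField());
$json   = ajaxErrorResponse($fields, array('email' => 'Invalid address'));
echo $json, "\n";
```

Neither plugin refers to the other; they only agree on the one method that the (assumed) framework defines.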

As such, the purpose of a framework is to provide a clean, unambiguous and extensive vocabulary that all the plugins should be able to speak, and that is designed to cover as many real-world situations as possible.

Zend Framework and Symfony in particular do an absolutely great job of this. When you can have a pager component push its data to the page through a progressive enhancement component, and log its performance to FirePHP when a user authentication component determines that the viewing user is a developer, and all of it works by plugging square pegs into square holes, you know there has been a lot of great work going on under the hood.

Back to the point

Using a framework is all fun and games until you need to disagree with it. You need to unplug what does not work, and plug your own implementation in its place. The more complex the vocabulary, the harder it will be to write new code: frameworks make it easy to connect existing components, at the cost of having to deal with more concepts when actually implementing new things.

What it boils down to, in the end, is whether you expect to be reusing a lot of third party components, or to write a lot of your own code. In the latter case, MicroPHP — and lean environments that do not have a heavy framework side to them — is actually an improvement over trying to fit a six-inch wooden square peg into a mini-USB port.

The exception to this is, of course, being so familiar with a particular framework that you immediately know what changes you need to make without fighting against third party code.

On Escaping HTML

A common issue with web software is cross-site scripting attacks — the ability for a third party to inject HTML elements into pages displayed to other users, using scripts contained in those elements to capture user cookies or perform operations on their behalf.

The technical challenge in solving this is that whenever data is output through an HTML page, it should be escaped: any special HTML characters should be turned into their non-special versions in order to be displayed verbatim. This is an ongoing effort: each new page and each new variable on a page requires the same amount of work.

Of course, the solution would be to decide that escaping string output should be a default behavior that must be explicitly overridden. This does create issues where HTML is escaped when it should not have been, but:

  • These issues cannot be used to perform attacks.
  • They are usually easier to reproduce and consequently to solve.
  • HTML usually comes from template files, which can be handled with a different default.

Indeed, I can guarantee that my software has zero vulnerabilities related to unescaped HTML, because I have built into the type system the fact that HTML always comes from templates, and the method that injects variables into templates escapes them. If I try to use a string as if it were HTML, I get a compiler error.

Even without a type system, one can guarantee that the system would rather break at runtime than allow an injection, using the exact same design, with incompatible data structures for templates and strings that blow up when a string is used as a template:

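// HTML that has already been rendered from a template.
class FilledTemplate {
  private $_html;
  function __construct($html) {
    $this->_html = $html;
  }
  function html() {
    return $this->_html;
  }
}

// HTML read from a template file, with {key} placeholders.
class Template {
  private $_template;
  function __construct($file) {
    $this->_template = file_get_contents($file);
  }
  function fill($values) {
    $replace = array();
    $with    = array();
    foreach ($values as $key => $value) {
      $replace[] = '{'.$key.'}';
      if ($value instanceof FilledTemplate) {
        // Already-rendered HTML is inserted as-is.
        $with[] = $value->html();
      } else {
        // Anything else is treated as text and escaped, quotes included.
        $with[] = htmlspecialchars($value, ENT_QUOTES, 'UTF-8');
      }
    }
    return new FilledTemplate(
      str_replace($replace,$with,$this->_template)
    );
  }
}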
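To see the escaping at work, here is a self-contained sketch. It uses two hypothetical classes, `SafeHtml` and `InlineTemplate`, which mirror `FilledTemplate` and `Template` above except that the template text is passed in directly instead of being read from a file. The template strings and values are made up for the example.

```php
<?php
// Hypothetical stand-in for FilledTemplate: HTML that is already rendered.
class SafeHtml
{
    private $html;
    public function __construct($html) { $this->html = $html; }
    public function html() { return $this->html; }
}

// Hypothetical stand-in for Template, taking the template text directly.
class InlineTemplate
{
    private $template;
    public function __construct($template) { $this->template = $template; }

    public function fill(array $values)
    {
        $replace = array();
        $with    = array();
        foreach ($values as $key => $value) {
            $replace[] = '{' . $key . '}';
            // Rendered HTML passes through; plain strings are escaped.
            $with[] = $value instanceof SafeHtml
                ? $value->html()
                : htmlspecialchars($value, ENT_QUOTES, 'UTF-8');
        }
        return new SafeHtml(str_replace($replace, $with, $this->template));
    }
}

$link = new InlineTemplate('<a href="/user">{name}</a>');
$page = new InlineTemplate('<p>{greeting} {link}</p>');

// A hostile user name is escaped when it enters the inner template...
$filledLink = $link->fill(array('name' => '<script>alert(1)</script>'));
// ...while the filled inner template is inserted into the page untouched.
$filledPage = $page->fill(array('greeting' => 'Tom & Jerry', 'link' => $filledLink));

echo $filledPage->html(), "\n";
// prints: <p>Tom &amp; Jerry <a href="/user">&lt;script&gt;alert(1)&lt;/script&gt;</a></p>
```

The only way to get markup into the page is through a template; every plain string, wherever it comes from, is displayed verbatim.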

Obviously, many languages and frameworks use non-escaped string output as the default behavior. This, in my opinion, is pure, broken insanity — I can certainly see that designing a safe way of constructing HTML is harder than just following the «HTML is strings, just use string functions» approach and telling the programmer to «always escape your variables, kid» but I still find it quite irresponsible for self-proclaimed Web Languages to rely on such a primitive and dangerous paradigm. The stupid kind of irresponsible. Yes, PHP, I’m looking at you.

Article Image © Freedom II Andres — Flickr

Ohm – Least Resistance

A few astute readers have noticed that there was a new section, titled Ohm, in the blog header above. They have written in private to ask me, and were shortly included in the private beta testing phase for the project. That phase has now come to an end, as I now reveal the project to begin its open beta testing phase.

Ohm – Least Resistance is a lightweight PHP 5.2 framework designed to be as simple as possible. You can use Ohm before your morning cup of coffee. If you need to do something clever or unusual, you can read the Ohm source code to find out how you can bend it to your needs: it’s only 2,000 lines, comments included. And Ohm conveys this simplicity to your own code, if you are willing to accept its philosophy:

1. Most frameworks can be extended because there are configuration options for every single piece of behavior. Ohm can be extended because it’s extremely simple.

2. Most frameworks help the programmer write code faster, often pushing the dynamic nature of PHP beyond its safe boundaries. Ohm recognizes that reading code is harder than writing it, and helps the programmer write code that is easy to read later on.

3. Most frameworks try to become repositories for countless pieces of useful but highly specific functionality. Ohm concentrates on being an HTTP/PHP/MySQL framework.

The Ohm framework is open source. In fact, its code is in the public domain, though I appreciate references and/or links back to the project page. What you get by downloading the framework:

  • Class auto-loader : finds your classes based on their name, so you don’t have to use require and include all over the place.
  • Request dispatcher : responds to HTTP requests by loading the appropriate action class and executing its response method.
  • Model-View-Controller : the framework layout follows the MVC pattern and helps your application do so as well.
  • Reusable Layouts : lets you define generic page templates that will be wrapped around the actual template. You can insert JS/CSS into the layout from the content.
  • Simple Forms : a backbone module for creating, processing and drawing HTML forms that you can extend to fit your needs.
  • Database Layer : a dead-simple extension on top of mysqli that lets you have an easier time writing SQL requests and retrieving the data, without having to learn a new query language.

If you have any questions about the framework, you can contact me directly (victor-ohm@nicollet.net). Any comments and feedback are welcome (this is an open beta, after all), either by mail or as comments on this post.

Enjoy!

Using JavaScript and PHP for Web Development

Today, I bring you a guest post from James Mowery.

When you want a website to do more than just display the same text to every user, you’re moving into the realm of things that will require the use of a programming language. Two of the most popular web development languages are PHP and JavaScript. They can often accomplish the same basic functions, though they’re designed to do it in different ways.

The most critical difference to consider is where the code will be executing. PHP is handled by the server, while JavaScript is processed by the user’s browser. This may initially seem like a minor implementation detail, but it actually has important implications.

The way that it executes means that JavaScript requires that your user be on a browser that supports JavaScript. It must also have this functionality turned on; some users turn off JavaScript support in their browser to improve their computer’s security. On the other hand, JavaScript can execute entirely on the user’s computer without having to contact the server again.

PHP, in contrast, has to contact the server every time a calculation or action needs to be performed. This means that communication between the user and the site will be much more frequent, more bandwidth will be used, and more requests will have to be processed. The need to wait for communication with the server also means that the time between getting input and giving a response is longer. On the other hand, PHP can achieve more sophisticated effects, including generating HTML and JavaScript as needed, and accessing databases.

Both JavaScript and PHP can be useful in setting up a website. Which one is more appropriate for a given situation will depend on the details of a project, and how responsive the servers running it are expected to be. These are issues that should be taken into consideration before any actual code is written.

About the author: James Mowery is a computer geek who writes about technology and related topics. If you wish to read more blog posts by him, he contributes regularly to a blog on laptop computers.

If you wish to contribute a guest post, you can send me an e-mail : victor@nicollet.net

Autoloading : be friendly to intruders

Back in the old days of PHP 4, every script started with a shopping list of include statements for other files.

PHP 5 brought along the __autoload function, and people were overjoyed. Since most programmers already had some kind of mental rule that said « class Foo is defined in Foo.php, » PHP let those programmers write down the rule and then followed it when looking for classes that had not been defined yet. A simple example would be:

function __autoload($classname)
{
  @include "$classname.php";
}

The classic PHP 5 architecture moved from « write a shopping list at the top of every script » to « include the file that defines __autoload » and even « redirect all requests to a single index.php file that defines __autoload and dispatches the requests. »

And the tutorialosphere went wild. People everywhere discovered the power of autoloading and expounded on the usage of __autoload as the next step in PHP evolution. A Bing search for __autoload (or google) will bring up many such one-page tutorials that discuss the benefits of that function for the sake of wide adoption.

But meanwhile, __autoload’s little sister spl_autoload_register remained unknown, despite a major difference:

If there must be multiple autoload functions, spl_autoload_register() allows for this. It effectively creates a queue of autoload functions, and runs through each of them in the order they are defined. By contrast, __autoload() may only be defined once.

With __autoload, your code breaks if you ever need to interact with code that uses its own autoloading approach. While you can usually turn __autoload into spl_autoload_register in a few key presses, you might not have sufficient control over the code to make that change.
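The queue behavior can be sketched as follows. The directory layout, the class names `Demo_Foo` and `Lib_Bar`, and the `$calls` log are all invented for the demonstration; the closures assume PHP 5.3 or later. Note also that `__autoload()` itself was deprecated in PHP 7.2 and removed in PHP 8.0, so on current versions `spl_autoload_register()` is the only option.

```php
<?php
// Set up two independent code bases in a temporary directory (demo only).
$base = sys_get_temp_dir() . '/autoload_demo_' . uniqid();
mkdir($base . '/app', 0777, true);
mkdir($base . '/lib', 0777, true);
file_put_contents($base . '/app/Demo_Foo.php', '<?php class Demo_Foo {}');
file_put_contents($base . '/lib/Lib_Bar.php', '<?php class Lib_Bar {}');

$calls = array(); // records which loader was asked for which class

// Application loader: "class X is defined in app/X.php".
spl_autoload_register(function ($class) use ($base, &$calls) {
    $calls[] = "app:$class";
    $file = "$base/app/$class.php";
    if (is_file($file)) {
        include $file;
    }
});

// Third-party loader, registered without knowing about the first one.
spl_autoload_register(function ($class) use ($base, &$calls) {
    $calls[] = "lib:$class";
    $file = "$base/lib/$class.php";
    if (is_file($file)) {
        include $file;
    }
});

$foo = new Demo_Foo(); // found by the first loader; the second is never asked
$bar = new Lib_Bar();  // the first loader declines, the second one finds it

print_r($calls);
```

Each loader only answers for the classes it knows about and silently declines the rest, which is what lets two code bases coexist.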

Case in point: Joomla! is a well-known content management system (often said to be the third of the Drupal-Wordpress-Joomla! triumvirate of PHP CMS solutions). Since version 1.5, it uses __autoload. It looks like this:

function __autoload($class)
{
  if(JLoader::load($class)) {
    return true;
  }
  return false;
}

If you need to make Joomla! and the Zend Framework talk to each other, you need to include Zend Framework files by hand because you can’t add Zend_Loader on top of __autoload.  While it would be possible to change Joomla! to use spl_autoload_register instead of __autoload, this change will probably be overwritten by the next update you download.

In short, if you write code that will be used by people who do not own it (in the sense that they can change it without annoying side-effects), you need to use spl_autoload_register() instead of __autoload().

In the case of Joomla!, a simple patch would be to remove the __autoload() function definition and replace it with:

spl_autoload_register(array('JLoader','load'));

(In fact, there has already been one such suggestion made there).

Related posts

  • PHP Autoloading : yes, I made that mistake once, too
  • Pervasive code : an unusual class-to-file mapping in JITBrain
  • Singletons : having a single autoloader carries the typical issues of one-instance-only entities

Why I Gave Up on the Zend Framework

The Zend Framework is a really nifty thing. Really, it is. The amount of functionality that you get merely by installing it is extremely exciting: internationalization, forms, an MVC layout for your program, a cute class loader, a database abstraction layer, a templating engine, a request dispatcher, mail-sending functions, pretty debugging “dump” functions… and there are so many people working on it and using it that basically all the bugs left in there are shallow. It has been a staple dependency of many of my projects for quite a while now, and still is.

Zend Framework is actually available for your projects in two flavors, «use what you need» and «obey the hive mind», with a continuous spectrum in-between these two extremes.

We are Zend. Resistance is futile.

The «use what you need» approach leaves the maintenance programmers with a warm and fuzzy feeling. All you have to do is dump all the framework files somewhere in your include path, include the files for the bits you want to use, and just call the functions. The framework takes care of recursively including the appropriate dependencies for you and carefully avoids treading on any toes by prefixing everything with «Zend».

In fact, if you use Zend_Loader, you can skip the include-source-files step completely (except for Zend/Loader.php obviously), and since auto-loading is backward-compatible with loading files manually, it’s also a good step towards a well-deserved refactoring.

So, if you need to send multi-part mail, with HTML-and-text content, in UTF-8 format, you can just use Zend_Mail and everything will work fine regardless of the rest of the code base. There are dozens of such small features (for PDF generation, LDAP, access control, localization, and so on).

There is virtually no excuse for not using a plug-in class from the Zend Framework in your application if it solves the problem you’re having. Besides, since the files are not included until you need them, the worst that could happen is that you’re having some PHP code taking up a few megabytes of disk storage for nothing. So I have a lib/Zend directory on all my projects, just in case I need something.

Obey the Hive Mind

While many pieces of Zend are independent of each other, there’s a central functionality core that’s designed to act well together. There are many examples:

  • it’s easier to use Zend_Dispatcher and Zend_Controller together.
  • it’s easier to render a Zend_View if you’re also using Zend_Controller.
  • it’s easier to turn a Zend_Form into HTML if you’re using Zend_View.
  • it’s easier to set up a “login already in use” validator with Zend_Form if you have a field in a Zend_Db_Table to connect it to.
  • it’s easier to translate Zend_Form error messages with Zend_Translate (and Zend_Registry).

Sure, it’s usually possible to take advantage of 99% of the functionality without having to add new dependencies, but there’s always that tiny voice in the back of your head, nagging that you could get that additional 1% so easily if you just gave in.

Giving in means, of course, going all the way to Bootstrap heaven: now your project is laid out across the lines of the ideal Zend Framework template, your files cleanly stashed in their folders with a cosmic Feng Shui feeling to it all, and the Zend approach to MVC pervades your every HTTP request.

This isn’t so bad: actually, such an approach has some huge selling points for shops that write lots of small projects, such as the ability to get 20% of your basic functionality up and running in days, the ability to hire any Zend-certified developer and not have to educate them about the framework, and you don’t need them lousy architects on your team.

I’ve had some trouble with the Zend way before, though. There are some bits of functionality that I won’t touch with a ten-foot pole, such as Zend_View, Zend_Controller or Zend_Db_Table, because the havoc they wreak in situations I find myself in outweighs the benefit.

Documentation

My main issue is that I find Zend quite lacking on the documentation side.

«But the Zend Framework is possibly the most documented there is!» you say, before trailing off in a rant about how the “FM” should be “R” and the “FW” should be “S”.

You’re probably right. But I don’t really care about that documentation. I’m talking about project documentation—to know what happens in code written by my team.

«What does Zend have to do with that? Document your code, you lazy slob!»

Humans are lazy, and I would argue that laziness is actually an essential quality of a good programmer. I can require that documentation be written, but I expect it to be missing, inaccurate or monosyllabic. Things like that happen when you’re rushing out a bug patch at 3:00 am. And even if I could ensure that documentation is written and kept up to date, I’d rather have my code be self-documented—not only does it take less time, but it’s harder to get inaccurate self-documentation and you can even get the language to check things for you.

It’s the difference between documenting the parameter type as a @param MyClass $obj in a comment and documenting it as a MyClass $obj type hint in the function signature.
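A minimal sketch of that difference, using a placeholder `MyClass` and two made-up functions. It assumes PHP 7 or later, where a failed type hint throws a `TypeError`; PHP 5 raised a catchable fatal error instead.

```php
<?php
class MyClass
{
    public function label() { return 'my object'; }
}

/**
 * Documented only: the comment promises a MyClass, nothing enforces it.
 * @param MyClass $obj
 */
function describeDocumented($obj)
{
    return 'Got ' . $obj->label();
}

// Self-documented: the signature states the type, and PHP checks it
// on every call.
function describeHinted(MyClass $obj)
{
    return 'Got ' . $obj->label();
}

$ok = describeHinted(new MyClass());

$rejected = false;
try {
    describeHinted('just a string'); // wrong type: stopped at the call site
} catch (TypeError $e) {
    $rejected = true;
}

var_dump($ok, $rejected);
```

With the documented-only version, the same mistake would surface later, inside the function body, as a far less helpful error about calling a method on a string.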

Look at the average .phtml template, and you’ll see something like this:

<div>
  ...
  <a href="<?php echo $this->getUrl() ?>"><?php
    echo $this->escape($this->user->name)
  ?></a>
  ...
  <?php echo $this->partial('preferences.phtml', $this->pref); ?>
  ...
</div>

Half the point of a view in the MVC approach is that I should be able to easily reuse that view from any controller, or even from within another view. Of course, Zend lets me do this very easily:

$view = new Zend_View();
$view->xxx = yyy; // Fill in members
$view->render('template.phtml');

The second line, the one that fills in members, is of course where trouble begins. Since Zend_View fields are by definition dynamic, there’s no way to get auto-completion to help you find what they should be. Nor can you look at a list of these fields in a class definition or function definition, because there’s none. You have to read the template file and find out by yourself what values are used by the template and what their types should be. Oh, and if the template passes some of that data to other templates, you have to read those templates too, because they might use specific information. And you have to look at view helpers too, because they might be accessing view elements behind your back.

Your best bet is to look at an existing controller that uses the view, and hope that you don’t stray too far from what that controller is doing. You never know: a certain member might be expected to be present if another has a certain value (this never happened with the first controller, but it happens in yours), there’s no compiler checking that all values are being provided appropriately, and runtime testing doesn’t reveal such special cases on the first try.

And they say Zend_View is an object-oriented approach to rendering…

The most important aspect of Zend_View templating is that it is object oriented. You may use absolutely any value type in a template: arrays, scalars, objects and even PHP resources. There is no intermediary tag system between you and the full power of PHP. Part of this OOP approach is that all templates are effectively executed within the variable scope of the current Zend_View instance. To explain this consider the following template.

That’s not what object-oriented means. OOP means if two views behave differently, then they should be instances of different classes, instead of injecting arbitrary code and data into a single class and spitting in the face of encapsulation.

The bottom line is that reusing Zend_View templates is a pain in the derrière unless you take special steps about it (steps that you wouldn’t need with a standard class-with-members).
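For comparison, here is a sketch of what a «class-with-members» view might look like. The `Account` and `View_AccountLink` classes, their members and the URL are hypothetical and only meant to illustrate the idea, not to reproduce any existing code.

```php
<?php
// Hypothetical data class with declared members.
class Account
{
    public $firstname;
    public $profileUrl;

    public function __construct($firstname, $profileUrl)
    {
        $this->firstname  = $firstname;
        $this->profileUrl = $profileUrl;
    }
}

// Hypothetical view class: everything it needs is declared in the
// constructor signature, so there is no template to read to find out.
class View_AccountLink
{
    private $account;

    public function __construct(Account $account)
    {
        $this->account = $account;
    }

    public function render()
    {
        return '<a href="'
            . htmlspecialchars($this->account->profileUrl, ENT_QUOTES, 'UTF-8')
            . '">'
            . htmlspecialchars($this->account->firstname, ENT_QUOTES, 'UTF-8')
            . '</a>';
    }
}

$view = new View_AccountLink(new Account('Ann & Bob', '/user/42'));
$html = $view->render();
echo $html, "\n";
// prints: <a href="/user/42">Ann &amp; Bob</a>
```

Reusing the view from another controller, or from another view, is a matter of constructing it with an `Account`; the type hint documents and checks the requirement.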

What’s in that row?

This is further compounded by the way Zend_Db works: an ORM that generates SQL from a sequence of PHP calls, and then turns the result into a list of Zend_Db_Table_Row objects. Which leads to the question of what fields can be found in a given row, and that question is hard to answer.

A typical application will follow a rule along the lines of «every table row is, by definition, a row of a table, so you just peek at the table definition and you know that each column is mapped to a field,» and that is a fine rule to follow, because then the only issue is you can’t type-hint the row based on the table, so you can’t make sure a given argument is always a row from “account”.

But following that rule is hard. In addition to those 80% plain old CRUD cases where you’re working with a single table at once, you’ll have those 20% that use joins where you need data from both tables (never mind the pain of doing that in PHP). Then you end up with a row that breaks the rule, so you keep it in tightly enclosed areas of your application, until it gets too frustrating not being able to use a view-that-renders-accounts on a record-that-contains-accounts-and-sessions, and the next thing you remember is that you don’t know if a given view expects an account or an account-and-session.

And the language can’t help you.

Auto-complete me

Nor can your editor, for that matter, since auto-completing $row-> requires knowledge that your editor simply cannot have (the list of columns defined when you configured your Zend_Db_Table).

I really do enjoy it when my code editor helps eliminate some of the tedium of writing code. In fact, I’m quite ready to make a small additional effort tagging my members, arguments and functions with some type information just so that writing code can be easier.

My editor is Eclipse PDT. It has several nice features that I use extensively.

The first is, of course, its ability to suggest members of classes and objects. Having well-defined classes to represent your data means that Eclipse can use the type hints you leave around to determine that $account is of class Account, so that it has a $firstname member. That’s:

  1. one less round-trip to the database documentation
  2. zero chances of typing $account->firstName by mistake
  3. being told immediately if $account has entirely different members (because it’s another type)

Since Zend_Db_Table_Row and Zend_View actually go out of their way to make sure that you can have arbitrary data in there based on runtime considerations, getting this functionality out of them is impossible.

The other nice feature I use a lot is the ability to control-click a class or function to see its definition. This lets me navigate around the code in seconds instead of having to open the project file explorer, expand several layers of directories usually far from each other, and spend precious brain power translating a class/member naming scheme into file naming schemes.

Finding a file is a job for the editor, not for the programmer.

My view helpers look like this:

View_Account::renderSimple($account);

Clicking on that function name brings up the file and scrolls it down to where it matters. Took me less than a second. Zend View Helpers look like this:

$this->renderSimpleAccount($account);

I dare anyone to navigate to the definition of that helper in less than a second. [EDIT: apparently I shouldn't dare people on the internets :) ]

What about links? The typical approach to generating a link to a different part of a site, with the Zend Framework, is to spell out its controller and action:

<a href="<?php echo $this->url(array(
  'controller' => 'user',
  'action' => 'edit',
  'id' => '123'
));?>">click me!</a>

Now you have to click on every single URL on your website to make sure the links are correct, and you will still manage to forget one, and the end user will click on that link that’s spelled out as «edti». And even if you do get it right, you still have to navigate to the appropriate controller class, open it up and scroll down to the right action.

My urls look like this:

<a href="<?=Action_User_Edit::url(123)?>">click me!</a>

Since every one of my actions is a class (as opposed to a function in a controller class), they get to have members, and one of these members is a static url() function that:

  • lets me ctrl-click through to the action itself
  • has PHP check that my link is correct (or else, die with a class-not-found answer)
  • documents the expected URL arguments as function arguments
  • even lets me find out who links to a certain controller, in case I have to move it
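A stripped-down sketch of such an action class follows. The `/user/edit/<id>` URL scheme and the `respond()` method are assumptions made for the example. The typo check relies on PHP 7 or later, where referring to a missing class throws an `Error`; older versions stop with a fatal error instead.

```php
<?php
// Hypothetical action class; the URL scheme below is assumed.
class Action_User_Edit
{
    // The expected URL arguments are documented by the signature.
    public static function url($id)
    {
        return '/user/edit/' . (int) $id;
    }

    // Placeholder for the code that would actually handle the request.
    public function respond($id)
    {
        return 'Editing user ' . (int) $id;
    }
}

$link = '<a href="' . Action_User_Edit::url(123) . '">click me!</a>';
echo $link, "\n";

// A misspelled action name cannot go unnoticed: the class does not exist.
$typoCaught = false;
try {
    Action_User_Edti::url(123);
} catch (Error $e) {
    $typoCaught = true;
}
```

Searching the code base for `Action_User_Edit::url` then lists every page that links to the action.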

The bottom line…

…is that I don’t use Zend_View, Zend_Controller or Zend_Db in my projects. I need my code to be self-documenting, and there’s nothing self-documenting about Zend_View or Zend_Db. I need my code to be easy to navigate through and simple enough for my editor to handle, and the full dynamic behavior of Zend_View and Zend_Db prevents that.

Your needs might be different. Are they?

PHP 5.3 Closures as Block Literals?

I explained earlier a few things about writing reusable CSS code, and how it interacted with PHP. Let’s start with this basic HTML for generating two columns, with the right one being flexible and resizing to fill all available space:

<div class="col2">
  <div class="col2-l">
    [Content of left column]
  </div>
  <div class="col2-r">
    <div class="col2-ri">
      [Content of right column]
    </div>
  </div>
  <div class="clearer"></div>
</div>

.col2    { }
.col2-l  { float: left ; padding: 0 ; margin: 0 ; width: 120px }
.col2-r  { padding: 0 0 0 120px ; margin: 0 ; width:auto }
.col2-ri { float: left ; width : 100% }
.clearer { clear: both }

Elementary PHP

How does this translate to PHP? Basically as a series of constants (plus documentation detailing what the column HTML should look like):

class CSS_Col2
{
  const ROOT = "col2";
  const LEFT = "col2-l";
  const RIGHT = "col2-r";
  const RIGHT_INNER = "col2-ri";
}

This serves both as documentation for the existence of this component, and as an entry in the auto-completion tool to avoid typing incorrect classes by mistake. However, you still have to get the actual code right:

<div class="<?=CSS_Col2::ROOT?>">
  <div class="<?=CSS_Col2::LEFT?>">
    [20 lines of left column here]
  </div>
  <div class="<?=CSS_Col2::RIGHT?>">
    <div class="<?=CSS_Col2::RIGHT_INNER?>">
      [40 lines of right column here]
    </div>
  </div>
  <?=CSS::CLEARER?>
</div>

Did I write everything correctly? Did I forget or misplace a clearer? Forgetting about the inner container in the right column is an easy mistake, and you won’t notice it until you put a clearing element in that column. And if your script is long enough, you won’t be able to see which opening tag matches which closing tag. Surely there must be a way to improve this.

Using HTML constants

A possibility is using constants to contain the relevant HTML:

class CSS_Col2
{
  // as above ...
  const _BEGIN_LEFT = '<div class="col2"><div class="col2-l">';
  const _BEGIN_RIGHT = '</div><div class="col2-r"><div class="col2-ri">';
  const _END = '</div></div><div class="clearer"></div>';
}

This makes code shorter, and you can’t mismatch or misplace tags as easily:

<?=CSS_Col2::_BEGIN_LEFT?>
  [20 lines of left column here]
<?=CSS_Col2::_BEGIN_RIGHT?>
  [40 lines of right column here]
<?=CSS_Col2::_END?>

However, all the benefits of a nice and clean HTML editor are lost: the editor sees HTML constants as opaque strings rather than markup, so it performs no validation. At least Eclipse could detect mismatching open/closing tags on raw HTML. Now, if you forget to “_END” your columns, your life is pain.

Using helpers

A common technique is to use a helper function for such rendering tasks. The function accepts some arguments that let it configure what ought to be displayed, then renders the wrapper HTML and inserts the data. Staying within the previous code:

class View_Helper_Col2
{
  static function Render($left, $right)
  {
    ?>
<div class="<?=CSS_Col2::ROOT?>">
  <div class="<?=CSS_Col2::LEFT?>">
    <?php call_user_func($left); ?>
  </div>
  <div class="<?=CSS_Col2::RIGHT?>">
    <div class="<?=CSS_Col2::RIGHT_INNER?>">
      <?php call_user_func($right); ?>
    </div>
  </div>
  <?=CSS::CLEARER?>
</div><?php
  }
}

I used callbacks here to do the rendering, because they are the most versatile (it sure beats having to instantiate a “renderable” class for each column). This approach provides the obvious benefit that now the entire rendering is taken care of by a single function, so there is no risk of forgetting or misplacing a tag, and the auto-completion tool can now help check which arguments are provided and in what order.

Still, this means that one should create two functions to render the two columns, and that any necessary data should be made available to them (due to the absence of closures in PHP < 5.3, this often means calling a member function of a view object containing the appropriate data). In the Zend Framework, for instance, one would just write two helpers, and provide them as callbacks knowing that they will have access to the data of the current view:

<?php
  $this->render2col(
    array($this,'myLeftCol'),
    array($this,'myRightCol')
  );
?>

Of course, it’s questionable whether moving a three-line for-each loop to a helper of its own actually increases the readability of the code. If you define a new class for every view, you can define the columns as member functions within that same class, but it’s still somewhat awkward.

Helpers and Closures

PHP 5.3 introduces anonymous functions and closures (with the use keyword for capturing variables). This means that one can now write the behavior inline:

<?php
 $self = $this; // PHP 5.3 closures cannot capture $this directly
 $this->render2col(
   function() use($self)
   {
     ?><h1><?=esc($self->user->name)?></h1><?php
   },
   function() use($self)
   {
     ?><ul><?php foreach ($self->items as $item): ?>
       <li><?php $self->render($item); ?></li>
     <?php endforeach; ?></ul><?php
   }
 );

However, making those functions inline creates a new issue: it’s not so obvious anymore what exactly each function is doing, because it’s too far away from the original call to the helper function. This can be solved by using a command pattern. While we are at it, we can get rid of the use keyword by having the helper pass the view to each function as an argument:

<?php
 View_Col2::start($this)

 ->left(function($view){
   ?><h1><?=esc($view->user->name)?></h1><?php
 })

 ->right(function($view){
   ?><ul><?php foreach ($view->items as $item): ?>
     <li><?php $view->render($item); ?></li>
   <?php endforeach; ?></ul><?php
 })

 ->render();

Labels are now clearly mentioned, allowing empty lines to be inserted to separate the columns without forgetting what they are, so that the code looks cleaner overall.
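
For reference, here is one way the helper behind that call might be written. This is only a sketch: the View_Col2 internals are an assumption, and CSS_Col2 is repeated so the example stands on its own.

```php
<?php
// Stand-in for the constants class defined earlier in the article.
class CSS_Col2
{
  const ROOT = 'col2';
  const LEFT = 'col2-l';
  const RIGHT = 'col2-r';
  const RIGHT_INNER = 'col2-ri';
}

// Hypothetical builder: collects one callback per column, then renders.
class View_Col2
{
  private $view;
  private $left;
  private $right;

  private function __construct($view) { $this->view = $view; }

  public static function start($view) { return new self($view); }

  // Each setter returns $this so that the calls can be chained.
  public function left($callback)  { $this->left  = $callback; return $this; }
  public function right($callback) { $this->right = $callback; return $this; }

  public function render()
  {
    // Fail early if a column was forgotten.
    if (!is_callable($this->left) || !is_callable($this->right)) {
      throw new LogicException('Both columns must be defined before render()');
    }
    echo '<div class="', CSS_Col2::ROOT, '">';
    echo '<div class="', CSS_Col2::LEFT, '">';
    call_user_func($this->left, $this->view);   // the view is passed as $view
    echo '</div>';
    echo '<div class="', CSS_Col2::RIGHT, '">';
    echo '<div class="', CSS_Col2::RIGHT_INNER, '">';
    call_user_func($this->right, $this->view);
    echo '</div></div>';
    echo '<div class="clearer"></div>';
    echo '</div>';
  }
}
```

Because all the wrapper HTML lives in render(), forgetting the inner container or the clearer is no longer possible at the call site.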

chain()

Like many other languages, PHP is home to method chaining, a pattern that allows writing several mutators on the same object without having to name it more than once. A typical example can be found in the Zend Framework for configuration of e-mails, among other things :

$mail = new Zend_Mail();
$mail -> setBodyText('This is the text of the mail.')
      -> setFrom('somebody@example.com', 'Some Sender')
      -> addTo('somebody_else@example.com', 'Some Recipient')
      -> setSubject('TestSubject');

This is a very simple trick that is accomplished by having every mutator return the object itself.
However, the PHP syntax rules before version 5.4 forbid calling a member function on the result of a new-expression, so that you always require a two-step sequence: initialize the object, then call its chain of mutators.

Of course, a simple solution is to use a function:

 function chain($obj) { return $obj; }

 $mail = chain(new Zend_Mail())
   -> setBodyText('This is the text of the mail.')
   -> setFrom('somebody@example.com', 'Some Sender')
   -> addTo('somebody_else@example.com', 'Some Recipient')
   -> setSubject('TestSubject');

In a similar vein, there’s the matter of using the method chaining pattern on objects that were not designed for that. This is where a quick wrapper can come in handy:

 // Define the appropriate class and function
 class WithWrapper
 {
   public $value;
   public function __construct($obj) {
     $this -> value = $obj;
   }
   public function __call($name, $args) {
     assert (count($args) === 1);
     $this -> value -> $name = $args[0];
     return $this;
   }
 }

 function with($obj) {
   return new WithWrapper($obj);
 }

 // A typical record class
 class Person
 {
   var $age;
   var $firstName;
   var $lastName;
   var $married;
 }

 // Create entry for Jane
 $jane = with(new Person())
   -> age(24)
   -> firstName("Jane")
   -> lastName("Smith")
   -> married(false)
   -> value;

 // Jane gets married
 with($jane)
   -> lastName("Brown")
   -> married(true);

This is starting to look like Visual Basic.

Left to the reader

PHP best practices have been moving steadily towards putting all functions inside classes, if only to provide namespacing. The good news is that you have no more namespace collision issues (well, unless you join together two projects with different conventions), and the bad news is that your function names are starting to get quite long.

<?php echo Framework_Html::Escape($username); ?>

Escaping strings to be output in HTML documents is quite a common operation in PHP websites. Is the risk of a name collision worth giving up on a shorter approach, like:

<?=esc($username)?>

I am a proponent of turning very common operations into short functions with appropriate “smart” behavior. For instance:

  • esc(string $string) returns a Framework_Html instance representing the string escaped with htmlspecialchars.
  • esc(Framework_Html $html) returns its argument as-is, so you don’t have to care about whether a given string has already been escaped or not.
  • esc($format, $a, $b, $c...) returns a Framework_Html instance representing the string sprintf($format, esc($a), esc($b), esc($c)), where the format itself is left unescaped. This is useful to avoid repeated escaping in, say, <a href="%s">%s</a> .
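
A rough sketch of how these three behaviors could fit into one function. The Framework_Html class shown here is a bare-bones assumption made up for the example, not an existing library class:

```php
<?php
// Hypothetical wrapper marking a string as already safe for HTML output.
class Framework_Html
{
  private $html;
  public function __construct($html) { $this->html = (string) $html; }
  public function __toString() { return $this->html; }
}

function esc($value)
{
  // Already escaped: return the argument as-is.
  if ($value instanceof Framework_Html) {
    return $value;
  }

  $args = func_get_args();
  if (count($args) > 1) {
    // Format form: escape every argument, leave the format itself alone.
    $format = array_shift($args);
    $escaped = array_map(
      function ($arg) { return (string) esc($arg); },
      $args
    );
    return new Framework_Html(vsprintf($format, $escaped));
  }

  // Plain string form.
  return new Framework_Html(
    htmlspecialchars((string) $value, ENT_QUOTES, 'UTF-8')
  );
}

echo esc('<a href="%s">%s</a>', '/search?q=a&b', 'Tom & Jerry');
```

Since esc() of an already escaped value is a no-op, the result of one call can safely be passed as an argument to another.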

In a similar vein:

  • func(callback $call) returns its argument (after checking that is_callable($call) is true). This serves as a piece of documentation to tell that something is a function.
  • func(object $obj, string $func) returns a callback representing the member function $func of object $obj.
  • func(string $class, string $func) returns a callback representing the static member function $func of class $class.
  • func(string $args, string $body) acts as a shorter alias for create_function.
  • func(string $body) acts as an alias for create_function('$_',"return $body;"), for those cases where you need a very short lambda expression.
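
The first three forms might be sketched as follows. The two create_function forms are left out on purpose, since create_function was deprecated in PHP 7.2 and removed in PHP 8.0; the Greeter class is a made-up example:

```php
<?php
// Sketch of func() covering the callback forms only.
function func($first, $second = null)
{
  if ($second === null) {
    $callback = $first;                  // func(callback)
  } else {
    $callback = array($first, $second);  // func(object, method) or func(class, method)
  }
  if (!is_callable($callback)) {
    throw new InvalidArgumentException('func(): argument is not callable');
  }
  return $callback;
}

// Hypothetical class used only to demonstrate the three forms.
class Greeter
{
  public function hello($name) { return "Hello, $name"; }
  public static function bye($name) { return "Bye, $name"; }
}

echo call_user_func(func('strtoupper'), 'abc'), "\n";
echo call_user_func(func(new Greeter(), 'hello'), 'Jane'), "\n";
echo call_user_func(func('Greeter', 'bye'), 'Jane'), "\n";
```

The check happens where the callback is created, so a misspelled method name is reported there rather than deep inside whatever code eventually calls it.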

And of course, there are the jslog() and is() functions discussed earlier on the blog.

I think there would be a small handful of functions, maybe 8 or 10, that would be used so often on a given project that everyone would have to know about them anyway—so, you might as well keep them out of any class.

PHP Type Checking

PHP does not enforce types at compile-time (if only because there isn’t a compile time) and runtime checking only happens at the leaves of your source code tree, when you use a PHP function and that function notices one of its arguments is incorrect.

There are of course ways of introducing additional type safety into PHP code, both through development practices and through hints. For instance, you can hard-code checks into function prologues:

function SetUsername($username, $usr_id)
{
  assert (is_string($username));
  assert (is_int($usr_id));
  // ...
}

And, if using class types, you can also use the type hint mechanism in PHP 5 to have PHP raise an error automatically when an argument of the wrong class is passed:

function FitToWindow(Image $img, Window $window)
{
  // ...
}

There remains the issue of member variables, which are modified and read in many different places. This means a “check the object is in a valid state” function is a useful addition to a class, to be used as a validity check during development to catch any errors as soon as they occur.

I sometimes use the following for my checks:

class Type
{
 public static function Is($value, $type)
 {
   if (func_num_args() > 2) {
     $args = func_get_args();
     array_shift($args);
     return self::Is($value, $args);
   }

   if (is_string($type))
     return self::Is($value, array_filter(explode(' ', $type)));

   if (empty($type))
     return true;

   $first = array_shift($type);

   if ($first == 'null')
     return $value === null || self::Is($value, $type);

   if ($first == 'array') {
     if (!is_array($value))
       return false;
     $next = 0;
     foreach ($value as $key => $val) {
       if ($key !== $next++)
         return false;
       if (!self::Is($val, $type))
         return false;
     }
     return true;
   }

   if ($first == 'time')
     return is_int($value) && $value >= 0;

   if ($first == 'hash') {
     if (!is_array($value))
       return false;
     foreach ($value as $val)
       if (!self::Is($val, $type))
         return false;
     return true;
   }

   if (is_callable($first))
     return call_user_func($first, $value) && self::Is($value, $type);

   if (is_callable('is_' . $first))
     return call_user_func('is_' . $first, $value) && self::Is($value, $type);

   if (class_exists($first))
     return($value instanceof $first);

   return false;
 }

 public function checkTypes()
 {
   self::check($this);
 }

 public static function check($obj)
 {
   $class = get_class($obj);
   foreach (get_class_vars($class) as $var => $value)
     if ($var[0] != '_')
       if (!self::Is($obj->$var, $value))
         throw new Exception("Type error: `$class::$var` is not of type `$value`");
 }
}

The typical use is to define a new class, then assign a default value to all type-checked variables: that default value is a type string (or array) that is parsed and verified by the check functions. For instance:

class User
{
  var $id = 'int';
  var $name = 'null string';
  var $media = 'array Media';
  var $friends = 'positive int';
  var $_hash;
}

This would check that the identifier is an integer, that the name is a string or null, that media is an array of instances of the Media class, and that friends is an integer such that is_positive($obj->friends) returns true (assuming you define that function somewhere). The hash variable is unchecked because it starts with an underscore. This has some advantages:

  • Type expressions are shorter than the corresponding assert statements.
  • They go deeper as far as checks go (for instance, arrays also check that all members are of a certain type).
  • They document the code, by explaining in the class definition what the types of the variables are, as opposed to staying in a function.
  • They help with automated testing by allowing the creation of classes with arbitrary values of the chosen type.

This also has disadvantages:

  • This prevents setting an actual default value for the variables.
  • It introduces an artificial naming convention for variables starting with or without underscores.
  • Type-checking arrays or large structures takes time.
  • It’s not detected by documentation generators.
  • Does not play well with private variables.

