Output Buffering

Suppose a program has an object that it needs to convert to a string. On the one hand, it is often more efficient to append the individual substrings to an output channel, because the channel is optimized for exactly that: it allocates a large memory buffer which is gradually filled, whereas plain string concatenation can neither reuse the memory of one operand nor over-allocate memory for it. On the other hand, once the string has been written to an output channel there is no means of retrieving it, because it has been output and has therefore left the program.

This has led, in many languages, to the development of a channel-like object that outputs to a string instead of sending out the data. Java has java.lang.StringBuilder, C++ has std::stringstream and .NET has System.IO.StringWriter. In this regard, PHP takes a quite original route.

PHP has always relied heavily on inline tags to construct output. This means that, unlike Java, C++ or .NET programs, a typical PHP script regularly interrupts its normal execution to include some raw text that is immediately output back to the user:

Hello, <?=$_SERVER['REMOTE_ADDR']?>!

Other languages usually include text as string literals (or text loaded from external configuration files) and output these by manually sending them to a stream:

std::cout << "Hello, "
          << getenv("REMOTE_ADDR")
          << "!";

Explicit output means that it’s possible to replace std::cout with a custom string stream (or a polymorphic ostream parameter) to create a string from the output instead of sending it directly to standard output. Implicit output means there’s no “I’m writing this to standard output” indicator to replace. Sure, PHP does support the standard “use literals and print them explicitly” option through the traditional echo, print and printf, and it’s possible to write a program without a single line of inline HTML. That is not the encouraged practice, though: on the contrary, Zend_View actually relies on inline HTML with its phtml files.

The PHP solution is known as output buffering. Instead of offering additional output channels alongside the normal “standard output” global channel, PHP allows replacing the global channel with another. The replacement starts with ob_start() and ends with either ob_get_clean() or ob_end_flush(). The former returns, as a string, whatever was output since the global channel was replaced; the latter outputs it to the previously active global channel. Both then restore the previously active channel, which makes it possible to create a stack of temporary replacements of the global output channel.
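The stacking behaviour can be illustrated with two nested buffers. This is a minimal sketch, and the variable names are only illustrative:

```php
<?php
ob_start();                 // first replacement of the global output channel
echo "outer ";

ob_start();                 // second replacement, stacked on top of the first
echo "inner";
$inner = ob_get_clean();    // returns the inner text and restores the first buffer

echo strtoupper($inner);    // lands in the first buffer, not at the client
$outer = ob_get_clean();    // returns everything and restores the original channel

echo $outer, "\n";
```

Each ob_get_clean() only unwinds the most recent ob_start(), so the inner capture never disturbs the outer one.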

This has several interesting properties. The first is that now, it is preferable to print out any kind of data either using print/printf or using inline HTML, depending on the nature of what is being output, instead of manually concatenating a string, returning it through several layers of functions, and finally printing it at some point:

<?php 

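ob_start();
foreach ($data as $row) {
    // Printed into the buffer rather than sent to the client.
    printf("<li>%s</li>", $row);
}
// Wrap the captured items in the list element.
printf("<ul>%s</ul>", ob_get_clean());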
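The same pattern can be wrapped in a small helper so that any printing code yields a string. The capture() function below is a hypothetical name, not part of PHP, and the sample rows are made up for the sketch:

```php
<?php
// Hypothetical helper: runs $producer and returns whatever it printed.
function capture(callable $producer): string
{
    ob_start();
    try {
        $producer();
        return ob_get_clean();
    } catch (Throwable $e) {
        ob_end_clean();   // discard partial output and restore the previous channel
        throw $e;
    }
}

$items = capture(function () {
    foreach (['a', 'b'] as $row) {
        printf("<li>%s</li>", htmlspecialchars($row));
    }
});

printf("<ul>%s</ul>\n", $items);
```

Cleaning up in the catch branch keeps the buffer stack balanced even when the producer throws.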

Output buffering can also be used to capture inline HTML, of course:

<?php ob_start(); ?>Hello, <?=$_SERVER['REMOTE_ADDR']?>!<?php print md5(ob_get_clean()); ?>

Another interesting consequence is that it’s now possible to prevent any output from being sent before the headers have all been generated. By default, PHP streams data to the user as soon as it becomes available, and the HTTP response headers must, by definition, be sent before everything else. So the first time PHP detects output, it sends all the headers registered so far, then ignores any further headers with a warning. By starting the very first script with ob_start(), all output is buffered until ob_end_flush() is manually called, which allows all headers to be completed beforehand without risk of accidental sending.
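As a rough sketch of that ordering, a header can still be set after the body has been printed, as long as the body sits in a buffer. The header name and the $headersStillOpen variable are made up for the example:

```php
<?php
ob_start();                            // nothing below reaches the client yet

echo "<p>Body produced early</p>\n";

// The body is only in the buffer, so the headers have not been sent.
$headersStillOpen = !headers_sent();
if ($headersStillOpen) {
    header('X-Example: set-after-output');   // hypothetical header
}

ob_end_flush();                        // headers go out first, then the body
```

Without the initial ob_start(), the echo would trigger the headers and the later header() call would be ignored with a warning.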

The output of any function that usually prints data directly (such as fpassthru()) can also be captured this way.
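For instance, the following sketch captures what fpassthru() would normally print. It uses an in-memory stream in place of a real file, purely so that the example is self-contained:

```php
<?php
// An in-memory stream stands in for a file here.
$stream = fopen('php://memory', 'w+');
fwrite($stream, "streamed data");
rewind($stream);

ob_start();
fpassthru($stream);          // would normally write straight to the output
$captured = ob_get_clean();  // the stream contents, now available as a string
fclose($stream);

echo strlen($captured), " bytes captured\n";
```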

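Last but not least, output buffering allows output filtering: an optional argument to ob_start() can be a function used to process the data before it’s output. When the buffer is flushed to the previous output channel, the function is called on the data and its return value is output instead. Note that ob_get_clean() returns the buffered data unfiltered. The callback also receives a second argument describing the buffering phase, so passing 'md5' directly would set the binary-output flag of md5(); a small wrapper avoids that. So, the above example can be written as:

<?php ob_start(function ($buffer) { return md5($buffer); }); ?>Hello, <?=$_SERVER['REMOTE_ADDR']?>!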
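To observe a filter in isolation, a filtered buffer can be flushed into an outer buffer. The outer buffer exists only so the sketch can inspect the result, and the upper-casing filter is an arbitrary choice:

```php
<?php
// Outer buffer: only here to collect the filtered result.
ob_start();

// Inner buffer with a filter. The closure takes only the buffer contents
// and ignores the phase argument that PHP also passes to the callback.
ob_start(function (string $buffer): string {
    return strtoupper($buffer);
});
echo "hello, world";
ob_end_flush();              // runs the filter, result goes to the outer buffer

$filtered = ob_get_clean();
echo $filtered, "\n";
```

The filter runs when the inner buffer is flushed, which is why the outer buffer receives the transformed text.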

Note that all buffers are flushed in order when the script ends, so there’s no risk of forgetting to flush a buffer and having no output.

More information about output buffering is available in the PHP manual.

And until next time, Joyeux Noël.
