As a sequel to the previous installment of Dynamic Wednesdays, this article considers the next step in closure manipulation in PHP: persistence. Persistence is the process of preserving a part of program state from one execution to another. In the execution model of PHP, scripts are executed independently (and concurrently), and their working memory is discarded when they end. The only data that survives is what is sent back to the HTTP client and what is saved in the database.
This makes several typical design patterns quite impractical for any usage that extends beyond a single HTTP request. For example, an API designer may wish the users of the API to be notified of certain events (such as the modification of a piece of data). In a typical non-transactional application, the user modules would register themselves with the API through the Observer design pattern (or, in more functional terms, a callback). The registration would remain stored in the API until the application shuts down, and the observer would be notified of the relevant events. In terms of functionality, PHP supports this kind of behavior perfectly well.
However, for this to work, every HTTP request must execute in a completely initialized environment. This requires the server to create a new instance of every core object, then load and initialize all objects provided by the third party users (thereby setting up the aggregation links required by the Observer design pattern through registration with the core objects), and finally run the HTTP request which, most of the time, will not trigger the observer at all. In short, using these techniques in PHP requires massive and mostly useless initialization times that would be best done without.
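For reference, the in-memory registration described above can be sketched in a few lines. The `Store` class and its `addObserver` and `restock` methods are hypothetical names chosen for illustration. The observer list lives only inside the object, so it vanishes when the script ends and has to be rebuilt on every request.

```php
<?php
// Hypothetical in-memory observer registry; nothing here survives the request.
final class Store
{
    /** @var callable[] observers waiting for a restock event */
    private array $observers = [];

    public function addObserver(callable $observer): void
    {
        $this->observers[] = $observer;
    }

    // Notify every registered observer, then forget them.
    public function restock(int $productId): void
    {
        foreach ($this->observers as $observer) {
            $observer($productId);
        }
        $this->observers = [];
    }
}

$notified = [];
$userId = 7;
$store = new Store();

// The closure captures $userId; this capture is what persistence must preserve.
$store->addObserver(function (int $productId) use ($userId, &$notified) {
    $notified[] = "user $userId: product $productId is back in stock";
});

$store->restock(1013);
```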
Partial workarounds do exist: after all, the only requirement to avoid initialization costs is to persist the inter-object relationships in the database. For instance, if a certain user wants to be notified when a certain product is back in stock, their user identifier would be associated with the product identifier in a “waiting for stock” table in a relational database. Observing that a user is associated with the product, the server would then load the user-related code, look the user up by identifier, and proceed to send the notification. This solution is incomplete, however, because it does not support the Open-Closed principle: the system cannot be extended without modifying it. Since neither the source code to be loaded (user-related, for instance) nor the relational database table allows for polymorphism, a third party developer cannot extend the product notification system to notify something other than a user.
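The limitation is easier to see in code. In the sketch below, a plain array stands in for the “waiting for stock” table, and `notify_user` and `on_restock` are hypothetical names. The recipient type is hard-wired into the handler, so notifying anything other than a user means editing `on_restock` itself.

```php
<?php
// Stand-in for the "waiting for stock" table: rows of (user_id, product_id).
$waitingForStock = [
    ['user_id' => 7, 'product_id' => 1013],
    ['user_id' => 9, 'product_id' => 1013],
    ['user_id' => 7, 'product_id' => 2001],
];

// Hypothetical notifier: the only kind of recipient the system knows about.
function notify_user(int $userId, int $productId): string
{
    return "user $userId: product $productId is back in stock";
}

// The restock handler can only ever produce user notifications.
function on_restock(array $rows, int $productId): array
{
    $sent = [];
    foreach ($rows as $row) {
        if ($row['product_id'] === $productId) {
            $sent[] = notify_user($row['user_id'], $productId);
        }
    }
    return $sent;
}

$sent = on_restock($waitingForStock, 1013);
```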
Using a database
The above observations lead to a simple conclusion: to achieve robust persistence of closures, it is necessary to allow polymorphic behavior both in the source files to be loaded and in the code to be executed. This implies that the database will have to store both a list of source files to be loaded from disk when the persisted closure is called, and a description of what code should be called. That description is either a single function name, or a piece of PHP code accompanied by a serialized list of the closure contents.
Despite the obvious danger of storing executable PHP code directly in a database, I will suggest going with that option as a preliminary step, possibly using a dedicated database user for modifying closures, and giving all other users only the right to SELECT and DELETE from the closures table. SQL pseudocode for the database would be as follows:
    CREATE TABLE `closures` (
      `id` INT NOT NULL AUTO_INCREMENT,
      `exec` TEXT NOT NULL, -- PHP code to be run.
      PRIMARY KEY(`id`)
    );

    -- To insert a new closure and get the corresponding ID:
    INSERT INTO `closures` (`exec`) VALUES (@exec);

    -- To extract a closure's data:
    SELECT `exec` FROM `closures` WHERE `id` = @id;

    -- To drop an unused closure:
    DELETE FROM `closures` WHERE `id` = @id;
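The three statements above could be wrapped in small PDO helpers with prepared statements. This is only a sketch under assumptions: the helper names mirror the ones used in this article's pseudocode, but the explicit `$db` parameter is an addition, and `$db` is assumed to be an already-connected PDO handle configured to throw exceptions on error.

```php
<?php
// Hypothetical PDO-backed helpers for the `closures` table.
// Assumption: $db is a connected PDO handle using PDO::ERRMODE_EXCEPTION.

// Store a piece of PHP source and return the ID of the new row.
function add_to_database(PDO $db, string $src): int
{
    $stmt = $db->prepare('INSERT INTO `closures` (`exec`) VALUES (:exec)');
    $stmt->execute([':exec' => $src]);
    return (int) $db->lastInsertId();
}

// Fetch the stored source, or null when no row has that ID.
function get_from_database(PDO $db, int $id): ?string
{
    $stmt = $db->prepare('SELECT `exec` FROM `closures` WHERE `id` = :id');
    $stmt->execute([':id' => $id]);
    $src = $stmt->fetchColumn();
    return $src === false ? null : $src;
}

// Remove a closure that is no longer needed.
function drop_from_database(PDO $db, int $id): void
{
    $stmt = $db->prepare('DELETE FROM `closures` WHERE `id` = :id');
    $stmt->execute([':id' => $id]);
}
```

Binding the source as a parameter keeps the stored code from being mangled by quoting, but it does nothing about the underlying risk: whatever ends up in the `exec` column will be executed.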
Creating a closure involves four arguments: the list of source files to include to be able to run the closure (this will be prepended to the PHP code stored in the database), the names of the formal parameters, the source code to be executed, and the list of values stored in the closure (these will be serialized by value, so no references are allowed here). Again, in a pseudocode fashion:
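    function create_persistent_closure($include, $args, $code, $data)
    {
        $src = '';

        // Load the source files the closure depends on.
        foreach ($include as $file)
            $src .= "require_once('$file');";

        // Bind positional call arguments to named variables.
        // Accepts 'name' or '$name', as a string or an array.
        foreach (array_values((array) $args) as $id => $arg)
            $src .= sprintf('$%s = func_get_arg(%d);', ltrim($arg, '$'), $id);

        // Restore the captured values, which were serialized by value.
        foreach ($data as $var => $value)
            $src .= sprintf('$%s = unserialize(\'%s\');', $var,
                            addcslashes(serialize($value), '\\\''));

        return add_to_database($src . $code);
    }

    function get_persistent_closure($id)
    {
        // create_function() was removed in PHP 8.0; eval() builds an equivalent closure.
        return eval('return function () {' . get_from_database($id) . '};');
    }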
The return value of get_persistent_closure is a callable function which will execute the stored code. Note that due to the limitations of serialization in PHP, the list of arguments stored in the closure will generally be a set of identifiers and indices used to retrieve the actual objects to be manipulated (from factories and singletons). For instance:
    // When registering an observer (first HTTP request)
    $closure = create_persistent_closure(
        array('core/models/user.php'),
        array('product_id'),
        'UserFactory::Get($user_id)->SendProductNotification($product_id);',
        array('user_id' => $user_id));
    Store::Get($store_id)->AddObserver($closure);

    // When notifying observers (second HTTP request)
    foreach ($this->closures as $id) {
        $func = get_persistent_closure($id);
        $func($product_id);
        drop_from_database($id); // drop by ID, not by callable
    }
    $this->closures = array();
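To make the mechanism concrete, the sketch below builds the source text for one closure and runs it, with no database involved. The helper name `build_closure_source` is hypothetical, and the sketch swaps the `addcslashes` escaping for `var_export`, which emits a correctly quoted PHP string literal. The generated string is the only thing that would need to be persisted.

```php
<?php
// Build the source text for a closure body (hypothetical helper name).
function build_closure_source(array $args, string $code, array $data): string
{
    $src = '';
    // Bind positional call arguments to named variables.
    foreach (array_values($args) as $i => $name) {
        $src .= sprintf('$%s = func_get_arg(%d);', $name, $i);
    }
    // Restore captured values; var_export() quotes the serialized string safely.
    foreach ($data as $name => $value) {
        $src .= sprintf('$%s = unserialize(%s);', $name, var_export(serialize($value), true));
    }
    return $src . $code;
}

$source = build_closure_source(
    ['product_id'],
    'return "notify user $user_id about product $product_id";',
    ['user_id' => 42]
);

// Stand-in for the database round trip: only the string needs to persist.
$func = eval('return function () {' . $source . '};');
$message = $func(1013);
```

Here `$source` holds two generated assignments followed by the original code, and calling `$func(1013)` supplies `$product_id` while `$user_id` is restored from its serialized form.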
Using code generation
The above example, while full of good ideas, is insufficient. The main reason is, of course, safety: anyone with write access to the appropriate database table can cause arbitrary code to be executed, and this becomes easy as soon as an SQL injection vulnerability exists. By contrast, storing the source code as files on disk adds no new attack surface: anyone with write access to the disk can already execute arbitrary code simply by uploading it. By restricting PHP-implemented uploads to a specific directory that is kept separate from the persistent closure cache, one can keep uploaded files from ever being loaded as closures.
Storing source files on disk, however, introduces the risk of concurrent access to the same file by several requests. The file should therefore be manipulated under a lock (for instance with flock) to avoid partial or corrupted writes.
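A minimal sketch of such a locked write, using PHP's built-in `flock`. The helper name `append_locked` and the temporary file path are illustrative assumptions. Note that `flock` is advisory: it only protects against other code that also takes the lock.

```php
<?php
// Append a line to a shared file while holding an exclusive lock.
function append_locked(string $path, string $line): void
{
    // 'c' opens for writing without truncating, creating the file if needed.
    $fh = fopen($path, 'c');
    if ($fh === false) {
        throw new RuntimeException("cannot open $path");
    }
    try {
        // Blocks until no other process holds the lock.
        if (!flock($fh, LOCK_EX)) {
            throw new RuntimeException("cannot lock $path");
        }
        fseek($fh, 0, SEEK_END);
        fwrite($fh, $line . "\n");
        fflush($fh); // push the data out before releasing the lock
        flock($fh, LOCK_UN);
    } finally {
        fclose($fh);
    }
}

$path = sys_get_temp_dir() . '/closure-lock-demo-' . uniqid() . '.txt';
append_locked($path, 'first');
append_locked($path, 'second');
$contents = file_get_contents($path);
unlink($path);
```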
I will not provide the complete implementation here. However, the basic idea behind this implementation is to provide a directory containing the persistent closures. The closures are saved to relative paths within that directory, of the form ‘module/params/reason’, for instance ‘store/1013/onProductRestock’. The closures are stored inside the event handler that will be using them, thereby improving performance when several closures are used by the same handler (thus avoiding multiple loads).
The typical usage would be:
    // When registering an observer (first HTTP request)
    Store::Get($store_id)
        -> GetHandler()
        -> Register(
            array('core/models/user.php'),
            array('product_id'),
            'UserFactory::Get($user_id)->SendProductNotification($product_id);',
            array('user_id' => $user_id));

    // When notifying observers (second HTTP request)
    $handler = $this -> GetHandler();
    $handler -> Call($product_id);
    $handler -> Clear();

    // Inside the GetHandler function
    return new EventHandler("store/" . $this->store_id . "/onProductRestock");
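One possible shape for such an `EventHandler` is sketched below. It is not the implementation this article has in mind, only an illustration under assumptions: each handler owns one generated PHP file that accumulates its closures, the cache root (an optional second constructor argument) defaults to a directory under the system temp directory, and `Call` returns the closures' results for convenience.

```php
<?php
// Hypothetical sketch: one generated PHP file per handler holds all its closures.
final class EventHandler
{
    private string $file;

    public function __construct(string $path, ?string $root = null)
    {
        // Assumption: the closure cache lives under the system temp directory.
        $root = $root ?? sys_get_temp_dir() . '/closure-cache';
        $this->file = $root . '/' . $path . '.php';
    }

    public function Register(array $include, $args, string $code, array $data): void
    {
        $src = '';
        foreach ($include as $file) {
            $src .= 'require_once ' . var_export($file, true) . ';';
        }
        // Accept 'name' or '$name', as a string or an array.
        foreach (array_values((array) $args) as $i => $name) {
            $src .= sprintf('$%s = func_get_arg(%d);', ltrim($name, '$'), $i);
        }
        foreach ($data as $name => $value) {
            $src .= sprintf('$%s = unserialize(%s);', $name, var_export(serialize($value), true));
        }
        $dir = dirname($this->file);
        if (!is_dir($dir)) {
            mkdir($dir, 0700, true);
        }
        $fh = fopen($this->file, 'c+');
        if ($fh === false) {
            throw new RuntimeException("cannot open {$this->file}");
        }
        flock($fh, LOCK_EX); // serialize concurrent registrations
        if (fstat($fh)['size'] === 0) {
            fwrite($fh, "<?php\n"); // first registration: start the file
        }
        fseek($fh, 0, SEEK_END);
        fwrite($fh, '$handlers[] = function () {' . $src . $code . "};\n");
        fflush($fh);
        flock($fh, LOCK_UN);
        fclose($fh);
    }

    // Load the file once and run every closure; returns their results.
    public function Call(...$args): array
    {
        if (!is_file($this->file)) {
            return [];
        }
        $handlers = [];
        include $this->file; // the generated code appends to $handlers
        $results = [];
        foreach ($handlers as $handler) {
            $results[] = $handler(...$args);
        }
        return $results;
    }

    public function Clear(): void
    {
        if (is_file($this->file)) {
            unlink($this->file);
        }
    }
}
```

With this layout, the path ‘store/1013/onProductRestock’ becomes a single file under the cache root, so a handler with many registered closures is still loaded with one `include`.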
Hi. I'm Victor Nicollet,