
The elementary HTTP Protocol (a clear case of the RAS Syndrome) is stateless. This means, mostly, that there is no persistent connection open between your browser and the web server, and that the protocol itself offers no direct way to simulate such a connection by keeping some state from one visit to the next.

The login process is a pretty classic example of this: you enter your data in a login form, but type the wrong password. The data is sent as POST to the server, which notices it’s incorrect, and needs to display the login page again with your login still present in the field so you don’t have to re-enter it, along with an “invalid password” message and a “forgot your password?” link. How can you do that?

  • Submit the form to the same page, and have that page either redirect to another on success or re-display the same form with the existing data (which it knows, since it received it from the POST request). While this is possible in this example, quite often it isn’t practical: what if you have two or more forms on the page? What if you want to display a message upon success on the target page, such as “message posted”?
  • POST data to a certain page, then pass all the necessary data as part of the GET arguments of the redirect. This can get ugly, because your failed login attempt redirects you to /login?login_mail=foo@bar.com instead of just /login. And since you can’t redirect through a POST, the only way to hide those arguments would be to respond with a page that uses JavaScript to perform that POST again, which would indeed work (you end up on /login, with the appropriate data) but means that the “back” button does not work and the “refresh” button gives out a warning.
  • Use cookies. A cookie is a piece of persistent data used to simulate a client-server persistent dialogue. So, you send a POST to the server, the server replies with a redirect and login_mail=foo@bar.com inside a cookie, you follow the redirect and send the cookie along, which the server reads and uses.

In the olden days, people just appended the data to the GET query and went on with it. Of course, it wasn’t raw data: instead, the server stored the data in persistent storage and associated it with a “session identifier” that was sent through GET arguments. All your URLs had a weird-looking ?sid=afc134b72fc4551de suffix but the users kind of expected it anyway.

When cookies became increasingly popular, the session mechanism of storing persistent data on the server remained the same for practical reasons (less bandwidth, less cookie storage requirements on the client, and the client cannot tamper with the session data), but the session identifier was sent as a cookie instead of a GET argument. This is the current model, and GET arguments are sometimes used in extreme situations where cookies do not work:

  • When cookies are disabled. This happens in some web browsers, or is sometimes explicitly disabled by users. It also happens in automated retrieval tools that can follow links but not remember cookies, such as search engine crawlers.
  • When moving from a domain to another. Since cookies do not work in a cross-domain fashion, if you wish to use a session across domains, you have to use a GET or POST argument (usually a GET, as it is more versatile).

An important architecture decision to be made is to determine whether non-cookie users will be supported by the system or not. This decision is important because, if you do decide to support them, every single internal link and redirect on your website will have to take the session ID into account (plus, it makes handling of session-less users harder, since you have to detect whether they can use cookies or not). Sure, PHP can alleviate this burden by automatically adding the session ID to every relative URL, but not all URLs are relative (and, in our case, none are).
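
To give an idea of what “taking the session ID into account” means for every link, here is a minimal sketch of a helper that appends the identifier to a URL. The function name url_with_sid() and the parameter name sid are assumptions made for illustration (sid simply echoes the ?sid=... suffix mentioned earlier); nothing like this is part of the program described here.

```php
<?php
// Hypothetical helper: append a session identifier to an internal URL.
function url_with_sid($url, $sid)
{
    // Split off a fragment (#...) so the parameter lands before it.
    $fragment = '';
    $hashPos  = strpos($url, '#');
    if ($hashPos !== false) {
        $fragment = substr($url, $hashPos);
        $url      = substr($url, 0, $hashPos);
    }

    // Use '?' for the first argument, '&' when a query string already exists.
    $separator = (strpos($url, '?') === false) ? '?' : '&';

    return $url . $separator . 'sid=' . urlencode($sid) . $fragment;
}
```

Every link, form action and redirect target would have to go through such a function, which is where the non-trivial work comes from.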

My decision is that support for cookie-less users brings a tiny advantage but requires non-trivial work to be implemented on a reasonable scale, so I will not implement it. If you are interested in this subject, make sure you read the PHP documentation.

PHP provides a session mechanism. The basic idea is that you start by calling session_start(), which detects if a session identifier is present and loads a $_SESSION global array from the session data corresponding to that identifier (or an empty array if none). It stores any modifications you made to the $_SESSION array back in the persistent storage so you can access them on the next run, and it also sends the user a cookie with the session identifier. All you need to know is that whenever this user requests a page, your persistent $_SESSION array is there to tell you what happened in previous requests and tell future requests about things.
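
As a minimal sketch of that cycle, the function below counts how many requests a visitor has made. The name count_visit() and the 'visits' key are made up for this example; session_start(), session_status() and $_SESSION are the standard PHP pieces.

```php
<?php
// Hypothetical example: count the requests made within one session.
function count_visit()
{
    // Start (or resume) the session unless one is already active.
    if (session_status() === PHP_SESSION_NONE) {
        session_start();
    }

    // A missing key means this is the first request of the session.
    $previous = isset($_SESSION['visits']) ? $_SESSION['visits'] : 0;

    // The new value is written back to storage when the request ends.
    $_SESSION['visits'] = $previous + 1;

    return $_SESSION['visits'];
}
```

On a real page, the value returned would keep growing from one request to the next for as long as the browser sends the same session cookie.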

In fact, you need to know some more things:

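  • Session data becomes eligible for garbage collection once it has been idle for session.gc_maxlifetime seconds, which defaults to 1440 (24 minutes); the 180-minute figure sometimes quoted is session.cache_expire, which governs cached session pages rather than the session itself. You can change these values through configuration files and functions. However, since sessions take up server storage, it is ill-advised to keep them for too long, especially if most of your visitors never come back.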
  • The user can stop using a certain session at will by clearing their cookies. Any security measure you take by writing to the session can be circumvented easily. So, you cannot ban a user or avoid denial-of-service attacks by setting a flag in the session data!
  • Session data is safe. While the malevolent user can write anything they want to the $_GET, $_POST and $_COOKIE variables (and, to a lesser extent, $_FILES and $_SERVER) they can never add values of their own to $_SESSION. Every time you read a value for a session, you can be certain that it has not been tampered with. Even better: since only the session identifier is sent to the user, you can store critical data in the session store (storing such data in GET or POST arguments or in cookies would have been a security risk).
  • Sessions are tied to the web server, not the database. If you have multiple web servers hidden behind a load balancer, you have to make sure that a given visitor is always handled by the same server, or the session data will become mysteriously lost between requests as the second request is not served by the same server as the first.
  • Since sessions are stored by the server, they help reduce the load on your database. Whenever you have per-user persistent data that doesn’t have to persist if the user gets disconnected, you can store it in the session store to improve performance.
  • Sessions are a global array. While you can use named sessions, you easily run out of names if you are not careful. Namespacing is a classic approach to avoiding this problem.
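
Regarding the first point, the lifetime can be adjusted at runtime as well as in the configuration files. The sketch below assumes you want the same duration for the server-side data and for the cookie; the helper name configure_session_lifetime() is hypothetical, while session.gc_maxlifetime and session_set_cookie_params() are existing PHP settings and functions.

```php
<?php
// Hypothetical helper: apply one lifetime to both the stored session data
// and the session cookie. It has to be called before session_start().
function configure_session_lifetime($seconds)
{
    // Idle time after which stored session data may be garbage-collected.
    ini_set('session.gc_maxlifetime', (string) $seconds);

    // Lifetime of the session cookie in the browser
    // (0 would mean "until the browser is closed").
    session_set_cookie_params($seconds);
}
```

A page would then call configure_session_lifetime(1800) followed by session_start() to get half-hour sessions.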

My namespacing strategy in this program is simple: every controller gets to store its own persistent data, and uses its own name as a namespace. So, controller “/login” storing a variable login_name in the context “errors” in the session data would use $_SESSION['login/errors:login_name']. Of course, a controller can also store a value in the namespace of another controller, if it wishes to communicate with it (so our do-login POST controller would use this approach to tell the login controller what error messages to display).
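
Reduced to its bare bones, the convention is just a string prefix. The sketch below uses a hypothetical session_key() helper (not part of the program) to show the do-login controller leaving a message for the login controller; the message text is only an example.

```php
<?php
// Hypothetical helper: build the flat $_SESSION key for a namespaced value.
function session_key($namespace, $name)
{
    return $namespace . ':' . $name;
}

// The do-login controller writes into the login controller's namespace...
$_SESSION[session_key('login/errors', 'login_pass')] = 'Invalid password';

// ...and the login controller later reads the value back, treating a
// missing key as null.
$key     = session_key('login/errors', 'login_pass');
$message = isset($_SESSION[$key]) ? $_SESSION[$key] : null;
```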

This leads to the development of a session utils class that handles namespacing, as well as another problem of the session system (which is that you have to check whether the session array contains a value before extracting it). It ends up looking like this:

<?php  // utils/session.php

  // Loading this file starts (or resumes) the session.
  session_start();

  class SessionUtils
  {
      // Returns an array in which every requested key is present;
      // keys missing from the session map to null.
      public static function Read($namespace, array $keys)
      {
          $data = array();
          foreach ($keys as $key) {
              $data[$key] = array_key_exists("$namespace:$key", $_SESSION)
                          ? $_SESSION["$namespace:$key"]
                          : null;
          }

          return $data;
      }

      // Stores every key/value pair of $data under the given namespace.
      public static function Write($namespace, array $data)
      {
          foreach ($data as $key => $value) {
              $_SESSION["$namespace:$key"] = $value;
          }
      }

      // Removes the given keys from the namespace.
      public static function Delete($namespace, array $keys)
      {
          foreach ($keys as $key) {
              unset($_SESSION["$namespace:$key"]);
          }
      }

      // Reads the given keys, then removes them (one-shot values).
      public static function ReadDelete($namespace, array $keys)
      {
          $data = self::Read($namespace, $keys);
          self::Delete($namespace, $keys);
          return $data;
      }
  }

Here, SessionUtils::Read() returns an array containing the values found at the named keys in the provided namespace, using null for every key that doesn’t exist (the point of this being, you get an array that always has the specified keys present, so you don’t get errors). The SessionUtils::Write() function performs the inverse operation, and the SessionUtils::Delete() function removes values from the session to save memory or avoid unnecessary repeat displays (such as error messages). SessionUtils::ReadDelete() reads and deletes values, which is useful for one-shot messages (such as those on the login page, which must only be displayed once).
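
The one-shot behaviour is easier to see on a small example. The sketch below is a stand-alone mirror of the read-then-delete idea: the function read_delete() and the $store array are assumptions made so that it runs from the command line on a plain array instead of $_SESSION, and it is not the class itself.

```php
<?php
// Stand-alone mirror of the read-then-delete idea, working on any array
// passed by reference rather than on $_SESSION.
function read_delete(array &$store, $namespace, array $keys)
{
    $data = array();
    foreach ($keys as $key) {
        $full = "$namespace:$key";
        // Missing keys come back as null, so the result always has every key.
        $data[$key] = array_key_exists($full, $store) ? $store[$full] : null;
        // Forget the value so that it is only ever returned once.
        unset($store[$full]);
    }
    return $data;
}

// Pretend a failed login left one error message behind.
$store = array('login/errors:login_pass' => 'Invalid password');
$keys  = array('login_mail', 'login_pass');

$first  = read_delete($store, 'login/errors', $keys);
$second = read_delete($store, 'login/errors', $keys);
// $first['login_pass'] is 'Invalid password'; $second['login_pass'] is null.
```

The second call finds nothing left to read, which is exactly what keeps an error message from showing up again when the page is refreshed.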

The session only starts when this class is used, which means users only get a new session when they actually need one. This is a minor optimization for a system that requires users to be logged in :)

All of this being said, the controller for the login page ultimately looks like this:

<?php // controllers/login.php

  $values = SessionUtils::ReadDelete('login/values',
    array(
      'login_mail', 'signup_mail', 'signup_name'
    ));

  $errors = SessionUtils::ReadDelete('login/errors',
    array(
      'login_mail', 'login_pass', 'signup_mail',
      'signup_name', 'signup_pass', 'signup_pass2'
    ));

  LoginPageView::Render($values, $errors);
?>

Short and concise, since it’s mostly a display-only controller.
