SQL Generation, Part 1

As a complementary solution to generating PHP views is the generation of PHP models in a Model-View-Controller approach. In this example, the following model is used:

  • A user has a first name, a last name, and a password (stored as a hash).
  • A user may have at most one session, which contains an optional IP and an optional session token.
  • A user has a list of favorites (stored as URLs).

To represent these constraints, the Objective Caml code below is used:

let structure  =
  empty
  >> create_entity "user"
    [ "firstname", "VARCHAR(255) NOT NULL" ;
      "lastname", "VARCHAR(255) NOT NULL" ;
      "passhash", "VARCHAR(64) NOT NULL" ]
  >> add_extension "user" "session"
    [ "ip", "VARCHAR(16) NULL" ;
      "token", "VARCHAR(64) NULL" ]
  >> add_list_field "user" "favorites" "VARCHAR(255)"

Note: here, the >> operator is used to mean x >> f == f(x). An extension defines an entity as being tied to another entity with optional semantics: the user entity may have at most one session entity, and a given session always has exactly one associated user. A list field associates several elementary values of a certain type with a single entity in a classic one-to-many relationship. The above structure is automatically transformed by a compiler function into this SQL code:

CREATE TABLE `user` (
  `id` INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  `passhash` VARCHAR(64) NOT NULL,
  `lastname` VARCHAR(255) NOT NULL,
  `firstname` VARCHAR(255) NOT NULL
); 

CREATE TABLE `session` (
  `token` VARCHAR(64) NULL,
  `ip` VARCHAR(16) NULL,
  `user` INT NOT NULL UNIQUE REFERENCES `user`(`id`)
); 

CREATE TABLE `user_favorites` (
  `value` VARCHAR(255),
  `user` INT NOT NULL REFERENCES `user`(`id`)
);

Design and rationale

The underlying data structure used by the generator is divided into three main categories:

  • Entities are the core concept, they are uniquely identified by a name. In themselves, they carry no information, except for an identifier which can be used to distinguish two entities of the same type from one another (for instance, differentiating two users using their key). Entities can have properties and can be related to each other by bindings.
  • Properties add a basic piece of information (an integer, a string or something close) to an entity. All entities of a given type have that property.
  • Bindings tie together two types of entities. A binding incorporates the notion of arity, meaning that it may relate zero or one, exactly one, or many entities of a type with zero or one, exactly one, or many entities of another type. A binding may also be normalized : either it is made part of the first entity type (meaning the arity for the other type cannot be “many”), or it is made part of the second type (meaning the arity for the first type cannot be “man”), or it is turned into a standalone relationship which gets its own table.

This structure differs from the typical “table with columns” approach that is prevalent in SQL in particular and in relational algebra in general, though it is perfectly equivalent to the latter : while an entity-property-binding representation can be turned into a table-column representation by making every entity and binding a table, and sticking properties as columns of the entity tables, the reverse transform is possible:

  • A table is an entity.
  • The non-reference columns of a table are properties of that table’s entity.
  • The reference columns of a table are bindings of the appropriate arity.

For the purposes of generating code, however, the entity-property-binding method is superior because it separates properties from bindings : while a property affects only the entity to which it is associated, a binding affects two entities. For instance, the unique identifier for an entity is not generated if that entity is an extension (that is, it is bound to exactly one other entity of another type, and is not subject to any other bindings). Also, because of normalization being a configurable option, it’s possible to decide whether the binding is a relationship or a column.

The entire data structure is kept in a single object, which acts as an associative container where new entities, properties and bindings may be added. The implementation of simple add_entity, add_property, add_binding functions allow the creation of more elaborate data structure generators:

let create_entity e l s =
  s >> add_entity e
    >> add_fields e l

let add_extension e n l s =
  s >> add_entity n
    >> add_binding (`One e) (`Option n) e `B
    >> add_fields n l  

let add_list_field e n k s =
  let n = e ^ "_" ^ n in
  s >> add_entity n
    >> add_fields n ["value",k]
    >> add_binding (`One e) (`Many n) e `B

The `B argument means that the binding should be inserted into the right member of the relationship.

Implementation

A minimally documented implementation is available for download here : files/articles/sql-gen-1.ml

Design patterns

As with PHP views, Objective Caml can now serve as a meta-language, useful for describing processes that lead to the creation of data structures, and in particular of typical database design patterns. For instance, consider a simple rights management system, where a table maps object/user pairs with a set of rights (and a different such table exists for every kind of objects that can have rights). Adding rights management is then done as:

let add_rights users objects rights s =
  let id = objects ^ "_rights" in
  s >> create_entity id
    [ "rights" , "set(" ^ String.concat "," rights ^ ") NOT NULL" ]
    >> add_binding (`Many id) (`One users) "user" `A
    >> add_binding (`Many id) (`One objects) objects `A 

let structure =
  empty
  >> add_entity "user"
  >> add_entity "page"
  >> add_rights "user" "page" [ "read" ; "write" ]

Similarly, one could define a tree with the nodes referencing data from a given entity type:

let add_binary_tree_of data name s =
  s >> add_entity name
    >> add_binding (`Many name) (`One data) "data" `A
    >> add_binding (`One name) (`Option name) "left" `A
    >> add_binding (`One name) (`Option name) "right" `A 

let structure =
  empty
  >> add_entity "node"
  >> add_binary_tree_of "node" "tree"

Further work

Now that the underlying data structure itself can be generated automatically, the operations (either in a high-level language or directly as SQL stored procedures) appears as a logical next step.

0 Responses to “SQL Generation, Part 1”


  1. No Comments

Leave a Reply

Your email address will not be published. Required fields are marked *

*

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>



680 feed subscribers
(readers who polled a feed this week)