The premises
In the last installment of Functional Tuesdays, it was mentioned that Objective Caml could be used to write a PHP view that was verified by the generator program as creating valid XHTML. While many tasks will have to be programmed directly in Objective Caml to perform the validation, some of the more elementary parts of validation can be done directly in the type system of the language.
Omitting a lot of details (attributes, unwanted tags, self-closing tags, comments…), the grammar of XHTML looks like:
text : any piece of text with quoted entities p : <p>(text|span)*</p> span : <span>(text|span)*</span> div : <div>(p|span|div|ul|ol|text)*</div> li : <li>(p|span|div|ul|ol|text)*</li> ul : <ul>li*</ul> ol : <ol>li*</ol> body : <body>(p|div|ul|ol)*</body> html : <html>body</html>
It’s already possible to write one function for each of these non-terminals, and this has already been done. Formatting the things a little bit better:
module TagImpl = struct let tag n i = Printf.sprintf "<%s>%s</%s>" n (String.concat "" i) n let p i = tag "p" i and li i = tag "li" i and ul i = tag "ul" i and ol i = tag "ol" i and div i = tag "div" i and span i = tag "span" i and body i = tag "body" i and html i = tag "html" i and text i = i type 'a block = string let to_string s = s end
Module types
The main selling point of Objective Caml as a functional language is its excellent module and type system. In this case, the above module is type-unsafe (in the area of generating a valid XHTML document) because it manipulates all nodes as strings. This makes it possible to create a document such as <html><p>Hello!</p></html> (with html [p [text "Hello!"]]). To make this module type-safe, one can apply a module type, thereby restraining the signature of the functions in the module to have a specific type. This is the purpose of the two additional definitions of ‘a block (all functions will take as arguments lists of blocks and return blocks) and to_string (to return a string representing a block).
The type parameter of the block type serves to convey information about what the block is. Since Objective Caml supports variant types, it’s possible to use a variant type as that type parameter, with a distinct variant label representing each of the tags. For instance, a <p> tag will be represented by the label `P, and any functions which accept a paragraph tag as their internal XHTML code will take `P as a possible label.
The result is this module type:
module type TAG = sig type 'a block val text : string -> [>`TEXT] block val p : [<`TEXT|`SPAN] block list -> [>`P] block val li : [<`SPAN|`TEXT|`UL|`OL|`P |`DIV] block list -> [>`LI] block val ul : [<`LI] block list -> [>`UL] block val ol : [<`LI] block list -> [>`OL] block val div : [<`SPAN|`TEXT|`UL|`OL|`P |`DIV] block list -> [>`DIV] block val span : [<`TEXT|`SPAN] block list -> [>`SPAN] block val body : [<`DIV|`P|`UL|`LI] block list -> [>`BODY] block val html : [<`BODY] block list -> [>`HTML] block val to_string : 'a block -> string end
The type of the blocks accepted in the “inner XHTML” list of each tag is restricted to be one of the possible labels, while the returned block type is constrained to contain at least the type of block created by that tag. The returned block is not exactly the type of created block because this would prevent placing the returned blocks in lists of distinct blocks (such as a block containing both text and spans).
Error detection
The basic errors are detected by this type system:
module Tag = (TagImpl : TAG);; open Tag;; let _ = html [text "Bad!"];; (* ^^^^^^^^^^^ This expression has type [> `TEXT ] Tag.block but is here used with type [ `BODY ] Tag.block *) let _ = p [ ul [] ];; (* ^^^^^ This expression has type [> `UL ] Tag.block but is here used with type [< `SPAN | `TEXT ] Tag.block *) let _ = ul [ div [] ];; (* ^^^^^^ This expression has type [> `DIV ] Tag.block but is here used with type [ `LI ] Tag.block *)
Additionally, the errors are fairly clear: the current type of the block being insertedis shown first, followed by the type of blocks that are allowed in that position.
As an advanced feature, this also plays well with the Objective Caml type system, allowing the user to define functions which create subexpressions from other subexpressions and still having everything type-safe. For instance, a function which takes a list of blocks and turns it into an XHTML list with one block per <li></li>.
let decorate elts = ul (List.map (fun x -> li [x]) elts);; (* val decorate : [< `DIV | `P | `SPAN | `TEXT ] Tag.block list -> [> `UL ] Tag.block = <fun> *) let _ = decorate [text "Hello"; div [text "world!"]; body []];; (* ^^^^^^^ This expression has type [> `BODY ] Tag.block but is here used with type [< `DIV | `P | `SPAN | `TEXT > `DIV `TEXT ] Tag.block *)
The error message is slightly less clear here because of the > `DIV `TEXT section (indicating that the list already contains div tags and text sections). Another example would be a function which decides whether to wrap some text in a <p> or a <div>, the output of such a function cannot be used safely in a <p>:
let choose elts = if Random.int 2 = 1 then p elts else div elts;; (* val choose : [< `SPAN | `TEXT ] Tag.block list -> [> `DIV | `P ] Tag.block = <fun> *) let _ = choose [div []];; (* ^^^^^^ This expression has type [> `DIV ] Tag.block but is here used with type [< `SPAN | `TEXT ] Tag.block *) let _ = p [choose []];; (* ^^^^^^^^^ This expression has type [> `DIV | `P ] Tag.block but is here used with type [< `SPAN | `TEXT ] Tag.block *)
Even better, it can detect ahead of time that some functions cannot be used, for instance a function which decides whether to wrap in <p> (accepts only text or spans) or in <ul> (accepts only list elements):
let impossible elts = if Random.int 2 = 1 then ul elts else p elts;; (* ^^^^ This expression has type [ `LI ] Tag.block list but is here used with type [< `SPAN | `TEXT ] Tag.block list These two variant types have no intersection *)
Further work
Now that the XHTML tree itself is valid, it is interesting to consider inline PHP code as being typed blocks (based on the type they can output), which in turn leads to safe generation of a valid document witin the PHP code.
Hi. I'm Victor Nicollet,
Recent Comments