XSL to XHTML

Last month, I discussed using XSLT to generate XHTML from a fixed XML representation of a page, in this article. However, due to the fickle nature of computers, several practical problems arise when applying a perfectly good theory.

Verbatim inclusion and XHTML namespaces

A naive implementation of copying XHTML tags from the input XML (for instance, the content of a page as entered by a user) is usually incorrect. XSL does provide a recursive copy instruction, <xsl:copy-of select="xpath"/>, but this copy is so faithful that it also copies the namespace of the original nodes. If that namespace is not the XHTML namespace, the copied elements keep their own namespace in the output; for elements in no namespace, this shows up as xmlns="" attributes on the copied elements inside the XHTML document.

The namespace of any XHTML element that has to be copied as XHTML should be the xhtml namespace. This is usually done by defining the xhtml namespace in the root element of the original XML document, for instance with:

<document xmlns:xhtml="http://www.w3.org/1999/xhtml">

And then setting the correct namespace for every element that has to be copied:

<xhtml:div><xhtml:img src="/pic.jpg" alt="The Picture"/></xhtml:div>

This is understandably difficult to enforce, especially when the inner XHTML was input by the user. An alternative is to set a default namespace for a container of XHTML code. For instance:

<content xmlns="http://www.w3.org/1999/xhtml">
  <div><img src="/pic.jpg" alt="The Picture"/></div>
</content>

This allows wrapping the input XHTML without modifying it.

Of course, you still have to be careful to use valid XHTML as an input (otherwise, the XML parser will reject the document).
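
As a minimal sketch of how a stylesheet might consume that wrapper (the title element, the overall input shape and the x prefix are assumptions made for illustration): because of the default namespace declaration, the content element itself is now in the XHTML namespace, so an XSLT 1.0 stylesheet has to address it through a prefix bound to that namespace. An unprefixed content in a match or select expression would look for an element in no namespace and find nothing.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:x="http://www.w3.org/1999/xhtml"
    xmlns="http://www.w3.org/1999/xhtml"
    exclude-result-prefixes="x">

  <!-- The x prefix is only needed inside XPath expressions;
       exclude-result-prefixes keeps its declaration out of the output. -->
  <xsl:output method="xml" encoding="UTF-8"/>

  <!-- Assumed input shape:
       <document>
         <title>...</title>
         <content xmlns="http://www.w3.org/1999/xhtml">...</content>
       </document> -->
  <xsl:template match="/document">
    <html>
      <head>
        <title><xsl:value-of select="title"/></title>
      </head>
      <body>
        <!-- The wrapper is in the XHTML namespace, hence x:content.
             Its children are copied as-is, without the wrapper itself. -->
        <xsl:copy-of select="x:content/node()"/>
      </body>
    </html>
  </xsl:template>

</xsl:stylesheet>
```

Since the copied elements and the literal result elements share the same namespace, no xmlns="" should appear on the copied div.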

As a quick bonus snippet, here is a copy-everything helper, which performs a deep copy of every child (element or text) of the current node:

<xsl:template name="copy_everything">
  <xsl:copy-of select="*|text()"/>
</xsl:template>

Verbatim inclusion and JavaScript

The problem with JavaScript that is inlined in XHTML code (unlike, say, CSS) is that JavaScript uses some characters which must be escaped in XML documents, such as < or &. The solution, of course, is to place the JavaScript code in a <![CDATA[...]]> section to avoid the issue altogether. This has the benefit of working correctly with an XSL stylesheet, which will output the JavaScript code as-is.

Except that, since the XSL stylesheet is generating XHTML, it will escape the characters found in the JavaScript. No problem so far, since modern browsers have XML parsers which will turn &lt; and &amp; back into < and &. However, older browsers often can’t.
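
To make the escaping concrete, here is a small sketch. The inline-script element name and the sample script are made up for illustration. The CDATA section only exists in the syntax of the input file, not in the parsed tree, so the stylesheet sees plain text and the serializer escapes it again on the way out.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns="http://www.w3.org/1999/xhtml">

  <xsl:output method="xml" encoding="UTF-8"/>

  <!-- Hypothetical input:
       <inline-script><![CDATA[if (a < b && c) run();]]></inline-script> -->
  <xsl:template match="inline-script">
    <script type="text/javascript"><xsl:value-of select="."/></script>
  </xsl:template>

  <!-- The text inside the generated script element is serialized as:
       if (a &lt; b &amp;&amp; c) run(); -->
</xsl:stylesheet>
```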

The typical solution used for backwards compatibility is to use a final XHTML file that looks like this:

<script type="text/javascript">
  /* <![CDATA[*/
  { javascript code }
  /* ]]> */
</script>

The XML parser in modern browsers treats the JavaScript comment delimiters as plain text and consumes the CDATA markers, leaving /**/ { javascript code } /**/, which is valid JavaScript. Older browsers expect JavaScript inside the script tag, so they see the CDATA markers inside comments and ignore them.

The problem is that the XSLT processor parses the CDATA section when it reads the input document, so the markers never reach the output and the compatibility trick is lost (although the result should still work in browsers that have a good XML parser).

A partial solution comes from the ability in XSL to wrap the text content of an element in a CDATA segment. The input document can now contain verbatim script declarations, such as:

<verbatim-script>
  <![CDATA[alert('<hello>');]]>
</verbatim-script>

And the corresponding output will be:

<script type="text/javascript">
  /*<cdata><![CDATA[*/
  alert('<hello>');
  /*]]></cdata>*/
</script>

This has the same quote-and-comment structure and properties as the original solution. The implementation:

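<!-- Serialize the text content of every <cdata> output element as a CDATA section -->
<xsl:output cdata-section-elements="cdata"/>

<!-- The <cdata> wrapper opens inside one JavaScript comment and closes inside another -->
<xsl:template match="verbatim-script">
  <script type="text/javascript">
    /*<cdata>*/
    <xsl:value-of select="."/>
    /*</cdata>*/
  </script>
</xsl:template>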

The downside of this solution is that there is now an element within the script tag, which is invalid in XHTML and therefore causes validation failures. If you really need to inline a bunch of JavaScript in your file, though, this is probably the solution to use.
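
If the extra element is a problem, one alternative that is sometimes used is to write the CDATA markers as raw text with disable-output-escaping. This is a sketch of a different technique from the one above, not a drop-in replacement. Support for disable-output-escaping is optional in XSLT 1.0 and it only takes effect when the processor serializes the result itself. The script must also not contain the sequence ]]>.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns="http://www.w3.org/1999/xhtml">

  <xsl:output method="xml" encoding="UTF-8"/>

  <!-- Writes the CDATA markers and the script text unescaped, each marker
       wrapped in a JavaScript comment. No helper element is generated. -->
  <xsl:template match="verbatim-script">
    <script type="text/javascript">
      <xsl:text disable-output-escaping="yes">/*&lt;![CDATA[*/&#10;</xsl:text>
      <xsl:value-of select="." disable-output-escaping="yes"/>
      <xsl:text disable-output-escaping="yes">&#10;/*]]&gt;*/</xsl:text>
    </script>
  </xsl:template>

</xsl:stylesheet>
```

Whether this behaves as intended depends on the processor, so it is worth checking the serialized output before relying on it.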

An improved and perfectly valid solution is to simply move any JavaScript to external files, which are merely referenced by the page instead of being inlined. Depending on your dispatch architecture, it may well be possible to have the client fetch a script by querying the same URL with a certain GET parameter, which triggers the use of a distinct XSL stylesheet that only extracts the scripts from the XML file and outputs them as one large JavaScript file.
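
A rough sketch of such a script-only stylesheet follows. It assumes the scripts live in verbatim-script elements as in the earlier example. How the dispatcher selects this stylesheet is not shown, since that depends entirely on your setup.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- The text output method writes string values without any XML escaping -->
  <xsl:output method="text" encoding="UTF-8" media-type="text/javascript"/>

  <!-- Concatenate every script found in the document, one after another -->
  <xsl:template match="/">
    <xsl:for-each select="//verbatim-script">
      <xsl:value-of select="."/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

The page stylesheet would then emit a script element whose src attribute points back at the same URL with the extra parameter, for instance ?script=1, where the parameter name is made up for this example.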
