Monday, February 13, 2006

Escaping CDATA in CDATA

In a current project I needed to embed user-submitted text into an element in an XML document.

The obvious way to start would be to enclose the text in a <![CDATA[ ... ]]> section. However, because the text came from user input, I couldn't guarantee that it didn't contain the CDATA end delimiter ']]>'.

The W3C specification just says that you can't nest CDATA sections, but it doesn't say how to escape an existing ']]>' sequence so that it doesn't end the enclosing CDATA section early.

I eventually found a useful blog entry, CDATA-Section-Delimitosis, on CodeSnipers.com.

Basically, it says that if you want to include the text '<![CDATA[ ... ]]>' in a CDATA section, you can leave the opening '<![CDATA[' as is, but you need to split the closing ']]>' across two CDATA sections.

For example, the following block of text

.....
<![CDATA[ ... ]]>
....
should be placed in two CDATA sections like this

<![CDATA[
....
<![CDATA[ ... ]]]]><![CDATA[>
....
]]>
This splits the ']]>' delimiter into two parts, ']]' and '>', so that neither CDATA section contains the complete sequence.
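
The same trick is easy to automate. Here is a minimal sketch in Python, assuming the user text is already a string; the function name wrap_in_cdata and the <note> element are my own placeholders for illustration, not anything from the CodeSnipers entry. It replaces every ']]>' in the text with ']]]]><![CDATA[>' before wrapping, then parses the result to check that the original text comes back unchanged.

```python
import xml.etree.ElementTree as ET


def wrap_in_cdata(text: str) -> str:
    """Wrap text in a CDATA section, splitting every ']]>' across two sections.

    Hypothetical helper for illustration only.
    """
    # ']]>' becomes ']]' + ']]>' (close) + '<![CDATA[' (reopen) + '>'
    return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>"


user_text = "before <![CDATA[ ... ]]> after"
document = "<note>" + wrap_in_cdata(user_text) + "</note>"

# The parser joins the adjacent CDATA sections back into a single text value.
recovered = ET.fromstring(document).text
print(recovered == user_text)
```

Because the parser treats the two adjacent CDATA sections as one run of character data, whoever reads the document never sees the split.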
