If I try to parse an HTML document using JAXP/DOM and specify, by whatever means, an HTML 4 DTD (i.e. http://www.w3.org/TR/html4/strict.dtd), the parser throws a fatal error. Catching that, I can see the message "The declaration for the entity "ContentType" must end with '>'", reported for line 81 of the DTD.
Line 81 of the HTML 4 strict DTD:
<!ENTITY % ContentType "CDATA"
-- media type, as per [RFC2045]
-->
The Java org.w3c.dom API states that a 'Document' can encapsulate an HTML document, so why does the parser not allow the legacy-style comments used in this DTD? Looking at the XML spec, they seem to have been ruled out (at least, that appears to be the issue).
Does anyone know of a way this can be overcome?
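For reference, here is a minimal sketch that should reproduce the error without fetching the DTD over the network. It assumes the JDK's default JAXP parser. The class name `Html4DtdRepro`, the `parseAndReport` helper, and the sample document are made up for illustration, and the entity resolver hands the parser only the one declaration quoted above rather than the whole strict DTD.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

class Html4DtdRepro {
    // The declaration quoted above from the HTML 4 strict DTD,
    // with the "-- ... --" comment placed inside the declaration.
    static final String DTD_FRAGMENT =
        "<!ENTITY % ContentType \"CDATA\"\n"
        + "    -- media type, as per [RFC2045]\n"
        + "    -->\n";

    // Hypothetical sample document that points at the strict DTD.
    static final String DOC =
        "<!DOCTYPE html PUBLIC \"-//W3C//DTD HTML 4.01//EN\" "
        + "\"http://www.w3.org/TR/html4/strict.dtd\">\n"
        + "<html><head><title>t</title></head><body></body></html>";

    // Parses DOC and returns the fatal error message, or null if parsing succeeded.
    static String parseAndReport() {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // Serve the fragment locally instead of downloading the DTD (assumption:
            // the single declaration is enough to trigger the same error).
            builder.setEntityResolver(
                (publicId, systemId) -> new InputSource(new StringReader(DTD_FRAGMENT)));
            // DefaultHandler rethrows fatal errors and keeps them off stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(DOC)));
            return null;
        } catch (SAXParseException e) {
            return e.getMessage();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(parseAndReport());
    }
}
```

With this setup the parser reads the fragment as an XML DTD, so the fatal error is raised while scanning the `ContentType` declaration, before any of the HTML content is looked at.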

