RE: [xsl] retain/rebuild unparsed entities in internal subset

Subject: RE: [xsl] retain/rebuild unparsed entities in internal subset
From: "Michael Kay" <mike@xxxxxxxxxxxx>
Date: Tue, 22 Apr 2008 13:03:46 +0100
There's nothing in XSLT (or in Saxon) to make this information available.
You could try writing your own SAX filter between the XML parser and the
XSLT processor, which could make this information available by translating
it into elements in some special namespace, for example.
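
As a rough illustration of that idea, here is a minimal sketch of such a
filter in Java, using only the SAX and JAXP classes that ship with the JDK.
The class name DtdInfoFilter, the namespace urn:example:dtd-info, the "dtd"
prefix and the element and attribute names are all made up for this example;
nothing here is defined by XSLT or Saxon. The filter records the NOTATION and
unparsed ENTITY declarations reported through DTDHandler and replays them as
empty elements placed as the first children of the document element, where a
stylesheet could match them.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.XMLFilterImpl;

class DtdInfoFilter extends XMLFilterImpl {
    // Hypothetical namespace for the synthetic elements (an assumption).
    static final String NS = "urn:example:dtd-info";

    // Each entry: element local name, then attribute name/value pairs.
    private final List<String[]> decls = new ArrayList<>();
    private boolean emitted;

    DtdInfoFilter(XMLReader parent) { super(parent); }

    @Override
    public void startDocument() throws SAXException {
        decls.clear();      // DTD events arrive after startDocument
        emitted = false;
        super.startDocument();
    }

    @Override
    public void notationDecl(String name, String publicId, String systemId)
            throws SAXException {
        decls.add(new String[] {"notation",
            "name", name, "public", publicId, "system", systemId});
        super.notationDecl(name, publicId, systemId); // keep normal behaviour
    }

    @Override
    public void unparsedEntityDecl(String name, String publicId,
            String systemId, String notationName) throws SAXException {
        decls.add(new String[] {"unparsed-entity",
            "name", name, "public", publicId, "system", systemId,
            "notation", notationName});
        super.unparsedEntityDecl(name, publicId, systemId, notationName);
    }

    @Override
    public void startElement(String uri, String localName, String qName,
            Attributes atts) throws SAXException {
        super.startElement(uri, localName, qName, atts);
        if (emitted) return;
        emitted = true;     // only inject once, inside the document element
        for (String[] d : decls) {
            AttributesImpl a = new AttributesImpl();
            for (int i = 1; i + 1 < d.length; i += 2) {
                if (d[i + 1] != null) {   // identifiers may be absent
                    a.addAttribute("", d[i], d[i], "CDATA", d[i + 1]);
                }
            }
            super.startPrefixMapping("dtd", NS);
            super.startElement(NS, d[0], "dtd:" + d[0], a);
            super.endElement(NS, d[0], "dtd:" + d[0]);
            super.endPrefixMapping("dtd");
        }
    }

    // Demo helper: parse through the filter and serialize the result with
    // an identity transformation, so the injected elements become visible.
    static String toXml(String xml) throws Exception {
        SAXParserFactory spf = SAXParserFactory.newInstance();
        spf.setNamespaceAware(true);
        XMLReader filter = new DtdInfoFilter(spf.newSAXParser().getXMLReader());
        StringWriter out = new StringWriter();
        TransformerFactory.newInstance().newTransformer().transform(
            new SAXSource(filter, new InputSource(new StringReader(xml))),
            new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<!DOCTYPE root [\n"
            + "<!NOTATION cgm PUBLIC"
            + " \"-//USA-DOD//NOTATION Computer Graphics Metafile//EN\">\n"
            + "<!ENTITY image1 SYSTEM \"image1.cgm\" NDATA cgm>\n"
            + "]>\n<root><section><img src=\"image1\"/></section></root>";
        System.out.println(toXml(xml));
    }
}
```

To use it with a real transformation, the same SAXSource wrapping the filter
would be passed to the XSLT processor in place of a plain file source. The
stylesheet could then look up the dtd:unparsed-entity element whose name
matches an img/@src value and follow its notation attribute to the matching
dtd:notation element. Note that the parser may report system identifiers as
absolute URIs rather than as they were written in the declaration.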

Michael Kay
http://www.saxonica.com/ 

> -----Original Message-----
> From: Graswinckel_Ewout@xxxxxxx [mailto:Graswinckel_Ewout@xxxxxxx] 
> Sent: 22 April 2008 12:34
> To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> Subject: [xsl] retain/rebuild unparsed entities in internal subset
> 
> Hi,
> 
> I have a large XML document that I want to split up into 
> smaller documents using XSLT:
> 
> <!DOCTYPE root [
>   <!NOTATION cgm PUBLIC 
>     "-//USA-DOD//NOTATION Computer Graphics Metafile//EN"          >
>   <!NOTATION ccitt4 PUBLIC 
>     "-//USA-DOD//NOTATION CCITT Group4 Facsimile//EN"              >
>    
>   <!ENTITY image1 SYSTEM "image1.cgm" NDATA cgm>
>   <!ENTITY image2 SYSTEM "image2.tiff" NDATA ccitt4>
> ]>
> <root>
>   <section>
>     <img src="image1"/>
>   </section>
>   <section>
>     <img src="image2"/>
>   </section>
> </root>
> 
> Each <section> should become a separate document. The 
> splitting itself is no problem, but the issue I have is that 
> the internal subset is lost when doing an xsl transformation.
> 
> Ideally I'd like to rebuild the internal subset for each of 
> the generated documents with only the <!ENTITY> items that 
> are used in that section.
> 
> So far I've found that I can use an <xsl:text 
> disable-output-escaping="yes"> element with a CDATA section 
> inside to output the doctype declaration (or maybe I can use 
> the saxon:doctype extension, I haven't looked at that yet). 
> And I can get the location of the file using the 
> unparsed-entity-uri(..) XPath function. 
> 
> What I cannot access yet is the NDATA type (e.g. 'cgm') and 
> the public id associated with that type 
> ("-//USA-DOD//NOTATION Computer Graphics Metafile//EN").
> 
> Is there any way I can access the NDATA and NOTATION 
> declarations using xslt? I'm using xslt2 with saxon. 
> Preferably I'd like to use a standard way, but if that's not 
> possible something that only works on saxon will have to do.
> 
> Thanks,
> Ewout
