Subject: [xsl] Splitting an XML file based on size
From: Adam Van Den Hoven <Adam.Hoven@xxxxxxxxxxxx>
Date: Tue, 3 Apr 2001 15:50:04 -0700

Hey guys,

I'm processing an NITF file into HTML. NITF is very much like HTML in that it has a body with paragraph tags that have mixed content. The HTML that I am creating from my transforms can quickly become several tens of KB in size. Since I'm transferring this over a wireless modem to a PocketPC at a maximum of 14.4 kbps, an HTML file that is 15 KB is entirely too big.

I need some way to keep track of the number of characters I've processed and stop when I reach a specific size, stopping at the end of the paragraph. I understand that counting characters is not very precise, but I am only interested in getting the transfer size to be less than 2 KB or so.

As an example, I might have the following NITF code:

<nitf baselang="en.ca">
  <head><!-- Header Metadata here --></head>
  <body>
    <body.head><!-- Body head stuff here --></body.head>
    <body.content>
      <p> Lorem ipsum dolor sit amet, <em>consectetuer adipiscing elit, sed diem</em> nonummy nibh euismod tincidunt ut lacreet dolore magna aliguam erat volutpat. </p>
      <p> Lorem ipsum <q>dolor sit amet, consectetuer adipiscing elit,</q> sed diem nonummy nibh euismod tincidunt ut lacreet dolore magna aliguam erat volutpat. </p>
      <p> Lorem ipsum dolor sit amet, consectetuer adipiscing elit, sed diem <em>nonummy nibh euismod </em> tincidunt ut lacreet dolore magna aliguam erat volutpat. </p>
      <p> Lorem ipsum dolor sit amet, <em>consectetuer adipiscing elit, </em> sed diem nonummy nibh euismod tincidunt ut lacreet dolore magna aliguam erat volutpat. </p>
      <p> Lorem ipsum dolor sit amet, <q>consectetuer adipiscing elit,</q> sed diem nonummy nibh euismod tincidunt ut lacreet dolore magna aliguam erat volutpat. </p>
      <p>Lorem ipsum dolor sit amet, consectetuer adipiscing elit, sed diem nonummy nibh euismod tincidunt ut lacreet dolore magna aliguam erat volutpat. </p>
    </body.content>
    <body.end><!-- tagline here --></body.end>
  </body>
</nitf>

The text there happens to be nearly 500 characters. Let's say that my target size is 375 characters.

That should be "o" in "euismod" in the third <p> tag.

Normally I would create:

<html>
  <head><!-- Header Metadata here --></head>
  <body>
    <p> Lorem ipsum dolor sit amet, <em>consectetuer adipiscing elit, sed diem</em> nonummy nibh euismod tincidunt ut lacreet dolore magna aliguam erat volutpat. </p>
    <p> Lorem ipsum <q>dolor sit amet, consectetuer adipiscing elit,</q> sed diem nonummy nibh euismod tincidunt ut lacreet dolore magna aliguam erat volutpat. </p>
    <p> Lorem ipsum dolor sit amet, consectetuer adipiscing elit, sed diem <em>nonummy nibh euismod </em> tincidunt ut lacreet dolore magna aliguam erat volutpat. </p>
    <p> Lorem ipsum dolor sit amet, <em>consectetuer adipiscing elit, </em> sed diem nonummy nibh euismod tincidunt ut lacreet dolore magna aliguam erat volutpat. </p>
    <p> Lorem ipsum dolor sit amet, <q>consectetuer adipiscing elit,</q> sed diem nonummy nibh euismod tincidunt ut lacreet dolore magna aliguam erat volutpat. </p>
    <p>Lorem ipsum dolor sit amet, consectetuer adipiscing elit, sed diem nonummy nibh euismod tincidunt ut lacreet dolore magna aliguam erat volutpat. </p>
  </body>
</html>

but what I want to create is:

<html>
  <head><!-- Header Metadata here --></head>
  <body>
    <p> Lorem ipsum dolor sit amet, <em>consectetuer adipiscing elit, sed diem</em> nonummy nibh euismod tincidunt ut lacreet dolore magna aliguam erat volutpat. </p>
    <p> Lorem ipsum <q>dolor sit amet, consectetuer adipiscing elit,</q> sed diem nonummy nibh euismod tincidunt ut lacreet dolore magna aliguam erat volutpat. </p>
    <p> Lorem ipsum dolor sit amet, consectetuer adipiscing elit, sed diem <em>nonummy nibh euismod </em> tincidunt ut lacreet dolore magna aliguam erat volutpat. </p>
    <p><a href="someURL">View Entire story</a></p>
  </body>
</html>

I can't be so coarse as counting paragraphs since I might also have a table (essentially an HTML table) or lists or something. Some paragraphs will be as short as a single sentence, others will be much longer.

I also need to do some additional processing after I reach the end of the NITF text (but the size of those will be much more rigid and simply subtracted from the target filesize).

I had thought about doing something approximately like:

<xsl:template match="p" mode="block">
  <xsl:param name="cursize" select="0"/>
  <xsl:variable name="size" select="$cursize"/>
  <p>
    <xsl:apply-templates select="child::node()" mode="inline">
      <!-- +7 characters for the tags -->
      <xsl:with-param name="cursize" select="$size + 7"/>
    </xsl:apply-templates>
  </p>
  <xsl:if test="$size &lt;= 400">
    <xsl:apply-templates select="following-sibling::p[1]" mode="block">
      <xsl:with-param name="cursize" select="$size"/>
    </xsl:apply-templates>
  </xsl:if>
</xsl:template>

but clearly that isn't going to work. I also assume that making a global variable called $size wouldn't work either.

I am getting the feeling that this isn't strictly possible with XSL. I am using MSXML 3 so scripting might be a solution but I am loath to use it unless I have to.

Adam van den Hoven
Internet Application Developer
Blue Zone
tel. 604.685.4310
fax. 604.685.4391
Blue Zone makes you interactive.(tm)
http://www.bluezone.net/

XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
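
A possible XSLT 1.0 sketch of the sibling-recursion idea above, offered as one way it might be made to work rather than as a definitive answer. Each block adds the length of its own text to the running total passed in as cursize, and only then decides whether to move on to the next sibling, so output always stops at the end of a block. The limit parameter, the body/body.content path, the use of xsl:copy-of in place of real NITF-to-HTML templates, and the someURL link are assumptions for illustration. Only text characters are counted, not tags or attributes.

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>

  <!-- Assumed: target size in characters of text, overridable by the caller -->
  <xsl:param name="limit" select="375"/>

  <xsl:template match="/nitf">
    <html>
      <head/>
      <body>
        <!-- Start the chain with the first block-level child only -->
        <xsl:apply-templates select="body/body.content/*[1]" mode="block"/>
      </body>
    </html>
  </xsl:template>

  <!-- Any block (p, table, list, ...) is handled the same way -->
  <xsl:template match="*" mode="block">
    <xsl:param name="cursize" select="0"/>
    <!-- Running total including this block's own text -->
    <xsl:variable name="size"
                  select="$cursize + string-length(normalize-space(.))"/>

    <!-- Assumption: copy the block as-is; a real stylesheet would
         apply its NITF-to-HTML templates here instead -->
    <xsl:copy-of select="."/>

    <xsl:choose>
      <!-- Still under the limit: hand the total to the next sibling -->
      <xsl:when test="$size &lt; $limit">
        <xsl:apply-templates select="following-sibling::*[1]" mode="block">
          <xsl:with-param name="cursize" select="$size"/>
        </xsl:apply-templates>
      </xsl:when>
      <!-- Limit reached and content remains: link to the full story -->
      <xsl:when test="following-sibling::*">
        <p><a href="someURL">View Entire story</a></p>
      </xsl:when>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>
```

With the sample document and a limit of 375, each paragraph contributes roughly 144 characters, so this should emit the first three paragraphs followed by the link. The fixed-size trailing content mentioned in the post could be allowed for by passing a correspondingly smaller limit.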