Subject: [xsl] stylesheet vs egrep
From: Ahmad J Reeves <ahmad@xxxxxxxxxxxxxx>
Date: Fri, 25 Jan 2002 11:35:49 +0000
Hi there,

I have XML files that contain four types of tags (DIRECT, LOCAL, GLOBAL and ADMIN) in varying numbers, e.g.

<LOG>
  <DIRECT>
    <COMMUNICATION_TYPE> PAGETELL </COMMUNICATION_TYPE>
    <Invoc_serial> 27 </Invoc_serial>
    <Serial> 3087908 </Serial>
    <USAGE> TELL </USAGE>
    <MESSAGE_TYPE> EMOTE </MESSAGE_TYPE>
    <CHARACTER_ID> 44639 </CHARACTER_ID>
    <CHARACTER_STATUS> 3 </CHARACTER_STATUS>
    <LOCATION_ID> 45040 </LOCATION_ID>
    <TARGET_CHARACTER_ID> 23470 </TARGET_CHARACTER_ID>
    <TARGET_CHARACTER_STATUS> 6 </TARGET_CHARACTER_STATUS>
    <TARGET_CHARACTER_LOCATION_ID> 23222 </TARGET_CHARACTER_LOCATION_ID>
    <MESSAGE> hello </MESSAGE>
    <TIME> 'Mon, 26 Nov 2001 15:40:29 +0000' </TIME>
  </DIRECT>
  <LOCAL>
    <COMMUNICATION_TYPE> SAY </COMMUNICATION_TYPE>
    <Invoc_serial> 27 </Invoc_serial>
    <Serial> 3089121 </Serial>
    <CHARACTER_ID> 6477 </CHARACTER_ID>
    <CHARACTER_STATUS> 6 </CHARACTER_STATUS>
    <LOCATION_ID> 1002 </LOCATION_ID>
    <MESSAGE> uh huh </MESSAGE>
    <TIME> 'Mon, 26 Nov 2001 15:43:34 +0000' </TIME>
  </LOCAL>
  <GLOBAL>
    <COMMUNICATION_TYPE> YELL </COMMUNICATION_TYPE>
    <Invoc_serial> 27 </Invoc_serial>
    <Serial> 3106303 </Serial>
    <CHARACTER_ID> 47350 </CHARACTER_ID>
    <CHARACTER_STATUS> 4 </CHARACTER_STATUS>
    <LOCATION_ID> 32037 </LOCATION_ID>
    <MESSAGE> Well </MESSAGE>
    <TIME> 'Mon, 26 Nov 2001 16:28:25 +0000' </TIME>
  </GLOBAL>
  <ADMIN>
    <COMMUNICATION_TYPE> NAT </COMMUNICATION_TYPE>
    <Invoc_serial> 27 </Invoc_serial>
    <Serial> 3228413 </Serial>
    <MESSAGE_TYPE> EMOTE </MESSAGE_TYPE>
    <CHARACTER_ID> 9980 </CHARACTER_ID>
    <CHARACTER_STATUS> 3 </CHARACTER_STATUS>
    <LOCATION_ID> 293 </LOCATION_ID>
    <TARGET_CHARACTER_ID> 11457 </TARGET_CHARACTER_ID>
    <TARGET_CHARACTER_STATUS> 1 </TARGET_CHARACTER_STATUS>
    <TARGET_CHARACTER_LOCATION_ID> 23595 </TARGET_CHARACTER_LOCATION_ID>
    <MESSAGE> Yes </MESSAGE>
    <TIME> 'Mon, 26 Nov 2001 21:43:26 +0000' </TIME>
  </ADMIN>
</LOG>

I need to get a list of all the CHARACTER_ID values, then remove the duplicates and count them.
With the following stylesheet,

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="text"/>
  <xsl:variable name="NL" select="'&#10;'"/>

  <xsl:template match="LOG">
    <xsl:apply-templates select="DIRECT"/>
    <xsl:apply-templates select="LOCAL"/>
    <xsl:apply-templates select="GLOBAL"/>
    <xsl:apply-templates select="ADMIN"/>
  </xsl:template>

  <xsl:template match="DIRECT">
    <xsl:apply-templates select="CHARACTER_ID"/>
  </xsl:template>

  <xsl:template match="LOCAL">
    <xsl:apply-templates select="CHARACTER_ID"/>
  </xsl:template>

  <xsl:template match="GLOBAL">
    <xsl:apply-templates select="CHARACTER_ID"/>
  </xsl:template>

  <xsl:template match="ADMIN">
    <xsl:apply-templates select="CHARACTER_ID"/>
  </xsl:template>

  <xsl:template match="CHARACTER_ID">
    <xsl:value-of select="."/>
    <xsl:value-of select="$NL"/>
  </xsl:template>
</xsl:stylesheet>

I get 28,793 CHARACTER_ID values; when these are sorted and duplicates removed (using sort -u in Unix), 165 remain. To check this I use egrep as follows:

egrep "<CHARACTER_ID> [0-9]{3,6} </CHARACTER_ID>" 1.xml | wc -l

which also gives me 28,793. So far so good. But when I sort these and remove duplicates I get 254! There are numbers in the XSLT output that aren't even in the egrep output, even though the XSLT output has fewer numbers. Is it my stylesheet that's lying, or my egrep?

Thanks,
Ahmad.

XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
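One thing that may be worth ruling out when comparing the two counts: egrep prints whole matching lines, so running sort -u on its output de-duplicates lines rather than IDs. The sketch below is only an illustration under stated assumptions. The sample data is made up (it is not taken from 1.xml), and it assumes that the same ID can sit on lines that differ in indentation or neighbouring tags, which may or may not be true of the real file. It reduces each match to the bare number before de-duplicating, so the count refers to IDs alone.

```shell
#!/bin/sh
# Use byte-wise collation so sort -u behaves the same in every locale.
LC_ALL=C
export LC_ALL

# Hypothetical sample standing in for 1.xml (invented data, not the poster's).
sample=$(mktemp)
trap 'rm -f "$sample"' EXIT
cat > "$sample" <<'EOF'
<LOG>
<DIRECT> <CHARACTER_ID> 44639 </CHARACTER_ID> <TARGET_CHARACTER_ID> 23470 </TARGET_CHARACTER_ID> </DIRECT>
<LOCAL>
  <CHARACTER_ID> 6477 </CHARACTER_ID>
</LOCAL>
<GLOBAL>
<CHARACTER_ID> 6477 </CHARACTER_ID>
</GLOBAL>
<ADMIN> <CHARACTER_ID> 44639 </CHARACTER_ID> </ADMIN>
</LOG>
EOF

pattern='<CHARACTER_ID> [0-9]{3,6} </CHARACTER_ID>'

# De-duplicating whole lines: the same ID on two differently
# formatted lines is counted twice.
whole_lines=$(grep -E "$pattern" "$sample" | sort -u | wc -l | tr -d ' ')

# De-duplicating IDs: keep only the number from each matching line first.
ids_only=$(grep -E "$pattern" "$sample" |
  sed -E 's#.*<CHARACTER_ID> ([0-9]+) </CHARACTER_ID>.*#\1#' |
  sort -u | wc -l | tr -d ' ')

echo "distinct lines: $whole_lines"   # prints "distinct lines: 4"
echo "distinct IDs: $ids_only"        # prints "distinct IDs: 2"
```

The same care applies on the stylesheet side: xsl:value-of select="." copies the spaces around each number, so wrapping it as normalize-space(.) would make the two outputs directly comparable line by line.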