Subject: Re: [xsl] rendering marginal XML
From: Peter Flynn <peter@xxxxxxxxxxx>
Date: Fri, 02 Nov 2001 15:24:08 +0000
Jay Kline wrote:
> I run a list server that generates its logs in XML format.
> It appears to be valid XML,
(or "evilly-formed" as a colleague of mine refers to this type of stuff :-). "Clumbersome" is an excellent description. It looks as if it was designed by someone who had heard XML described, but had never seen any before.
> It uses this form:
>
> <msgSent>
>   <time>time sent</time>
>   <origin>me@xxxxxxxx</origin>
>   <r>you@xxxxxxxxx</r>
>   <recieved>time recieved</recieved>
>   <status>Any error messages, etc</status>
>   <r>you2@xxxxxxxxx</r>
>   <recieved>time recieved</recieved>
>   <status>Any error messages, etc</status>
>   (this repeats for each recipient)
> </msgSent>
> (this repeats for each message)
>
> The problem is the <recieved> and <status> tags refer to the
> imediately preceding <r> tag. I would like to generate a list
> from these logs that contains only email addresses that had errors.
The first thing I do with defective designs like this is rationalise the file so I can work with it. In this case it is simple to pass it through sgmlnorm (part of James Clark's SP) pretending it is SGML, so you can force the addition of a new element type to enclose r, recieved [do they really spell it like that?] and status.
where sgml-spec.dec is (in my test) the old DocBook SGML Declaration with GENERAL YES changed to GENERAL NO, and a trivial DTD:
<!ELEMENT msgSent - - (time,origin,trace+)>
<!ELEMENT trace O O (r,recieved,status?)>
<!ELEMENT (time,origin,r,recieved,status) - - (#PCDATA)>
(assuming status is optional and is only present where there has been an error). The result is
<msgSent>
  <time>time sent</time>
  <origin>me@xxxxxxxx</origin>
  <trace>
    <r>you@xxxxxxxxx</r>
    <recieved>time recieved</recieved>
    <status>Any error messages, etc</status>
  </trace>
  <trace>
    <r>you2@xxxxxxxxx</r>
    <recieved>time recieved</recieved>
    <status>Any error messages, etc</status>
  </trace>
</msgSent>
(indents courtesy of xxml.el). Now you can test in XPath for the presence or absence of "trace/status".
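As a rough illustration of that test, here is a minimal Python sketch using the standard library's ElementTree, whose limited XPath support includes the `trace[status]` predicate. The function name `failed_recipients`, the example.org addresses, and the second trace without a status are assumptions made for the example, following the guess above that status appears only when there was an error.

```python
import xml.etree.ElementTree as ET

# Sample in the normalised form shown above. The addresses are placeholders,
# and the second trace deliberately has no status (assumed to mean "no error").
NORMALISED = """\
<msgSent>
  <time>time sent</time>
  <origin>me@example.org</origin>
  <trace>
    <r>you@example.org</r>
    <recieved>time recieved</recieved>
    <status>Any error messages, etc</status>
  </trace>
  <trace>
    <r>you2@example.org</r>
    <recieved>time recieved</recieved>
  </trace>
</msgSent>
"""


def failed_recipients(xml_text):
    """Return the text of r for every trace that has a status child."""
    root = ET.fromstring(xml_text)
    # "trace[status]" selects trace elements with at least one status child.
    return [trace.findtext("r") for trace in root.findall(".//trace[status]")]


if __name__ == "__main__":
    for address in failed_recipients(NORMALISED):
        print(address)
```

In XSLT the same selection would be a match or select on `trace[status]`; the Python version is only a convenient way to try the expression out.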
That took about a minute to write and another minute to test. It makes a lot of assumptions about the non-use of declared or undeclared entities, other element types and constraints you may have omitted for brevity, etc. It might in your case be simpler just to run it through sed or some other editing process to do the same job. Some people will also consider it overkill: your call. But it's a fine example of an XML structure designed without forethought or foreknowledge: thanks for sharing it.
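For the "some other editing process" route, the regrouping could be sketched in Python instead of sgmlnorm or sed. This is only a sketch under assumptions: the log is well-formed XML, every r starts a new group, and only recieved and status belong to that group. The names `wrap_traces` and `RAW` and the example.org addresses are invented for the example.

```python
import xml.etree.ElementTree as ET

# Input in the original flat form; addresses are placeholders.
RAW = """\
<msgSent>
  <time>time sent</time>
  <origin>me@example.org</origin>
  <r>you@example.org</r>
  <recieved>time recieved</recieved>
  <status>Any error messages, etc</status>
  <r>you2@example.org</r>
  <recieved>time recieved</recieved>
</msgSent>
"""

GROUPED = ("r", "recieved", "status")


def wrap_traces(msg):
    """Build a new msgSent in which each r and its followers sit in a trace."""
    out = ET.Element(msg.tag, msg.attrib)
    current = None  # the trace being filled, if any
    for child in msg:
        if child.tag == "r":
            # Each r opens a new trace group.
            current = ET.SubElement(out, "trace")
        if child.tag in GROUPED and current is not None:
            current.append(child)
        else:
            # time, origin and anything unexpected stay at the top level.
            out.append(child)
    return out


if __name__ == "__main__":
    grouped = wrap_traces(ET.fromstring(RAW))
    print(ET.tostring(grouped, encoding="unicode"))
```

Unlike the DTD approach, this does no validation at all, so stray element types pass through untouched rather than being reported.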