Subject: [xsl] Complex splitting of XML tag to multiple other XML tags using XSLT
From: Lars Eskildsen <laes@xxxxxxxxx>
Date: Sun, 20 Oct 2002 15:43:37 +0200
Hello XSLT experts!!
We receive XML files from one of our customers and then 
transform them into our own XML format using 
XSLT 1.0 (and Xalan 1.3), but we have a specific problem:
----------
We have the following DTD snippet (for the customer XML):
<!ELEMENT ADLIST (head, lines)>
<!ELEMENT head (#PCDATA)>
<!ELEMENT lines (TeleLine | InetLine)+>
<!ELEMENT TeleLine ( text1?, text2? )>
<!ELEMENT InetLine (#PCDATA)>
<!ELEMENT text1 (#PCDATA)>
<!ELEMENT text2 (#PCDATA)>
In general we want to use XSLT to convert ONE <ADLIST> tag
to ONE <AD> tag, where our own DTD for the <AD> tag is 
the following:
<!ELEMENT AD (head?, lines)>
<!ATTLIST AD SEQ (U|S|M|E) #REQUIRED>
<!ELEMENT head (#PCDATA)>
<!ELEMENT lines (TeleLine | InetLine)+>
<!ELEMENT TeleLine ( text1?, text2? )>
<!ELEMENT InetLine (#PCDATA)>
<!ELEMENT text1 (#PCDATA)>
<!ELEMENT text2 (#PCDATA)>
In doing the one-to-one conversion, we set the SEQ attribute 
to the value 'U' (undefined). 
The one-to-one conversion is NOT a problem!
----------
In certain circumstances we want to convert an <ADLIST> tag 
to several <AD> tags, using the SEQ attribute to reflect 
the sequence of the <AD> tags in relation to the 
original <ADLIST>.
The values of this attribute mean 'S' for Start, 
'M' for Middle and 'E' for End.
The rules for splitting the original <ADLIST> tag into 
several <AD> tags are as follows:
1) The <ADLIST> tag must contain:
    a) more than one <TeleLine> tag and at least one 
       <InetLine> tag or
    b) more than one <InetLine> tag and at least one 
       <TeleLine> tag
2) The <ADLIST> tag MUST contain a <TeleLine> tag that 
   contains a <text1> tag and is NOT the first <TeleLine> 
   tag.
3) The <ADLIST> tag must be split at <TeleLine> tags that 
   contain a <text1> tag.
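To make rules 1 and 2 concrete, here is a minimal sketch in Python (not XSLT) that only decides whether an <ADLIST> qualifies for splitting. The function name `should_split` is made up for illustration, and the element names are taken from the customer DTD above.

```python
import xml.etree.ElementTree as ET


def should_split(adlist):
    """Return True when rules 1 and 2 both hold for this <ADLIST> element."""
    lines = adlist.find("lines")
    if lines is None:
        return False
    tele = lines.findall("TeleLine")
    inet = lines.findall("InetLine")
    # Rule 1: more than one line of one kind and at least one of the other.
    rule1 = (len(tele) > 1 and len(inet) >= 1) or (len(inet) > 1 and len(tele) >= 1)
    # Rule 2: some <TeleLine> other than the first one has a <text1> child.
    rule2 = any(t.find("text1") is not None for t in tele[1:])
    return rule1 and rule2
```

Rule 3 says where to cut rather than whether to cut, so it is not part of this check.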
When doing the split, we have to obey the following:
i)   The first <AD> tag contains at LEAST one <TeleLine>
     and at LEAST one <InetLine>, NOT more than one of both.
     Furthermore only the first <AD> tag contains the 
     <head> tag from the original XML and this <AD> tag 
     should have the SEQ attribute set to 'S'.
ii)  The last <AD> tag contains the LAST <TeleLine> tag with 
     a <text1> tag (and any <InetLine> and/or <TeleLine> tags 
     with NO <text1> tag that follow it).
     The last <AD> tag should have the SEQ attribute set to 'E'.
iii) Middle <AD> tags (between the first and the last) should 
     be generated for each <TeleLine> tag that contains a 
     <text1> tag and is NOT the LAST such tag.
     These <AD> tags should have the SEQ attribute set to 'M'.
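The SEQ and <head> parts of rules i) to iii) can be written down separately from the grouping itself. The following is a small Python sketch (not XSLT). The name `ad_plan` is hypothetical, and the single-<AD> case is assumed to fall back to the one-to-one conversion with SEQ 'U'.

```python
def ad_plan(group_count):
    """Return one (SEQ value, keeps_head) pair per generated <AD> tag."""
    if group_count < 1:
        raise ValueError("an <ADLIST> always yields at least one <AD>")
    if group_count == 1:
        # Assumption: no split happened, so this is the one-to-one case.
        return [("U", True)]
    middle = [("M", False)] * (group_count - 2)
    # Only the first <AD> keeps the <head> of the original <ADLIST>.
    return [("S", True)] + middle + [("E", False)]
```

For example, `ad_plan(4)` gives one 'S', two 'M' and one 'E' entry, with only the first entry keeping the head.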
----------
Sometimes (maybe always) an example says more than 
1000 words of specification, so here's an example:
<ADLIST>
  <head>Head Text</head>
  <lines>
    <TeleLine>
       <text2>TTT1</text2>
    </TeleLine>
    <TeleLine>
       <text1>TTT2</text1>
    </TeleLine>
    <InetLine>III1</InetLine>
    <InetLine>III2</InetLine>
    <TeleLine>
       <text2>TTT3</text2>
    </TeleLine>
    <TeleLine>
       <text1>TTT4</text1>
    </TeleLine>
    <InetLine>III3</InetLine>
    <TeleLine>
      <text1>TTT5</text1>
    </TeleLine>
    <InetLine>III4</InetLine>
    <TeleLine>
      <text1>TTT6</text1>
    </TeleLine>
    <TeleLine>
      <text2>TTT7</text2>
    </TeleLine>
  </lines>
</ADLIST>
Should be converted to the following sequence of <AD> tags:
<AD SEQ="S">
  <head>Head Text</head>
  <lines>
    <TeleLine>
       <text2>TTT1</text2>
    </TeleLine>
    <TeleLine>
       <text1>TTT2</text1>
    </TeleLine>
    <InetLine>III1</InetLine>
  </lines>
</AD>
<AD SEQ="M">
  <lines>
    <InetLine>III2</InetLine>
    <TeleLine>
       <text2>TTT3</text2>
    </TeleLine>
    <TeleLine>
       <text1>TTT4</text1>
    </TeleLine>
  </lines>
</AD>
<AD SEQ="M">
  <lines>   
    <InetLine>III3</InetLine>
    <TeleLine>
      <text1>TTT5</text1>
    </TeleLine>
  </lines>
</AD>
<AD SEQ="E">
  <lines>  
    <InetLine>III4</InetLine>
    <TeleLine>
      <text1>TTT6</text1>
    </TeleLine>
    <TeleLine>
      <text2>TTT7</text2>
    </TeleLine>
  </lines>
</AD>
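As a cross-check of the example, here is a reference sketch of the grouping in Python (not XSLT, so it does not answer the question itself). The names `group_lines`, `split_adlist`, `summarize` and `SAMPLE` are made up for illustration. One assumption is read off the example rather than the rules: when the first <AD> has no <InetLine> yet, it also takes the <InetLine> that directly follows its closing <TeleLine>. Inputs with fewer than two split points come out as a single <AD> with SEQ 'U', which is also an assumption.

```python
import xml.etree.ElementTree as ET


def has_text1(node):
    return node.tag == "TeleLine" and node.find("text1") is not None


def group_lines(children):
    """Split the children of <lines> into one list per output <AD>."""
    groups, current = [], []
    for node in children:
        current.append(node)
        if has_text1(node):  # rule 3: a group closes at such a <TeleLine>
            groups.append(current)
            current = []
    if current:
        if groups:
            groups[-1].extend(current)  # rule ii: trailing lines join the last <AD>
        else:
            groups.append(current)  # no split point at all
    # Assumption from the example: give the first <AD> the next <InetLine>.
    if len(groups) > 1 and not any(n.tag == "InetLine" for n in groups[0]):
        if groups[1][0].tag == "InetLine":
            groups[0].append(groups[1].pop(0))
    return groups


def split_adlist(adlist):
    """Build the list of <AD> elements for one <ADLIST> element."""
    head = adlist.find("head")
    groups = group_lines(list(adlist.find("lines")))
    ads = []
    for i, group in enumerate(groups):
        if len(groups) == 1:
            seq = "U"
        elif i == 0:
            seq = "S"
        elif i == len(groups) - 1:
            seq = "E"
        else:
            seq = "M"
        ad = ET.Element("AD", SEQ=seq)
        if i == 0 and head is not None:
            ad.append(head)  # only the first <AD> keeps the <head>
        ET.SubElement(ad, "lines").extend(group)
        ads.append(ad)
    return ads


SAMPLE = """<ADLIST><head>Head Text</head><lines>
  <TeleLine><text2>TTT1</text2></TeleLine>
  <TeleLine><text1>TTT2</text1></TeleLine>
  <InetLine>III1</InetLine>
  <InetLine>III2</InetLine>
  <TeleLine><text2>TTT3</text2></TeleLine>
  <TeleLine><text1>TTT4</text1></TeleLine>
  <InetLine>III3</InetLine>
  <TeleLine><text1>TTT5</text1></TeleLine>
  <InetLine>III4</InetLine>
  <TeleLine><text1>TTT6</text1></TeleLine>
  <TeleLine><text2>TTT7</text2></TeleLine>
</lines></ADLIST>"""


def summarize(ad):
    """SEQ value plus the text of every line, for easy comparison."""
    return ad.get("SEQ"), ["".join(n.itertext()).strip() for n in ad.find("lines")]


if __name__ == "__main__":
    for ad in split_adlist(ET.fromstring(SAMPLE)):
        print(summarize(ad))
```

Run on the sample input, this yields the same four groups as the <AD> tags listed above, in the same order. An XSLT 1.0 solution would need to express the same grouping declaratively.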
-------
I suppose the solution requires some elaborate use 
of the <xsl:key> tag, but I just can't seem to figure 
it out (believe me, I have tried)!
If anyone out there can help, I would REALLY appreciate 
it (and even buy that someone some excellent Danish beer, 
if he or she should ever visit Aarhus in Denmark)!
/Lars
** Stibo Graphic          | Søren Nymarks Vej 21 | DK-8270 Højbjerg 
** mailto:laes@xxxxxxxxx  | http://www.stibographic.com 
** Phone:  +45 8939 8939  | Fax:    +45 8939 8940
** Direct: +45 8939 7421
 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list