Subject: RE: [xsl] (text processing) lexical context
From: "Michael Kay" <michael.h.kay@xxxxxxxxxxxx>
Date: Wed, 24 Apr 2002 09:31:32 +0100

One other piece of advice (somewhat heretical for this list): XSLT is not the only tool in your kitbag. In fact, where you want to identify structure in the source that's not explicit in the markup, XSLT is often not the best tool for the job. You could probably tackle this one more easily by writing a SAX filter that inserts a <sentence> start tag immediately after <root>, a </sentence> end tag immediately before </root>, and a </sentence><sentence> pair immediately after a "." that's followed by whitespace.

Michael Kay
Software AG
home: Michael.H.Kay@xxxxxxxxxxxx
work: Michael.Kay@xxxxxxxxxxxxxx

> -----Original Message-----
> From: owner-xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> [mailto:owner-xsl-list@xxxxxxxxxxxxxxxxxxxxxx]On Behalf Of cutlass
> Sent: 24 April 2002 09:04
> To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> Subject: Re: [xsl] (text processing) lexical context
>
> Hello Nicolas,
>
> ----- Original Message -----
> From: "Nicolas Mazziotta" <Nicolas.Mazziotta@xxxxxxxxx>
>
> > <root>
> > This is the <w>first</w> <i>sentence</i>. This is the <w>second</w>
> > <i>sentence</i>. This is the <w>third</w> <i>sentence</i>.
> > </root>
>
> this particular form of markup keeps cropping up over and over again, and i
> suspect that most people will tell you that it is not so good. The main
> problem with this type of markup is that it tends to be rather open
> ended....eg. there could be a variety of elements, nesting structures,
> etc....
>
> > <html>
> > <ol>
> > <li>first: This is the <b>first</b> <i>sentence</i>.
> > <li>Second: This is the <w>second</b> <i>sentence</i>.
> > <li>Third: This is the <b>third</b> <i>sentence</i>.
> > </ol>
> > </html>
>
> i am assuming u made an error with the opening <w> in second sentance ?
>
> right so you want to
>
> a) tokenize each sentance
> b) number with words ( i.e. First, Second, Third )
> c) copy all children elements within a sentance across
> d) replace elements with other elements
>
> there are a few approaches;
>
> - you are doing too much in one transform, yes it is possible to have one
> large complicated transform, but why not break up into small steps so u can
> conceptualise
>
> - u can either tokenise each sentance by customising the string tokenise
> function ( many places, one of them being www.exslt.org ) and tokenise each
> sentance ( based upon finding a period )
>
> - or i suspect this is a rather good use of Dimitre Novatchev's functional
> library at www.topxml.com
>
> both results will require a little investment in learning,
>
> the other stuff, like copying or replacing elements, numbering with words
> will come after you get over the first step.
>
> gl, jim fuller
>
> > But I can't figure out how I can select the text surrounding the <w>
> > element without using <xsl:value-of.../>, which does not allow me to
> > process the following <i> element...
> >
> > i.e., I get
> >
> > <html>
> > <ol>
> > <li>first: This is the <b>first</b> sentence.
> > <li>Second: This is the <w>second</b> sentence.
> > <li>Third: This is the <b>third</b> sentence.
> > </ol>
> > </html>
> >
> > and the <i> element is lost...
> >
> > And I can't do <xsl template match="substring(...)"> because substring
> > is not a DOM node.
> >
> > Help: is there a way to process substrings or stg?
> >
> > N. Mazziotta

XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
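
The SAX filter described at the top of this message could be sketched roughly as follows. This is only an illustration, not code from the thread: the message names no implementation language, so Python's standard `xml.sax` module is assumed here, and the names `SentenceFilter`, `split_sentences` and the `container` parameter are hypothetical. The sketch also assumes that only text sitting directly inside <root> should be split, since inserting tags in the middle of a <w> or <i> would break the nesting.

```python
import io
import re
import xml.sax
from xml.sax.saxutils import XMLFilterBase, XMLGenerator
from xml.sax.xmlreader import AttributesImpl

# Sentence boundary as described in the message: a "." followed by whitespace.
BOUNDARY = re.compile(r"\.(?=\s)")


class SentenceFilter(XMLFilterBase):
    """Hypothetical SAX filter that wraps a container's content in <sentence> tags."""

    def __init__(self, parent=None, container="root"):
        super().__init__(parent)
        self.container = container  # element whose content is split
        self.depth = 0              # 0 = outside container, 1 = directly inside it
        self.buffer = []            # text chunks not yet passed downstream

    def _flush(self):
        # SAX may deliver one text node in several chunks, so join them first.
        text = "".join(self.buffer)
        self.buffer = []
        if self.depth != 1:
            # Assumption: text nested deeper (or outside) is passed through as-is.
            if text:
                super().characters(text)
            return
        start = 0
        for match in BOUNDARY.finditer(text):
            # Emit text up to and including the ".", then close and reopen.
            super().characters(text[start:match.end()])
            super().endElement("sentence")
            super().startElement("sentence", AttributesImpl({}))
            start = match.end()
        if text[start:]:
            super().characters(text[start:])

    def characters(self, content):
        self.buffer.append(content)

    def startElement(self, name, attrs):
        self._flush()
        super().startElement(name, attrs)
        if self.depth == 0 and name == self.container:
            # <sentence> start tag immediately after <root>.
            self.depth = 1
            super().startElement("sentence", AttributesImpl({}))
        elif self.depth > 0:
            self.depth += 1

    def endElement(self, name):
        self._flush()
        if self.depth == 1 and name == self.container:
            # </sentence> end tag immediately before </root>.
            super().endElement("sentence")
            self.depth = 0
        elif self.depth > 0:
            self.depth -= 1
        super().endElement(name)


def split_sentences(xml_text, container="root"):
    """Run the filter over a string and return the serialized result."""
    out = io.StringIO()
    sax_filter = SentenceFilter(xml.sax.make_parser(), container)
    sax_filter.setContentHandler(XMLGenerator(out, encoding="utf-8"))
    sax_filter.parse(io.BytesIO(xml_text.encode("utf-8")))
    return out.getvalue()


if __name__ == "__main__":
    print(split_sentences("<root>One. Two <w>x. y</w>. Three</root>"))
```

Because the rule is applied literally, a final "." followed only by a newline before </root> still opens a new <sentence>, which then holds nothing but whitespace; a later step (an XSLT transform, for instance) would have to drop or ignore such empty sentences.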