Subject: RE: [xsl] segmenting a paragraph
From: Wendell Piez <wapiez@xxxxxxxxxxxxxxxx>
Date: Tue, 02 Oct 2007 10:18:28 -0400
When you need to apply regex matching to text that crosses node boundaries, two approaches have been proposed in the past:
(a) create a string in which the node boundaries are represented by some recognizable textual markup (you could use saxon:serialize()), then apply the regex processing, then reinstate the node structure (e.g. by using saxon:parse()).
(b) make a deep copy, processing each text node along the way to replace the significant features (such as end of sentence) with marker elements (e.g. an <end-of-sentence/> element). Then apply positional grouping techniques to transform the result into your target structure.
Neither is particularly easy, I'm afraid.
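For what it's worth, approach (b) might be sketched in XSLT 2.0 roughly as follows. This is only a minimal sketch under several assumptions that are not from the thread: paragraphs are <p> elements, each sentence is wrapped in a hypothetical <s> element, a sentence ends at a run of ".", "!" or "?" followed by whitespace or the end of the text node, and sentence ends occur only in text nodes that are direct children of the paragraph. Boundaries inside inline elements are left unmarked, and the naive regex will also split after abbreviations such as "e.g.".

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Default mode: copy everything that is not a paragraph unchanged. -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Pass 1 (mode "mark"): deep copy of the paragraph content. -->
  <xsl:template match="@* | node()" mode="mark">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()" mode="mark"/>
    </xsl:copy>
  </xsl:template>

  <!-- Pass 1: in text nodes directly under p (an assumed element name),
       emit an end-of-sentence marker after each sentence-final match. -->
  <xsl:template match="p/text()" mode="mark">
    <xsl:analyze-string select="." regex="[.!?]+(\s+|$)">
      <xsl:matching-substring>
        <xsl:value-of select="."/>
        <end-of-sentence/>
      </xsl:matching-substring>
      <xsl:non-matching-substring>
        <xsl:value-of select="."/>
      </xsl:non-matching-substring>
    </xsl:analyze-string>
  </xsl:template>

  <!-- Pass 2: positional grouping. Each group ends at a marker and is
       wrapped in s (a hypothetical target element); the marker is dropped. -->
  <xsl:template match="p">
    <xsl:variable name="marked">
      <xsl:apply-templates select="node()" mode="mark"/>
    </xsl:variable>
    <p>
      <xsl:for-each-group select="$marked/node()"
                          group-ending-with="end-of-sentence">
        <s>
          <xsl:copy-of select="current-group()[not(self::end-of-sentence)]"/>
        </s>
      </xsl:for-each-group>
    </p>
  </xsl:template>

</xsl:stylesheet>
```

Given an input such as <p>One <b>bold</b> word. Two.</p>, this should produce something like <p><s>One <b>bold</b> word. </s><s>Two.</s></p>: the <b> element stays intact inside the first sentence because the grouping works on the paragraph's child nodes rather than on a flattened string. Any trailing text without sentence-final punctuation ends up in a last <s> of its own.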
Cheers, Wendell
======================================================================
Wendell Piez                            mailto:wapiez@xxxxxxxxxxxxxxxx
Mulberry Technologies, Inc.                http://www.mulberrytech.com
17 West Jefferson Street                    Direct Phone: 301/315-9635
Suite 207                                          Phone: 301/315-9631
Rockville, MD 20850                                  Fax: 301/315-8285
----------------------------------------------------------------------
  Mulberry Technologies: A Consultancy Specializing in SGML and XML
======================================================================