Subject: [xsl] Second try: Search and replace many strings that may not be present in target
From: Zack Brown <zbrown@xxxxxxxxxxxxxxx>
Date: Mon, 20 May 2002 06:16:09 -0700

Hi folks,

I thought I'd try sending this out again before using one of my molasses solutions.

----- Forwarded message from Zack Brown <zbrown@xxxxxxxxxxxxxxx> -----

To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
Reply-To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
Subject: [xsl] Search and replace many strings that may not be present in target
From: Zack Brown <zbrown@xxxxxxxxxxxxxxx>
Date: Fri, 17 May 2002 18:43:02 -0700

Hi folks,

I'm trying to reproduce a feature using XSLT that I had working when I used my deeply broken home-grown XML parser. I'm moving to 'xsltproc', and GNU Make, which has so far shown itself equal to all challenges (thanks to some help ;-).

Situation:

I have a number of files that each contain a root element <kc>, with a number of <section> elements. Within each <section> element there may be a number of <quote who="firstname lastname">text</quote> elements. Several instances of the raw text "firstname lastname" may also appear in the raw text of each <section> tag. A "firstname lastname" text is only significant to this feature if it has also appeared identically in a <quote>'s "who" attribute in at least one of the files under consideration.

Problem:

Here is the feature: for each <section> tag in each file, I would like to do a search and replace on the first occurrence of each "firstname lastname" appearing in raw text.

Example:

Assume that the <quote>'s "who" attributes in the various files have named "Tom Jones", "Terry Haywood", and "Isaac Asimov". And assume the following <section> tag in one of the files:

------sample input------
<section>
  <p>this is a section containing a name, George Eliot, that has not appeared in a <quote> tag. Therefore it will not be acted on by this feature.</p>
  <p>This paragraph contains a <quote> tag naming Isaac Asimov, thus: <quote who="Isaac Asimov">And here he is saying something. Hi Mom!</quote></p>
  <p>this paragraph contains a reference to Terry Haywood, who appears in a <quote> tag in a different file. Here is another reference to Isaac Asimov, but it should not be matched, because only the first occurrence of a given name in a section should be matched.</p>
</section>
------------------------

In the above sample, only Isaac Asimov and Terry Haywood should be identified. Tom Jones does not appear in the sample, so the search-and-replace will not find him. Also, George Eliot appears in the sample, but is not in the list of names that have appeared in <quote> tags in one of the files, so she will also not be found by the search and replace.

Assuming that the search and replace will insert a link to another page corresponding to the name, then the output from the sample input would look like this:

---- sample output -----
<section>
  <p>this is a section containing a name, George Eliot, that has not appeared in a <quote> tag. Therefore it will not be acted on by this feature.</p>
  <p>This paragraph contains a <quote> tag naming Isaac Asimov [<a href="people/Isaac_Asimov.html">*</a>], thus: <quote who="Isaac Asimov">And here he is saying something. Hi Mom!</quote></p>
  <p>this paragraph contains a reference to Terry Haywood [<a href="people/Terry_Haywood.html">*</a>], who appears in a <quote> tag in a different file. Here is another reference to Isaac Asimov, but it should not be matched, because only the first occurrence of a given name in a section should be matched.</p>
</section>
------------------------

Partial solution:

The assumption I've been making is that I will do a first pass through all files to create metafiles, containing lists of all names appearing in <quote> tags in all files. Then these files will be concatenated into a single XML file.

I will then do a second pass, in which I process all files for HTML output. The XSLT will also use document() to read in the large file just created. That will theoretically give it all the data it needs to do the search and replace.

At that point my ideas break down. I can think of some very slow solutions, but nothing that would be feasible for a situation in which there are hundreds of files and thousands of names and a pentium III processor.

Thanks a lot for any help.

Zack

--
Zack Brown

XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list

----- End forwarded message -----

--
Zack Brown
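
For illustration only, here is a rough sketch of the two-pass idea described above. It is written in Python rather than XSLT, so it is a stand-in for the logic and not the xsltproc solution being asked for. It assumes well-formed input (the sample text is shortened here so that it parses), and the helper names (collect_names, build_pattern, split_text, add, link_element) are made up for this sketch. Names from the other files are simply added by hand in place of a real first pass over every file.

```python
import re
import xml.etree.ElementTree as ET


def collect_names(roots):
    """Pass 1: every distinct who="..." value found on a <quote> element."""
    return {q.get("who") for root in roots
            for q in root.iter("quote") if q.get("who")}


def build_pattern(names):
    """One alternation over all names, longest first so longer names win."""
    if not names:
        return re.compile(r"(?!)")  # a pattern that never matches
    ordered = sorted(names, key=len, reverse=True)
    return re.compile("|".join(re.escape(n) for n in ordered))


def split_text(text, pattern, seen):
    """Cut text after each name not yet seen in this section, adding a link."""
    text = text or ""
    pieces, start = [], 0
    for m in pattern.finditer(text):
        name = m.group(0)
        if name in seen:
            continue  # only the first occurrence per section is linked
        seen.add(name)
        href = "people/%s.html" % name.replace(" ", "_")
        link = ET.Element("a", {"href": href})
        link.text = "*"
        pieces += [text[start:m.end()] + " [", link, "]"]
        start = m.end()
    pieces.append(text[start:])
    return pieces


def add(parent, prev, pieces):
    """Append strings and elements to parent after prev (None = at the start)."""
    for piece in pieces:
        if isinstance(piece, str):
            if prev is None:
                parent.text = (parent.text or "") + piece
            else:
                prev.tail = (prev.tail or "") + piece
        else:
            parent.append(piece)
            prev = piece
    return prev


def link_element(elem, pattern, seen):
    """Pass 2: rebuild elem's content in document order, inserting links."""
    children, text = list(elem), elem.text
    for child in children:
        elem.remove(child)
    elem.text = ""
    add(elem, None, split_text(text, pattern, seen))
    for child in children:
        tail, child.tail = child.tail, ""
        link_element(child, pattern, seen)  # child content precedes its tail
        elem.append(child)
        add(elem, child, split_text(tail, pattern, seen))


SAMPLE = """<kc><section>
<p>this is a section containing a name, George Eliot, that is never quoted.</p>
<p>This paragraph names Isaac Asimov, thus: <quote who="Isaac Asimov">Hi Mom!</quote></p>
<p>this paragraph refers to Terry Haywood, from a different file. Here is Isaac Asimov again.</p>
</section></kc>"""

root = ET.fromstring(SAMPLE)
# Assumption: these two names stand in for quotes found in the other files.
names = collect_names([root]) | {"Terry Haywood", "Tom Jones"}
pattern = build_pattern(names)
for section in root.iter("section"):
    link_element(section, pattern, set())  # a fresh "seen" set per section
out = ET.tostring(root, encoding="unicode")
print(out)
```

The point of the single compiled alternation is that each text node is scanned once, however many names there are, instead of once per name. Whether something comparable can be done cheaply inside XSLT 1.0 is exactly the open question here. One known gap in the sketch: a shorter name hidden inside an already-linked longer name is skipped.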