Re: [xsl] fixing XSL search using values from a variable against multiple XML files

Subject: Re: [xsl] fixing XSL search using values from a variable against multiple XML files
From: "Lizzi, Vincent vincent.lizzi@xxxxxxxxxxxxxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Thu, 4 Oct 2018 21:51:27 -0000

Hi Dave,

The solution from Michael Kay is definitely on the right path. Which part of
your requirements is it not doing?

Could you describe the environment where the XSLT will be used? Does it need
to run in oXygen, which you mentioned, or inside another system? Are you using
Saxon-HE, Saxon-EE, or another XSLT processor?

It would be very easy to provide a solution using BaseX<http://basex.org/>,
which is another part of the XML ecosystem. Would it be worth exploring this
option?

I have found that many of the learning resources available for XSLT are not
great for beginners, but once you get familiar with a few of the tools,
learning accelerates and XSLT can become part of your go-to toolkit. I can
recommend a current favorite book, "Beginning XML, 5th edition"
<http://www.wrox.com/WileyCDA/WroxTitle/Beginning-XML-5th-Edition.productCd-1118162137.html>
by Joe Fawcett, et al., published by Wrox. The XML
Summer School<https://xmlsummerschool.com/> offers a great hands-on course if
the time and location work for you, and there are other hands-on trainings
available from other sources.

Vincent


From: Dave Lang emaildavelang@xxxxxxxxx
<xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Sent: Thursday, October 04, 2018 12:59 PM
To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
Subject: Re: [xsl] fixing XSL search using values from a variable against
multiple XML files

Thanks for your reply, Syd.

re: doing this in another tool, I completely agree, but I have to use
xsl for this particular task at this particular time.

fwiw, I find xsl very difficult to learn in this case because this is,
at least as far as I can see, a pretty non-standard use of it.
There are very few resources available online with working examples to
learn from. The reddit xslt group has 50 subscribers. The two textbooks
I looked at didn't include very helpful working examples either.

The original solution that started me down this path was (generously)
presented to me on stackoverflow by Michael Kay. I'm sure some of you
will recognize that name. This was great, but he acknowledged afterwards
that it was incomplete.

Anyway, thanks again for your help.

dave


On 2018-10-04 9:50 AM, Syd Bauman s.bauman@xxxxxxxxxxxxxxxx wrote:
> I realize this is the XSL list, and don't get me wrong, I *love*
> XSLT. And while I'm singing XSLT's (and thus XPath's) praises, this
> particular task looks like a fun one to attack with Hans-Jürgen's
> FOXpath (which is an extension of XPath to handle the file
> system).[1]
>
> But that said, this strikes me as a task better handled by your shell
> than your XSLT engine, no? In bash, e.g.,
> $ fgrep -f filenames_from_directory_listing.txt dir1/*.xml dir2/*.xml
> gives you the answer, as it were, but not in the format you want.
>
> I think to get the results you want (the phrase "[filename] was found
> in [filepath]") you have to issue the fgrep command once for each
> search term, instead of all-at-once. E.g., I think the following will
> do the trick.
> $ for fn in `cat filenames_from_directory_listing.txt` ; do fgrep -l -e $fn dir1/*.xml dir2/*.xml | perl -pe "s,^.*\$,$fn was found in \$&,;" ; done
> These methods presume that none of the names in
> filenames_from_directory_listing contain any whitespace.
>
> And, of course, one thing that makes this nice is that, just by using
> `egrep` instead of `fgrep`, you can search for regular expressions,
> e.g., "meeting_schema\.(rn[cg]|xsd?|wxs|odd|dtd|(iso)?sch)". :-)
>
> Notes
> -----
> [1] See
> https://www.balisage.net/Proceedings/vol17/html/Rennau01/BalisageVol17-Rennau01.html
>
>> Hi this is my first post here - looking for help - apologies if
>> there's something I've overlooked!
>>
>> I have a tokenized variable that contains a list of filenames from a
>> .txt of a directory listing. I want to look for those filenames in
>> a number of xml files in a number of subdirectories. If the
>> filename is found, I want to output that "filename" was found in
>> "xmlfile".
>>
>> There are a lot of xml directories and they are not static. Same
>> with xml files. The filenames are not tagged in the xml, so I'm
>> just looking for their plain text occurrence in the file.
>>
>> Any help would be appreciated.
>>
>> to make the examples easier - I want to use
>>
>> $filenames_to_find (tokenized list of filenames from a .txt
>> directory listing)
>>
>> to search against
>>
>> dir1/*.xml
>> dir2/*.xml
>> with the output being
>>
>> filename was found in xmlfilename
>>
>> I'm using an academic version of Oxygen XML so I think I have Saxon
>> through that and I have the standalone Saxon file for running this
>> from the command line.
>>
>> I've gotten this far, but it doesn't work. I know it's broken, but
>> I don't know how to fix it!
>>
>> <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
>>     xmlns:xs="http://www.w3.org/2001/XMLSchema"
>>     xmlns:h="http://www.w3.org/1999/xhtml"
>>     exclude-result-prefixes="xs"
>>     version="3.0"
>>     expand-text="yes"
>>     >
>>
>>     <xsl:variable name="filenames_from_directory_listing"
>> as="xs:string"
>> select="unparsed-text('filenames_from_directory_listing.txt')"/>
>>     <xsl:variable name="filenames_to_find"
>> select="tokenize($filenames_from_directory_listing, '\s+')"/>
>>
>>     <xsl:template match="/">
>>         <xsl:for-each select="collection('.?select=*.xml;recurse=yes')"/>
>>             <xsl:variable name="xml_filenames" select="."/>
>>                 <xsl:for-each select="$filenames_to_find">
>>                     <xsl:if test="(contains($t, .))">
>> <xsl:message>{document-uri($xml_filenames)} contains {.}</xsl:message>
>>                     </xsl:if>
>>                 </xsl:for-each>
>>     </xsl:template>
>> </xsl:stylesheet>
>>
>> Any suggestions? Clearly I am an XSL novice. Thanks for your patience.
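
For comparison with the per-term fgrep loop quoted above, here is a minimal
shell sketch of the same idea. It reads one name per line, so names that
contain spaces are not split, and it recurses into subdirectories. The
function name find_names and its argument order are made up for illustration.
The -r and --include options are grep extensions (GNU and BSD), not POSIX, so
this assumes the local grep supports them.

```shell
# Hypothetical helper (not from the thread): for each name in LISTFILE,
# print "<name> was found in <file>" for every *.xml file under the given
# directories that contains the name as plain text.
# Usage: find_names LISTFILE DIR [DIR...]
find_names() (
    list=$1
    shift
    # Read one name per line; IFS= and -r keep spaces and backslashes intact.
    # The "|| [ -n ... ]" part handles a last line without a trailing newline.
    while IFS= read -r name || [ -n "$name" ]; do
        [ -n "$name" ] || continue    # skip blank lines
        # -F fixed string, -l list matching files only, -r recurse,
        # --include restricts the search to *.xml (assumed grep extension).
        grep -rlF --include='*.xml' -e "$name" -- "$@" |
        while IFS= read -r file; do
            printf '%s was found in %s\n' "$name" "$file"
        done
    done < "$list"
)
```

With the file names used in this thread, the call would look like
`find_names filenames_from_directory_listing.txt dir1 dir2`. The order of the
output lines follows grep's directory traversal, so pipe it through `sort` if
a stable order matters.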
