Re: [xsl] fixing XSL search using values from a variable against multiple XML files

Subject: Re: [xsl] fixing XSL search using values from a variable against multiple XML files
From: "Dave Lang emaildavelang@xxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Thu, 4 Oct 2018 16:58:15 -0000
Thanks for your reply, Syd.

Re: doing this in another tool: I completely agree, but I have to use XSL for this particular task at this particular time.

FWIW, I find XSL very difficult to learn in this case because this is, at least as far as I can see, a pretty non-standard use of it. There are very few resources available online with working examples to learn from. The reddit XSLT group has 50 subscribers. The two textbooks I looked at didn't include very helpful working examples either.

The original solution that started me down this path was (generously) presented to me on Stack Overflow by Michael Kay. I'm sure some of you will recognize that name. This was great, but he acknowledged afterwards that it was incomplete.

Anyway, thanks again for your help.

dave


On 2018-10-04 9:50 AM, Syd Bauman s.bauman@xxxxxxxxxxxxxxxx wrote:
I realize this is the XSL list, and don't get me wrong, I *love*
XSLT. And while I'm singing XSLT's (and thus XPath's) praises, this
particular task looks like a fun one to attack with Hans-Jürgen's
FOXpath (which is an extension of XPath to handle the file
system).[1]

But that said, this strikes me as a task better handled by your shell
than your XSLT engine, no? In bash, e.g.,
   $ fgrep -f filenames_from_directory_listing.txt dir1/*.xml dir2/*.xml
gives you the answer, as it were, but not in the format you want.

I think to get the results you want (the phrase "[filename] was found
in [filepath]") you have to issue the fgrep command once for each
search term, instead of all-at-once. E.g., I think the following will
do the trick.
    $ for fn in `cat filenames_from_directory_listing.txt` ; do fgrep -l -e $fn dir1/*.xml dir2/*.xml | perl -pe "s,^.*\$,$fn was found in \$&,;" ; done
These methods presume that none of the names in
filenames_from_directory_listing.txt contain any whitespace.
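
If names with spaces do turn up, a variant that reads the list line by line might look like the sketch below. It assumes one name per line in the list file, which the original post does not state, and the function name `find_names` is made up for illustration. `grep -F` is the POSIX spelling of `fgrep`.

```shell
# Sketch only: print "<name> was found in <file>" for every listed name.
# Assumption: the list file holds one name per line.
# find_names is a hypothetical helper name, not part of any tool.
find_names() (
    list_file=$1
    shift
    # Read whole lines so names containing spaces stay intact.
    while IFS= read -r name || [ -n "$name" ]; do
        # Skip blank lines; an empty pattern would match every file.
        [ -n "$name" ] || continue
        # -F: fixed string, -l: print matching file names only.
        # </dev/null keeps grep from consuming the list if no files are given.
        grep -F -l -e "$name" -- "$@" </dev/null |
        while IFS= read -r path; do
            printf '%s was found in %s\n' "$name" "$path"
        done
    done < "$list_file"
)

# Usage, with the paths from the example above:
#   find_names filenames_from_directory_listing.txt dir1/*.xml dir2/*.xml
```

Using `printf` for the report line avoids pasting the name into a Perl substitution, so names containing commas or dollar signs need no escaping.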

And, of course, one nice thing about this approach is that by just
using `egrep` instead of `fgrep`, you can search for regular
expressions, e.g., "meeting_schema\.(rn[cg]|xsd?|wxs|odd|dtd|(iso)?sch)". :-)
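
As a rough illustration of the regular-expression search, the sketch below wraps that pattern in a small function. The function name `find_schema_refs` is hypothetical; `grep -E` is the POSIX spelling of `egrep`.

```shell
# Sketch only: list the files that mention any schema variant of the name.
pattern='meeting_schema\.(rn[cg]|xsd?|wxs|odd|dtd|(iso)?sch)'

# find_schema_refs is a hypothetical helper name.
find_schema_refs() (
    # -E: extended regular expression, -l: print matching file names only.
    # </dev/null keeps grep from waiting on stdin if no files are given.
    grep -E -l -e "$pattern" -- "$@" </dev/null
)

# Usage, with the directories from the example above:
#   find_schema_refs dir1/*.xml dir2/*.xml
```

Note that the pattern is not anchored at the end, so it also matches longer extensions that begin with one of the alternatives.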

Notes
-----
[1] See https://www.balisage.net/Proceedings/vol17/html/Rennau01/BalisageVol17-Rennau01.html

Hi this is my first post here - looking for help - apologies if
there's something I've overlooked!

I have a tokenized variable that contains a list of filenames from a
.txt of a directory listing. I want to look for those filenames in
a number of xml files in a number of subdirectories. If the
filename is found, I want to output that "filename" was found in
"xmlfile".

There are a lot of xml directories and they are not static. Same
with xml files. The filenames are not tagged in the xml, so I'm
just looking for their plain text occurrence in the file.

Any help would be appreciated.

to make the examples easier - I want to use

$filenames_to_find (tokenized list of filenames from a .txt
directory listing)

to search against

dir1/*.xml
dir2/*.xml
with the output being

filename was found in xmlfilename

I'm using an academic version of Oxygen XML so I think I have Saxon
through that and I have the standalone Saxon file for running this
from the command line.

I've gotten this far, but it doesn't work. I know it's broken, but
I don't know how to fix it!

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:h="http://www.w3.org/1999/xhtml"
    exclude-result-prefixes="xs"
    version="3.0"
    expand-text="yes"
    >

    <xsl:variable name="filenames_from_directory_listing"
        as="xs:string"
        select="unparsed-text('filenames_from_directory_listing.txt')"/>
    <xsl:variable name="filenames_to_find"
        select="tokenize($filenames_from_directory_listing, '\s+')"/>

    <xsl:template match="/">
        <xsl:for-each select="collection('.?select=*.xml;recurse=yes')"/>
            <xsl:variable name="xml_filenames" select="."/>
                <xsl:for-each select="$filenames_to_find">
                    <xsl:if test="(contains($t, .))">
                        <xsl:message>{document-uri($xml_filenames)} contains {.}</xsl:message>
                    </xsl:if>
                </xsl:for-each>
    </xsl:template>
</xsl:stylesheet>

Any suggestions? Clearly I am an XSL novice. Thanks for your patience.
