[xsl] Re: Regular expression to exclude files

Subject: [xsl] Re: Regular expression to exclude files
From: "Chris Papademetrious christopher.papademetrious@xxxxxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Thu, 16 Feb 2023 20:43:36 -0000

Hi Eliot,

Normally I would use a negative lookahead for this, which requires Saxon's
";j" flag for matches():

matches(., '^(?!foo|bar).*\.dita', ';j')

The documentation at

https://www.saxonica.com/documentation12/#!sourcedocs/collections/collection-directories

suggests that collection() uses the Java regex engine, so maybe it will work
there too.
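
To check the lookahead against the file names from your test, here is a
minimal sketch in plain Java, on the assumption that the pattern ends up in
java.util.regex. The class name ExcludeByPrefix and the method keep() are made
up for illustration, and I have anchored the pattern with ^ and $ so that the
lookahead applies at the start of the name and ".ditamap" must be the end of
it. It uses find() because matches() in XPath is also unanchored. The
equivalent XPath call would presumably be
matches(., '^(?!bundle-|publication_).+\.ditamap$', ';j').

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class ExcludeByPrefix {
    // Negative lookahead at the start of the name: reject anything that
    // begins with "bundle-" or "publication_", then require ".ditamap" at the end.
    static final Pattern NOT_BUNDLE_OR_PUB =
        Pattern.compile("^(?!bundle-|publication_).+\\.ditamap$");

    // Hypothetical helper: returns only the names the pattern accepts.
    static List<String> keep(List<String> names) {
        return names.stream()
            .filter(n -> NOT_BUNDLE_OR_PUB.matcher(n).find())
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> names = List.of(
            "bundle-aaaa.ditamap",
            "publication_pub-one.ditamap",
            "not-pub-or-bundle.ditamap",
            "atopic.dita");
        System.out.println(keep(names)); // prints [not-pub-or-bundle.ditamap]
    }
}
```

Whether collection() accepts the same syntax in its regex parameter is
something I have not verified, so treat this only as a check of the pattern
itself.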


- Chris


From: Eliot Kimber eliot.kimber@xxxxxxxxxxxxxx
<xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Sent: Thursday, February 16, 2023 3:31 PM
To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
Subject: [xsl] Regular expression to exclude files

I'm using Saxon's collection() extension that lets you specify a regular
expression to select files within a directory. These are XPath regular
expressions, so I think this is a general XPath question.

I want to match all files with a given extension except those that start with
"foo" or "bar".

I think the Perl expression would be something like:

'.*?!(foo|bar).+.ditamap'


Using this little XQuery:

let $strings as xs:string* := ('bundle-aaaa.ditamap',
'publication_pub-one.ditamap', 'not-pub-or-bundle.ditamap', 'atopic.dita')
return
count($strings[matches(., '.?!(bundle-|publication_).+\.ditamap')])


I get zero results, while this:

let $strings as xs:string* := ('bundle-aaaa.ditamap',
'publication_pub-one.ditamap', 'not-pub-or-bundle.ditamap', 'atopic.dita')
return
count($strings[matches(., '.+\.ditamap')])

returns the expected 3.

Reading the XSD regular expression spec, I did not see an obvious way to
specify this kind of negative match, but I also find the XSD specification
almost impenetrably difficult to decode.

Is there a way to do this with regular expressions alone?

I want a pure regex solution because I'm using it in the context of an Oxygen
xpath_eval() call so it's not easy (but not impossible) to filter the files
returned by the collection() call (I'm using the metadata=yes form since I
want the file names, not the parsed docs in this context).

Thanks,

E.
_____________________________________________
Eliot Kimber
Sr Staff Content Engineer
O: 512 554 9368
M: 512 554 9368
