Subject: Re: [xsl] Pattern-Matching / Regular Expression Types
From: Michael Kay <mike@xxxxxxxxxxxx>
Date: Thu, 26 Apr 2012 23:28:24 +0100
Michael Kay Saxonica
I need to match patterns on a set of XML documents (all with the same schema), and when a pattern matches, I need to retrieve the content and do some specific transformations on that content (no xml output needed).
Specifically, they are natural language syntactic trees (and dependencies).
I will have a list of those "patterns", that are similar to regular expressions, but with elements and attributes.
pseudo-pattern example:
(//ELEMENTx) (node())* (//ELEMENTy[@ATTRIBUTEz]) (node())* (//@ATTRIBUTEw)
I used XPath syntax inside the parentheses only. Other quantifiers could be used, and dependencies between nodes/attributes could also be specified, but that is another problem.
This example would match when the XML has ELEMENTx as the first element, ends with an element that has ATTRIBUTEw, and has an ELEMENTy with ATTRIBUTEz somewhere in between.
Note that I need to match the whole document for each pattern, not just part of it.
The nesting of elements does not matter in this case (ELEMENTy could be a child of ELEMENTx, or not), but they need to have that specific order (in document order).
Example of a tree that can appear:

      TOP
     /   \
    X     Y
    | \   | \
    1  2  3  4
Matching patterns could be (node names, assuming no attributes): X Y 1 * Y X 3 4 1 * 4
I could use XPath to get each individual node in the pattern, but then I lose the order: if I do two XPath queries, I don't know the positions of the results relative to each other.
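One possible way to keep the relative order, offered only as a rough sketch and not as anything proposed in this thread: flatten the document into a sequence of tokens in document order, then run an ordinary regular expression over that sequence. The Python below uses the standard xml.etree.ElementTree and re modules. The function names, the "@name" convention for attribute tokens, and the use of "*" to mean "any run of nodes" are all assumptions made for illustration. Element names such as n1 replace the 1..4 of the example tree, since XML names cannot start with a digit.

```python
import re
import xml.etree.ElementTree as ET


def document_order_tokens(xml_text):
    """Flatten a document into tokens in document order.

    Each element contributes its name, followed by one "@attr" token per
    attribute (hypothetical convention, chosen for this sketch).
    """
    root = ET.fromstring(xml_text)
    tokens = []
    for elem in root.iter():  # iter() walks the tree in document order
        tokens.append(elem.tag)
        tokens.extend("@" + name for name in elem.attrib)
    return tokens


def compile_pattern(pattern_tokens):
    """Turn a list such as ["TOP", "*", "Y"] into a compiled regex.

    "*" is assumed to mean "zero or more nodes of any kind".
    """
    parts = []
    for tok in pattern_tokens:
        if tok == "*":
            parts.append(r"(?:\S+ )*")          # any run of whole tokens
        else:
            parts.append(re.escape(tok) + " ")  # exactly this token
    return re.compile("".join(parts))


def matches(pattern_tokens, xml_text):
    """True only if the pattern covers the whole document, in order."""
    flat = "".join(tok + " " for tok in document_order_tokens(xml_text))
    return compile_pattern(pattern_tokens).fullmatch(flat) is not None


if __name__ == "__main__":
    doc = '<TOP><X><n1/><n2/></X><Y z="1"><n3/><n4 w="2"/></Y></TOP>'
    print(document_order_tokens(doc))
    print(matches(["TOP", "*", "Y", "@z", "*", "@w"], doc))
```

Because the regex is applied with fullmatch, a pattern has to account for every node from the first to the last, which corresponds to the requirement of matching the whole document rather than a fragment. Nesting is deliberately ignored; only document order is kept.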
After matching, I will have rules for each pattern that specify some transformations on the content (changing the order, etc.).
Is there any way to do something like this using XSL, XQuery, or other language? (preferably available in a Java implementation)
Thanks for any pointers. (Is it ok to cross-post this to an XQuery list? Recommend any?)