Subject: Re: [xsl] Pattern-Matching / Regular Expression Types
From: Dimitre Novatchev <dnovatchev@xxxxxxxxx>
Date: Thu, 26 Apr 2012 15:39:01 -0700

There is an LR-1 generic, table-driven parser in FXSL -- anyone is
welcome to use it.

Dimitre.

On Thu, Apr 26, 2012 at 3:28 PM, Michael Kay <mike@xxxxxxxxxxxx> wrote:
>
> You could try taking a look at Gunther Rademacher's REX parser generator.
> I've found it hard to find information about it, other than mentions by
> people who have used it for some rather interesting projects. Basically, if
> I understand it correctly, given an EBNF grammar, it generates a parser for
> that grammar written in XQuery. Most of the examples seem to be parsers for
> textual languages (i.e. where the tokens of the language being parsed are
> made up of characters) but I don't see any reason in principle why it
> shouldn't also parse a language where the tokens of the language are element
> nodes.
>
> Michael Kay
> Saxonica
>
>
> On 26/04/2012 22:45, Tiago Freitas wrote:
>>
>> I need to match patterns on a set of XML documents (all with the same
>> schema), and when a pattern matches, I need to retrieve the content
>> and do some specific transformations on that content (no XML output
>> needed).
>>
>> Specifically, they are natural language syntactic trees (and
>> dependencies).
>>
>> I will have a list of those "patterns", which are similar to regular
>> expressions, but with elements and attributes.
>>
>> Pseudo-pattern example:
>>
>> (//ELEMENTx) (node())* (//ELEMENTy[@ATTRIBUTEz]) (node())*
>> (//@ATTRIBUTEw)
>>
>> I used XPath syntax inside the parentheses only. Other quantifiers
>> could be used, and dependencies between nodes/attributes could also
>> be specified, but that is another problem.
>>
>> This example would match when the XML has ELEMENTx as the first
>> element, ends with one element that has ATTRIBUTEw, and in between
>> has an ELEMENTy with ATTRIBUTEz.
>>
>> Note that I need to match the whole document for each pattern, not
>> just part of it.
>>
>> The nesting of elements does not matter in this case (ELEMENTy could
>> be a child of ELEMENTx, or not), but they need to have that specific
>> order (in document order).
>>
>> Example of a tree that can appear:
>>
>>        TOP
>>       /   \
>>      X     Y
>>      | \   | \
>>      1  2  3  4
>>
>> Matching patterns could be (node names, assuming no attributes):
>> X Y
>> 1 * Y
>> X 3 4
>> 1 * 4
>>
>> I could use XPath to get each individual node in the pattern, but then
>> I lose the order: if I do two XPath queries, I don't know the
>> positions of the results relative to each other.
>>
>> After matching, I will have rules for each pattern that specify some
>> transformations on the content (change order, etc.).
>>
>> Is there any way to do something like this using XSL, XQuery, or another
>> language? (preferably available in a Java implementation)
>>
>> Thanks for any pointers.
>> (Is it OK to cross-post this to an XQuery list? Recommend any?)
>

--
Cheers,
Dimitre Novatchev
---------------------------------------
Truly great madness cannot be achieved without significant intelligence.
---------------------------------------
To invent, you need a good imagination and a pile of junk
-------------------------------------
Never fight an inanimate object
-------------------------------------
To avoid situations in which you might make mistakes may be the
biggest mistake of all
------------------------------------
Quality means doing it right when no one is looking.
-------------------------------------
You've achieved success in your field when you don't know whether what
you're doing is work or play
-------------------------------------
Facts do not cease to exist because they are ignored.
-------------------------------------
I finally figured out the only reason to be alive is to enjoy it.
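
For readers who want something concrete to experiment with, here is a minimal sketch of one possible reading of the four example patterns in the quoted question (X Y, 1 * Y, X 3 4, 1 * 4). It is not the FXSL parser or a REX-generated parser, and it is written in Python rather than XSLT or XQuery purely so that it is easy to run. The assumptions are mine, not the original poster's: each node is taken to "cover" the leaves beneath it, a pattern must cover all leaves left to right with no gaps or overlaps, and `*` stands for zero or more arbitrary nodes. Because XML names cannot be bare digits, the leaves 1 to 4 are renamed `a` to `d`. The function names `leaf_spans` and `matches` are hypothetical. Note that under this reading a node and its own descendant cannot both appear in one match, which is stricter than the "nesting does not matter" remark in the question.

```python
import xml.etree.ElementTree as ET
from functools import lru_cache


def leaf_spans(root):
    """Return ([(name, start, end), ...], leaf_count) for all elements below root.

    [start, end) is the range of leaf positions an element covers. The root
    itself is skipped, since it would trivially cover everything.
    """
    spans = []
    counter = 0

    def visit(elem, is_root):
        nonlocal counter
        start = counter
        children = list(elem)
        if children:
            for child in children:
                visit(child, False)
        else:
            counter += 1  # a childless element counts as one leaf
        if not is_root:
            spans.append((elem.tag, start, counter))

    visit(root, True)
    return spans, counter


def matches(pattern, root):
    """True if the whitespace-separated pattern covers the whole leaf sequence."""
    spans, n_leaves = leaf_spans(root)
    tokens = pattern.split()

    # Index nodes by the leaf position at which their span starts.
    by_start = {}
    for name, start, end in spans:
        by_start.setdefault(start, []).append((name, end))

    @lru_cache(maxsize=None)
    def go(tok_idx, pos):
        if tok_idx == len(tokens):
            return pos == n_leaves  # must consume the whole document
        tok = tokens[tok_idx]
        candidates = by_start.get(pos, [])
        if tok == "*":
            # Either match zero nodes, or skip one node and stay on '*'.
            if go(tok_idx + 1, pos):
                return True
            return any(go(tok_idx, end) for _, end in candidates)
        return any(go(tok_idx + 1, end) for name, end in candidates if name == tok)

    return go(0, 0)


# The example tree from the question, with leaves 1..4 renamed a..d (assumption).
root = ET.fromstring("<TOP><X><a/><b/></X><Y><c/><d/></Y></TOP>")
for pattern in ["X Y", "a * Y", "X c d", "a * d", "X c"]:
    print(pattern, "->", matches(pattern, root))
```

With this tree the first four patterns are reported as matches and `X c` is not, because it leaves the last leaf uncovered. Attribute tests such as `ELEMENTy[@ATTRIBUTEz]` are not handled; they could be added by comparing more than the tag name in the last line of `go`.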