Subject: [xsl] XSLT match with regex what's the best current solution?
From: Gunther Schadow <gunther@xxxxxxxxxxxxxxxxxxxxxx>
Date: Mon, 14 Jan 2002 17:45:35 -0500
I am working on a suite of scripts that induce structure in free text and eventually capture fine-grained medical information. I have been using AWK so far, but I am thinking about turning this into a process that consists largely of XML transformations. However, since I must induce XML structure from semi-structured free text, I need more parsing support. First, regular expressions: I know there is EXSLT, but are regex matches and replacements supported in SAXON? (I love SAXON, so I would prefer to use it.)
Also, any ideas about additional parsing tools and their integration into XSLT would be appreciated. Is there a way of running XSLT in a line mode, so that every line is matched against regular expressions? I suppose so: with a simple sed script I could first wrap each line in a <line>...</line> element and then use a regex match on the text node of each <line> element.
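To make the line-wrapping idea concrete, here is a minimal sketch in Java (instead of sed, since my platform is Java anyway). The class name LineWrapper, the <lines> root element, and the method names are made up for illustration; nothing here comes from SAXON or any other XSLT processor. It only escapes the three characters that are special in XML text content and does not deal with control characters that XML 1.0 forbids.

```java
import java.util.regex.Pattern;

// Hypothetical helper (not part of SAXON or any XSLT processor):
// turns plain text into <lines><line>...</line></lines> so that a
// stylesheet can match each input line as an element.
final class LineWrapper {

    // Accept both Unix and Windows line endings.
    private static final Pattern LINE_BREAK = Pattern.compile("\r?\n");

    // Escape the characters that are special in XML text content.
    // The ampersand must be handled first so entities are not double-escaped.
    static String escape(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;");
    }

    // Wrap every input line in a <line> element under a single <lines> root.
    // A trailing line break does not produce an extra empty <line>.
    static String wrap(String text) {
        StringBuilder out = new StringBuilder("<lines>\n");
        for (String line : LINE_BREAK.split(text)) {
            out.append("  <line>").append(escape(line)).append("</line>\n");
        }
        return out.append("</lines>").toString();
    }
}
```

The result of LineWrapper.wrap(text) could be written to a file and fed to the stylesheet, where a template matching line would see one text node per original line.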
Is SAXON easy to extend? I suppose the SAXON documentation explains how to write extensions in Java, right? Is there any reason to use something other than SAXON if my platform is Java and I'm not interested in Web stuff (in which case I would look into the Apache work)?
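For what it's worth, the kind of Java extension I have in mind would look roughly like the sketch below. It assumes that the processor can call public static Java methods as extension functions, which Java XSLT processors generally allow; the namespace declaration that binds a prefix to the class differs between processors and versions, so I leave it out and it would have to be taken from the processor's documentation. The class name RegexExt and its three methods are hypothetical, loosely modelled on the test/match/replace idea in EXSLT, and java.util.regex requires JDK 1.4 or later.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical extension class (the name and methods are assumptions,
// not an existing SAXON or EXSLT API). To expose it to a processor it
// would normally be declared public in its own file, RegexExt.java.
final class RegexExt {

    // True if the regex matches anywhere in the input.
    public static boolean test(String input, String regex) {
        return Pattern.compile(regex).matcher(input).find();
    }

    // Replace every match of the regex; the replacement may use $1, $2, ...
    public static String replace(String input, String regex, String replacement) {
        return Pattern.compile(regex).matcher(input).replaceAll(replacement);
    }

    // Return capture group n of the first match, or "" if there is no
    // match, no such group, or the group did not participate in the match.
    public static String group(String input, String regex, int n) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (m.find() && n >= 0 && n <= m.groupCount() && m.group(n) != null) {
            return m.group(n);
        }
        return "";
    }
}
```

Compiling the pattern on every call keeps the sketch short; a real extension would probably cache compiled patterns, since a stylesheet would call these functions once per <line> element.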
thanks for your ideas, -Gunther
-- 
Gunther Schadow, M.D., Ph.D.
gschadow@xxxxxxxxxxxxxxx
Medical Information Scientist
Regenstrief Institute for Health Care
Adjunct Assistant Professor
Indiana University School of Medicine
tel:1(317)630-7960
http://aurora.regenstrief.org