Subject: Re: [xsl] Identifying patterns within texts
From: Wendell Piez <wapiez@xxxxxxxxxxxxxxxx>
Date: Thu, 29 Nov 2007 17:41:05 -0500
The two responses I got (thank you) reiterated the problem that I identified at the beginning of this project: how do you identify "math"? I am working on an educational tool, taking an old format (strings within XML tags) and converting it to a new format (strings within new XML tags), and it is tough to decide what is considered math. Is 1/2/99 math or a date? Basically, I have to decide what represents a mathematical expression and place it within its own element/tag, so that the software that processes it will be able to display it in a 2-dimensional format. So the algorithm I come up with must be flexible and expandable, and it may not be perfect. Spaces between text will be a killer. I guess what is acceptable will be up to the Systems guys.
However, I have gone down the path of choosing the XSLT implementation that was used on the GNOME project (xmlsoft.org), which implements XSLT 1.0. The engine I chose must be easy to add to an existing DLL and later port to a Mac library (now that Mac is very much Unix :) ). I needed something that was free, something whose licensing the lawyers would approve, and something that would be portable between those two platforms. I have seen some Java and C++ (Xalan with Xerces) implementations, but I did not want the added tasks of integration (JNI and C++ bindings). Please comment on my logic if you see flaws.
Therefore, using the xsl:analyze-string element or the other regular-expression support in XSLT 2.0 is not an option right now.
I guess I could use a package like Boost/regex to post-process my converted XML. I assume I can generate the XML from the result tree in memory and then parse that, looking for math, using C.
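For what it is worth, here is a minimal sketch of that post-processing idea. It uses std::regex from the C++ standard library as a stand-in for Boost/regex (the interfaces are similar), and it assumes the input is the text content of a single element, not a whole document with tags in it. The function name tag_math, the <math> element name, and both patterns are hypothetical, made up for illustration. The date test is only a guess at one acceptable rule: anything shaped like d/d/dd or d/d/dddd is left alone.

```cpp
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical helper: wraps simple arithmetic runs found in `text` in a
// <math> element. The element name and the patterns are assumptions, not
// anything taken from the real data or from the target format.
std::string tag_math(const std::string& text) {
    // Two or more integers joined by + - * /, with optional spaces
    // around each operator.
    static const std::regex expr(R"(\d+(?:\s*[-+*/]\s*\d+)+)");
    // Date-like shape: 1-2 digits / 1-2 digits / 2 or 4 digits.
    static const std::regex date(R"(\d{1,2}/\d{1,2}/(?:\d{2}|\d{4}))");

    std::string out;
    std::size_t last = 0;  // end of the previous match in `text`
    const std::sregex_iterator end;
    for (std::sregex_iterator it(text.begin(), text.end(), expr); it != end; ++it) {
        const std::smatch& m = *it;
        const std::size_t pos = static_cast<std::size_t>(m.position());
        out.append(text, last, pos - last);  // copy the text between matches
        if (std::regex_match(m.str(), date)) {
            out += m.str();  // looks like a date: leave it untouched
        } else {
            out += "<math>" + m.str() + "</math>";
        }
        last = pos + static_cast<std::size_t>(m.length());
    }
    out.append(text, last, std::string::npos);  // copy the tail
    return out;
}
```

Calling tag_math("Add 1/2 + 3/4 on 1/2/99.") should leave the date alone and wrap only the sum. A range such as 10-20 would still be tagged as math by this sketch, which is exactly the kind of case where the rules would have to be extended.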
======================================================================
Wendell Piez                            mailto:wapiez@xxxxxxxxxxxxxxxx
Mulberry Technologies, Inc.                http://www.mulberrytech.com
17 West Jefferson Street                    Direct Phone: 301/315-9635
Suite 207                                          Phone: 301/315-9631
Rockville, MD 20850                                  Fax: 301/315-8285
----------------------------------------------------------------------
  Mulberry Technologies: A Consultancy Specializing in SGML and XML
======================================================================