[xsl] regex and tab-delimited text

Subject: [xsl] regex and tab-delimited text
From: Terry Ofner <tofner@xxxxxxxxxxx>
Date: Thu, 17 Jan 2008 15:26:42 -0500
Forgive me if this topic has already been discussed. My query has to do with the lack of the \t construct in regular expressions.

I will preface my remarks by saying that I am used to the grep patterns in BBEdit on a Mac, where I routinely use \t to select or insert a tab character. In the places I looked (XSLT 2.0: Programmer's Reference and Google searches), I could find no examples of parsing tab-delimited text files. I finally decided to replace \s with the character reference for the tab (&#09;). When I edited my stylesheet I made an error: I left the <<\>> in, creating this sequence: \&#09;

<xsl:analyze-string select="$in" regex="\&#09;A&#09;(.*)&#10;">

Saxon 8 produced this error message:

XTDE1140: Error in regular expression: net.sf.saxon.trans.DynamicError: Error at character 1 in regular expression "\\tA\t(.*)\n": invalid escape sequence


I removed the offending <<\>> and the transformation worked as advertised.
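
For reference, here is a minimal sketch of the working form as a complete stylesheet. The sample value of $in, the template name "main", and the <rows>/<row> result elements are assumptions made up for illustration; only the regex attribute comes from my actual stylesheet.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes"/>

  <!-- Hypothetical sample input: two lines of the form
       TAB A TAB value NEWLINE. The XML parser turns &#09; and &#10;
       into real tab and newline characters before XPath sees them. -->
  <xsl:param name="in"
      select="'&#09;A&#09;first value&#10;&#09;A&#09;second value&#10;'"/>

  <!-- Hypothetical entry point, so no source document is needed -->
  <xsl:template name="main">
    <rows>
      <!-- No backslash before &#09;: the regex engine receives literal
           tab and newline characters and matches them as themselves.
           Without the "s" flag, "." does not match a newline, so (.*)
           stops at the end of each line. -->
      <xsl:analyze-string select="$in" regex="&#09;A&#09;(.*)&#10;">
        <xsl:matching-substring>
          <row><xsl:value-of select="regex-group(1)"/></row>
        </xsl:matching-substring>
      </xsl:analyze-string>
    </rows>
  </xsl:template>

</xsl:stylesheet>
```

With Saxon this can be run by naming the initial template on the command line (the -it option), assuming the template is called "main" as above. One note on the design choice: the XML Schema regular expression grammar that XPath 2.0 builds on does list \t, \n and \r among its single-character escapes, so writing \t literally in the regex attribute may be an alternative to the character reference; I have not verified that against this stylesheet.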

My question has to do with the fact that \t does not seem to work in the stylesheet, but Saxon uses \t to report the error. I understand that the regular expressions used by XSLT 2.0 are dictated by XPath 2.0, so perhaps this is out of Saxon's hands. I was just wondering why the construct is not available.

Terry
