Subject: Re: [xsl] Converting embedded URLs into hot links via XSL
From: David Carlisle <davidc@xxxxxxxxx>
Date: Mon, 19 Apr 2004 10:28:17 +0100

  <xsl:template match="/vs/url">
    <xsl:message>
      <xsl:value-of select="matches (., '(http|https|ftp)://((([a-z_0-9\-]+)+(([:]?)+([a-z_0-9\-]+))?)(@+)?)?(((((([0-1])?([0-9])?[0-9])|(2[0-4][0-9])|(2[0-5][0-5]))).(((([0-1])?([0-9])?[0-9])|(2[0-4][0-9])|(2[0-5][0-5]))).(((([0-1])?([0-9])?[0-9])|(2[0-4][0-9])|(2[0-5][0-5])))\.(((([0-1])?([0-9])?[0-9])|(2[0-4][0-9])|(2[0-5][0-5]))))|((([a-z0-9\-])+.)+([a-z]{2}.[a-z]{2}|[a-z]{2,4})))(([:])(([1-9]{1}[0-9]{1,3})|([1-5]{1}[0-9]{2,4})|(6[0-5]{2}[0-3][0-6])))?(/)?$')"/>
    </xsl:message>
  </xsl:template>

I don't think that's really the regexp you want to use. Apart from the fact that it only allows 0-9 and a-z (not even A-Z), it only matches up to the first optional /, so basically just the host name part of the URL plus an optional port specifier.

The usual convention in XML files (as used, for example, in SYSTEM identifiers specified in the XML REC) is to allow arbitrary Unicode characters in the document (so-called IRIs) but to assume (hope) that the system %-encodes the UTF-8 representation of those characters before passing the URI to a URI handler.

That said, you don't want to use tokenize() for this; you want to use xsl:analyze-string, which should give you a handle on the bits of the data matching and not matching your regexp.

I'd use a fairly permissive regexp, something like

  [a-z]+://[^ ()"']+

i.e. everything from foo:// to the next space or bracket or quote character.

David
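
For illustration, the xsl:analyze-string approach with the permissive regexp might look something like the sketch below. It assumes an XSLT 2.0 processor, reuses the /vs/url match pattern from the quoted template, and assumes the output is HTML, so each match is wrapped in an a element; none of that output markup comes from the original post.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Match pattern taken from the quoted template; adjust to your input. -->
  <xsl:template match="/vs/url">
    <!-- The double quote inside the character class is written as &quot;
         because the regex attribute itself is delimited by double quotes. -->
    <xsl:analyze-string select="." regex="[a-z]+://[^ ()&quot;']+">
      <xsl:matching-substring>
        <!-- Assumed output: an HTML link whose text is the URL itself. -->
        <a href="{.}"><xsl:value-of select="."/></a>
      </xsl:matching-substring>
      <xsl:non-matching-substring>
        <!-- Text between URLs is copied through unchanged. -->
        <xsl:value-of select="."/>
      </xsl:non-matching-substring>
    </xsl:analyze-string>
  </xsl:template>

</xsl:stylesheet>
```

The regex attribute is an attribute value template, so any literal curly braces in a regexp would need doubling; this one has none. If upper-case scheme names should match as well, adding flags="i" to xsl:analyze-string is one option.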