RE: [xsl] first question to the list: contains

Subject: RE: [xsl] first question to the list: contains
From: "Scott Trenda" <Scott.Trenda@xxxxxxxx>
Date: Fri, 2 Nov 2007 17:39:29 -0500
The problem here is that XSLT operates on the expanded infoset of the
parsed XML file, and is blind to the literal text that created that
infoset. That is, if your XML file includes those characters as numeric
character references (e.g. <wording>123&#12360;456</wording>), then your
XSLT won't see the reference's literal text (&amp;#12360;); it'll see the
character it stands for (the Japanese character). Likewise, if you use a
character reference in your XSLT stylesheet, the processor will use the
character it resolves to, rather than the text you used to represent it.
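
To make that concrete, here is a minimal sketch of the test from the
original question, rewritten to look for the character itself rather
than for the text "&#". It assumes the element is named wording, as in
the question, and that &#12360; is one of the characters of interest.
It is a fragment meant to sit inside an existing template, not a
complete stylesheet.

```xml
<!-- Sketch only: the XML parser resolves '&#12360;' to a single
     character before the XPath expression is evaluated, so contains()
     compares character against character. -->
<div>
	<xsl:if test="contains(wording, '&#12360;')">
		<xsl:attribute name="class">
			<xsl:text>fs front jpns</xsl:text>
		</xsl:attribute>
	</xsl:if>
	<!-- the rest of the div's content goes here -->
</div>
```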

So, if you know the characters that you need to check for, you can do
something like this:
<xsl:if test="translate(wording,
	'&#12360;&#12361;&#12362;&#12363;&#12364;&#12365;&#12366;&#12367;&#12368;&#12369;&#12370;&#12371;&#12372;&#12373;',
	'') != string(wording)">
	<xsl:attribute name="class">
		<xsl:text>fs front jpns</xsl:text>
	</xsl:attribute>
</xsl:if>

This test strips the characters in question out of wording and checks
whether the result differs from the original string-value. If it does,
at least one of those characters was present.
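
For anyone who wants to try this outside a larger stylesheet, the
following is a minimal, self-contained XSLT 1.0 sketch of the same
technique. The item element, the match pattern, and the variable name
jp-chars are assumptions made for illustration and are not from the
original question. The character list reuses the placeholder code
numbers from the test above.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
	xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

	<xsl:output method="html"/>

	<!-- Placeholder list of characters to detect; substitute the real ones.
	     The string must stay on one line so no whitespace ends up in it. -->
	<xsl:variable name="jp-chars"
		select="'&#12360;&#12361;&#12362;&#12363;&#12364;&#12365;&#12366;&#12367;&#12368;&#12369;&#12370;&#12371;&#12372;&#12373;'"/>

	<!-- Hypothetical parent element "item" that contains <wording>. -->
	<xsl:template match="item">
		<div>
			<!-- translate() with an empty third argument deletes every
			     character listed in $jp-chars; if the result differs from
			     the original string, at least one of them was present. -->
			<xsl:if test="translate(wording, $jp-chars, '') != string(wording)">
				<xsl:attribute name="class">fs front jpns</xsl:attribute>
			</xsl:if>
			<xsl:value-of select="wording"/>
		</div>
	</xsl:template>

</xsl:stylesheet>
```

Run against <item><wording>123&#12360;456</wording></item>, the div
should come out with the class attribute; run against
<item><wording>123456</wording></item>, it should come out without one.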

I guessed at the character code numbers here - you'll have to look up the
actual ones and substitute them in. If you're checking for Japanese
numerals, I'm guessing you can narrow the check down to ichi (&#12360;),
ni (&#12361;), san (&#12362;), yon (&#12363;), go (&#12364;), roku
(&#12365;), shichi (&#12366;), hachi (&#12367;), kyuu (&#12368;), juu
(&#12369;), hyaku (&#12370;), sen (&#12371;), man (&#12372;), and oku
(&#12373;).
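
If enumerating individual characters turns out to be impractical, and
an XSLT 2.0 processor is available (an assumption; the question does not
say which version is in use), a regular-expression test over whole
Unicode blocks is one alternative. The sketch below checks for anything
in the Hiragana, Katakana, or CJK Unified Ideographs blocks. The choice
of blocks is illustrative and is not taken from the original question.

```xml
<!-- XSLT 2.0 only: matches() is not available in XSLT 1.0.
     U+3040 to U+309F Hiragana, U+30A0 to U+30FF Katakana,
     U+4E00 to U+9FFF CJK Unified Ideographs. -->
<xsl:if test="matches(wording, '[&#x3040;-&#x30FF;&#x4E00;-&#x9FFF;]')">
	<xsl:attribute name="class">
		<xsl:text>fs front jpns</xsl:text>
	</xsl:attribute>
</xsl:if>
```

As with the translate() version, the character references in the
pattern are resolved by the parser before the regular expression is
compiled, so the ranges are ranges of actual characters.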

~ Scott


-----Original Message-----
From: Jared Stein [mailto:STEINJA@xxxxxxxx]
Sent: Friday, November 02, 2007 5:23 PM
To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
Subject: [xsl] first question to the list: contains

Hi folks, I'm pretty new to XSL, at least to using it seriously for
projects, and I've hit a snag that I can't seem to figure out. It's
probably pretty simple, but I'm not 100% clear on how escaping works in
XSL.

Basically we've got some Japanese characters written as decimal Unicode
character references in our XML file, e.g. &#12360;. I need to test for
these inside an element so I can add a CLASS attribute to the DIV that
will contain them. I'm confident there are no other character references
in the document, so I tried something as simple as this:

<xsl:variable name="jpstr">
	<xsl:text>&amp;&#35;</xsl:text>
</xsl:variable>

<div>
<xsl:if test="contains(wording,$jpstr)">
	<xsl:attribute name="class">
		<xsl:text>fs front jpns</xsl:text>
	</xsl:attribute>
</xsl:if>
...etc

If I change $jpstr's value to normal letters it works just fine, but
for some reason I can't get it to detect either &# or &amp;&#35;. Any
help is greatly appreciated.

Jared Stein

   Director of Instructional Design Services
   Utah Valley State College, MS 149

Teaching w/ Technology Idea Exchange 2008: The Open Conference
   Submit your presentation proposal now! http://www.ttix.org
