Subject: RE: [xsl] Yet Another Entity Ref question!
From: Wendell Piez <wapiez@xxxxxxxxxxxxxxxx>
Date: Fri, 20 Dec 2002 13:13:40 -0500
(1) How to have your users represent special characters:
    a. <entity>ent</entity>, or
    b. &ent;
(2) How to get these entities represented as "true" entities, rather than numeric character references, in the output of your transformation
Cheers, Wendell
Hi Michael,
maybe my brain has been drinking!!! :-P
...or maybe I didn't explain very well what I would like to produce! So, let me try again:

A user may write an XML document, possibly with references to entities. There are two solutions:

A) Include all the possible symbolic entities in the DTD:
     <!ENTITY ent "&#xNN;">
   and then produce the output with these entities either encoded or not (using "us-ascii" encoding).
B) The user uses a special XML element, say "entity", to refer to the entity:
     <entity>ent</entity>
   XSLT then handles this element so that the resulting output contains the entity. There are two ways to do this:
   B.1) the solution previously proposed by David: insert an entity dictionary in the XML document and refer to it with an XPath query.
   B.2) the solution I originally proposed: construct the entity by outputting "&", "ent", ";".
Here are my considerations; (+) means good and (-) means bad.

Solution A:
(+) it is the most elegant solution
(+) in general it is faster than B (at least faster than B.1), even if ... [see the (-)s below]
(-) it consumes more memory to store the entities
(-) I have to take care of writing down all possible symbolic entities to build the DTD
(-) even if the user does not insert any entities, the document will contain the DTD, costing time and space (the DTD will in fact be included in the document automatically by the content management engine).
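
Solution A, combined with the "us-ascii" output encoding mentioned above, could be sketched roughly as follows. This is only an illustration: the input document in the comment, the <p> wrapper, and the choice of the html output method are assumptions, not something taken from the thread.

```xml
<?xml version="1.0"?>
<!--
  Hypothetical input for this sketch (the entity is declared in the
  internal DTD subset, as in Solution A):

    <!DOCTYPE doc [ <!ENTITY copy "&#xA9;"> ]>
    <doc><label>Foobar&copy;</label></doc>
-->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- With an ASCII output encoding the serializer cannot write U+00A9
       directly, so it has to emit some kind of character reference. -->
  <xsl:output method="html" encoding="us-ascii"/>

  <xsl:template match="/doc">
    <p><xsl:apply-templates select="label"/></p>
  </xsl:template>

  <!-- By this point the parser has already expanded &copy; to the
       character itself; the stylesheet never sees the entity name. -->
  <xsl:template match="label">
    <xsl:value-of select="."/>
  </xsl:template>

</xsl:stylesheet>
```

XSLT 1.0 allows the html output method to write a character as a character entity reference when HTML defines one, so a processor may emit &copy; here; whether it does so, or falls back to a numeric reference such as &#169;, depends on the processor.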
Solution B.1:
(-) conceptually it is equivalent to "A"; however, instead of keeping the entity representations in the DTD, it stores them in XML elements. So, in addition to the disadvantages inherited from "A", there is the cost of template processing in XSLT; furthermore, compared to "B.2", it needs one more XPath query.
Solution B.2:
(+) I don't have to take care of writing down all possible symbolic entities
(+) it doesn't consume additional memory (except for storing the template)
(-) it produces a document that is not well-formed, since it has to output the "&" symbol.
(-) in general it is slower than "A" since the template has to be applied; however, when the user does not insert an entity, the XML parser doesn't have to parse the DTD for the entities (as it must in "A").
(-) it seems that when the output of the entity template is stored in a variable, I have to use xsl:value-of with d-o-e (see my original email).
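
For reference, a minimal sketch of what Solution B.2 might look like in XSLT 1.0. It assumes the <entity> element described above and a processor that honours disable-output-escaping, which is an optional feature; the output settings are illustrative only.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="html" encoding="us-ascii"/>

  <!-- <entity>copy</entity> becomes the literal text &copy; in the output. -->
  <xsl:template match="entity">
    <!-- d-o-e asks the serializer to write "&" instead of "&amp;". -->
    <xsl:text disable-output-escaping="yes">&amp;</xsl:text>
    <xsl:value-of select="normalize-space(.)"/>
    <xsl:text>;</xsl:text>
  </xsl:template>

</xsl:stylesheet>
```

Nothing here checks that the name the user typed is a real entity, so the result is only as well-formed as the input allows. Note also that XSLT 1.0 treats converting a result tree fragment to a string as an error when it contains text with escaping disabled, so a variable holding this template's output is probably better written out with xsl:copy-of than with xsl:value-of.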
Thanks very much!!!
--------------------------------
Marco Guazzone
Software Engineer
Kerbero S.r.L. - Gruppo TC
Viale Forlanini, 36
Garbagnate M.se (MI)
20024 - Italy
mail: marco.guazzone@xxxxxxxxxxx
www: http://www.kerbero.com
Tel. +39 02 99514.247
Fax. +39 02 99514.399
--------------------------------
On Fri, 20 Dec 2002, Michael Kay wrote:
> You are making things ridiculously complicated. If you are producing
> output that you want to view in an editor that can't understand UTF-8,
> just set <xsl:output encoding="us-ascii"/> as you were initially
> advised.
>
> Michael Kay
> Software AG
> home: Michael.H.Kay@xxxxxxxxxxxx
> work: Michael.Kay@xxxxxxxxxxxxxx
>
> > -----Original Message-----
> > From: owner-xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> > [mailto:owner-xsl-list@xxxxxxxxxxxxxxxxxxxxxx] On Behalf Of
> > Marco Guazzone
> > Sent: 20 December 2002 11:55
> > To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> > Subject: Re: [xsl] Yet Another Entity Ref question!
> >
> > Hi David,
> > your idea is good.
> > Currently I'm using the LibXSLT processor version 1.0.23
> > (with libxml2 version 2.4.30). However, with this I will
> > produce the encoded UNICODE char in the output
> > (HTML) doc; i.e.
> > XML:
> > <doc>
> >   <label>Foobar<entity>copy</entity></label>
> >   <entity-dict>
> >     <entity-item name="copy" value="©" />
> >   </entity-dict>
> > </doc>
> >
> > XSL:
> > <!-- ... like the previous except for: -->
> > <xsl:template match="entity">
> >   <xsl:value-of select="/doc/entity-dict/entity-item[@name =
> >     current()]/@value" />
> > </xsl:template>
> >
> > This produces as output
> >   Foobar(C)Foobar(C)Foobar(C)
> > where (C) is the encoded value of ©
> > This may cause problems in non-UNICODE editors or browsers,
> > especially if I include the result in a source document (e.g.
> > Perl, C) as a return value of a function (problems may arise
> > in the compiling/interpreting phase). Instead what I would
> > generate is: Foobar&copy; or more generally: Foobar&ent;
> > where "ent" is specified by an anonymous user in XML via:
> > <entity>ent</entity> What do you think about it?
> >
> > On Fri, 20 Dec 2002, David Carlisle wrote:
> >
> > > <xsl:template match="doc">
> > >   <xsl:apply-templates select="label" /> <!-- ok! -->
> > >   <xsl:variable name="label">
> > >     <xsl:apply-templates select="label" />
> > >   </xsl:variable>
> > >   <xsl:value-of select="$label" /> <!-- not ok -->
> > >   <xsl:value-of disable-output-escaping="yes" select="$label" />
> > >   <!-- ok -->
> > > </xsl:template>
> > >
> > > which processor are you using?
> > >
> > > d-o-e is optional so a processor can ignore it altogether, but if it
> > > supports it at all I think that in xslt1 the character should keep the
> > > d-o-e property even when it goes through the variable.
> > >
> > > Is your input form fixed?
> > >
> > > It would be easier if your
> > >   <ent>xxx</ent>
> > > only took entity names, as then you could convert them easily to
> > > characters without using d-o-e just by looking them up in a document
> > > of the form
> > >
> > >   <entity name="copy" char="©"/>
> > >   ...
> > >
> > > There is no need to have an input form of
> > >   <ent>#x0A</ent>
> > >
> > > as the user can more simply write
> > >   &#x0A;
> > > which then doesn't need any processing at all at the xslt level.
> > >
> > > David
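
David's lookup suggestion quoted above could be sketched along these lines. The file name entities.xml, its <entities> root element, and the output settings are assumptions made for the illustration; only the <ent> element and the <entity name="..." char="..."/> form come from his message.

```xml
<?xml version="1.0"?>
<!--
  Assumed lookup file, here called entities.xml (hypothetical name):

    <entities>
      <entity name="copy" char="&#xA9;"/>
    </entities>
-->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="html" encoding="us-ascii"/>

  <!-- Load the table once; a relative URI passed to document() as a
       string is resolved against the stylesheet's own location. -->
  <xsl:variable name="entities"
      select="document('entities.xml')/entities/entity"/>

  <!-- Replace <ent>copy</ent> with the character stored for "copy".
       No d-o-e is involved: the serializer decides how to write it. -->
  <xsl:template match="ent">
    <xsl:value-of
        select="$entities[@name = normalize-space(current())]/@char"/>
  </xsl:template>

</xsl:stylesheet>
```

A name that is missing from the table simply produces no output in this sketch; a real stylesheet would probably want to report it, for example with xsl:message.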
XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
======================================================================
Wendell Piez                            mailto:wapiez@xxxxxxxxxxxxxxxx
Mulberry Technologies, Inc.                http://www.mulberrytech.com
17 West Jefferson Street                    Direct Phone: 301/315-9635
Suite 207                                          Phone: 301/315-9631
Rockville, MD 20850                                  Fax: 301/315-8285
----------------------------------------------------------------------
  Mulberry Technologies: A Consultancy Specializing in SGML and XML
======================================================================