RE: Escaped Character Entity's and the MSXML3 parser..... was formerly RE: [xsl] ampersand output

Subject: RE: Escaped Character Entity's and the MSXML3 parser..... was formerly RE: [xsl] ampersand output
From: "Julian Reschke" <julian.reschke@xxxxxx>
Date: Fri, 19 Oct 2001 14:17:30 +0200
Again:

you obviously still don't understand the difference between the
representation in the XML serialization and the values that you can get and
set via the DOM. The format is different. This is by design. In the XML
serialization, characters like the ampersand are escaped. When accessing
them through the DOM, they aren't.
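
To make the distinction concrete, here is a minimal sketch in Python using the
standard library's xml.dom.minidom rather than MSXML3 (an assumption made
purely for illustration; the `img` element and `alt` attribute are
hypothetical names, not part of the original application). It reads the
footnote attribute through the DOM, where the ampersand comes back unescaped,
and then writes that value into a second document through the DOM as well, so
the serializer re-escapes it rather than the raw string being pasted into
markup by hand.

```python
from xml.dom.minidom import parseString, getDOMImplementation

# Serialized form, as it would sit in the database field: "&" is escaped.
source_xml = '<anImage footnote="Painted by Bill &amp; Jane Painter"/>'

# "Parser instance A": load the serialized document.
doc_a = parseString(source_xml)

# The DOM hands back the attribute *value*, with no escaping.
footnote = doc_a.documentElement.getAttribute("footnote")
print(footnote)

# Build the second document through the DOM instead of concatenating
# strings. The element and attribute names here are hypothetical.
doc_b = getDOMImplementation().createDocument(None, "img", None)
doc_b.documentElement.setAttribute("alt", footnote)

# Serializing escapes the ampersand again, so the result is well-formed.
serialized = doc_b.documentElement.toxml()
print(serialized)

# "Parser instance B": the serialized output loads without error and
# yields the same unescaped value once more.
round_trip = parseString(serialized).documentElement.getAttribute("alt")
assert round_trip == footnote
```

With this approach the number of load and read cycles does not matter: each
serialization escapes exactly once and each parse unescapes exactly once.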

Julian


> -----Original Message-----
> From: owner-xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> [mailto:owner-xsl-list@xxxxxxxxxxxxxxxxxxxxxx]On Behalf Of
> jdgarrett@xxxxxxxxxx
> Sent: Friday, October 19, 2001 1:58 PM
> To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> Subject: Escaped Character Entity's and the MSXML3 parser..... was
> formerly RE: [xsl] ampersand output
>
>
> Since most of this revolves around the specific
> use of Microsoft MSXML3 parser vs. xslt in general
> ....I believe the issue is similar to the one
> I am currently struggling with...(some on here
> have yet to comprehend my dilemma and the replies
> indicate that fact)
>
> So "regardless" of what the XSLT spec currently
> states....the real world is having a difficult
> time in this one particular area using XML
> ...so before some readers say this scenario has been hashed
> before....a little time to make the case is requested...
>
>
> A common multi-application scenario would be that
> app#1 provides a web interface that accumulates
> data from a client so that they can customize
> the visual aspects of a web page for a web application
> ...i.e. app#1 (using a web interface) asks the user
> to check off background colors, font sizes, font types,
> font colors, hyperlink addresses and image hrefs...etc
> and saves all that off in an xml structure...
>
> <CustomOptionsForClientsWebPage>
> 	<page>
> 		<tabledatacell number="1">
> 			<anImage footnote="Painted by Bill &amp; Jane Painter"
> 				href="http://someserver/images/MSASPApp.asp&amp;ImageName=bluegoose">bluegoose.jpg</anImage>
> 			<texttag>This is a blue goose</texttag>
> 			<fontcolor>#123456</fontcolor>
> 			<fontname>verdana</fontname>
> 		</tabledatacell>
> 	</page>
> 	.
> 	.
> 	.
> </CustomOptionsForClientsWebPage>
>
> App#1 would also provide a preview mode for the client
> so they might look at a demo page before saving
> this off for later use in production...
>
> So the above structure is assembled either on the
> client and posted as one long xml structure or
> is assembled on a server ...in either case....the
> xml structure is created and stored in a database
> field....(as a binary type for those that wonder just how
> it is done).
>
>
> Now when preview is called upon, the preview mode
> calls upon App#2 to assemble the structure into
> a preview HTML page string ....the reason for App#2 to
> do this and not App#1 is that preview would be
> the same method for building a final page for production
> as it would be for preview ...all that is required
> is that right before App#2 response.writes to the
> client in production mode, the string is routed
> to a new opennewwindow function on the client who
> is running App#1 (to allow them to preview their choices)
> and they see the production page in a second browser
> window as a preview....
>
> App#2 reads from the database field the xml structure
> and then loads that xml structure up in the MSXML3
> parser and then creates an XSLT page (or HTML page)
> from many string bits with html tag attributes....
>
> And finally here is the problem with all this....
>
>
> When the program (App#2) assembles the XSLT or HTML
> string structure...
> 1...it first reads the xml structure from the database
> 2...it then loads the xml structure into MSXML3 parser (instance A)
> 3...it then navigates the xml structure via MSXML3 to the
>    property it needs to read
> 4...it then places the value from the xml structure into
>    an HTML element attribute ...
> 5...and then will load that final structure up
>    into another instance of MSXML3 parser (instance B)
>    and transform it into HTML and response.write that to the client...
>
>
>
> seems fine doesn't it ....and does work very well until
> you run into the following snag....
>
>
>
> If you need to set some html property in 4 that has
> a character entity in it, then when you read it from parser
> instance A into the html element or attribute property,
> it is transformed by instance A and no longer is escaped...
> ...so that when parser instance B attempts to load it...
> ...parser instance B fails on the load ...
>
> e.g. the footnote attribute in the anImage element in
> the example above...
> parser A will load the following fine...
> 	footnote="Painted by Bill &amp; Jane Painter"
> parser B will not load the following
> footnote="Painted by Bill & Jane Painter"
>
> Now some on here have said ...double escape it ...meaning
> that you would escape twice so that when parser instance
> A loaded it, then it would still be escaped so that
> when parser instance B loaded it, then it would load ...
>
> e.g.
> footnote="Painted by Bill &amp;amp; Jane Painter"
>
> which would appear logical, except when you realize that
> the number of times the xml structure is loaded and
> read in App#1 would not be the same number as
> App#2, and therefore, double escaping would be out of
> sync between the 2 applications...
>
> So the builder of the applications is reduced to not
> being able to find a method by which they can
> depend upon the MSXML3 parser to reliably provide
> a string value that has a character entity in it..
>
> the simple rule provided by Microsoft is ...
> if a xml element or attribute has a character entity
> in it   e.g.
>
> &
> <
> >
> '
>
> you can escape it once so that it will load successfully
> ..but if you try to read it out and then put it into
> another structure that will be subsequently loaded by
> any other instance of the MSXML3 parser, it of course
> will fail unless you do 1 of 6 things...
>
> 1.)
> escape the character entity x number of times in the original structure
> and then keep some counter somewhere that tells you
> how many times you have loaded it and read it out
> ...and know how many total times you will do this
> before you do some final thing with it...
>
> 2.)
> you write a nice little custom function that adds
> and removes unescaped character entities after each
> read from a parser and before each load ....
> (oh hey nice boat anchor you wrapped on that
> MSXML3 speed boat there buddy...)
>
> 3.)
> use custom character entities instead of what is provided for by the xml
> community...
>
> so instead of using   "&amp;" for an "&", you would use
> "JoeAmp&amp;JoeAmp", and then you would always have to send
> all the reads and writes from the parser to the function that
> adds or removes your little custom escape  (like that would make
> a nice improvement on scalability wouldn't it)
>
> 4.)
> load it only once in a single instance of the parser
> and develop only monolithic web applications
>
> 5.)
> don't use character entities
>
> 6.)
> don't use MSXML3 nor XML nor XSLT...
>
>
> Well before I go let me just add again...
>
> that I did find that you could write to the text
> property of the MSXML3 parser after the initial load
> and once you wrote an escaped character entity to
> the text property it stayed escaped.....which was
> a bit odd since the text property automagically
> escaped the value when loading the original XML
> structure....
>
> The thing that is interesting is if the character
> entity will stay escaped after you load it and then
> subsequently immediately re-write the original value
> to it ....what is going on inside the MSXML3 parser
> to not cause the text value to transform after writing
> to the text value ....is there some secret switch...
>
> ...in a state of wonder...
>
> Thanks for the time for those who read this all the
> way through to understand some of the things real world
> coders are dealing with in the production mode...
>
> Sincerely
> JDGarrett
> (p.s. forgive me of any typo's)
>
>
>
>
>
>
> |-----Original Message-----
> |From: owner-xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> |[mailto:owner-xsl-list@xxxxxxxxxxxxxxxxxxxxxx]On Behalf Of Julian
> |Reschke
> |Sent: Thursday, October 18, 2001 1:31 PM
> |To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> |Subject: RE: [xsl] ampersand output
> |
> |
> |Look at the FAQ again :-)
> |
> |Ampersands in the HTML src attribute MUST be escaped as "&amp;" (otherwise
> |it's invalid HTML).
> |
> |
> |> -----Original Message-----
> |> From: owner-xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> |> [mailto:owner-xsl-list@xxxxxxxxxxxxxxxxxxxxxx]On Behalf Of Eric Vitiello
> |> Sent: Thursday, October 18, 2001 8:07 PM
> |> To: xsl-list
> |> Subject: [xsl] ampersand output
> |>
> |>
> |> I've been following the &nbsp; thread, and looked up my question
> |> in the FAQ, but I've been unable to find an answer.
> |>
> |> I've also seen some messages with examples of exactly what I'm
> |> trying to do, but they aren't working...
> |>
> |> I'm trying to output the following stylesheet:
> |>
> |> <?xml version="1.0"?>
> |> <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
> |> version="1.0">
> |> <xsl:output method="html"/>
> |>
> |>   <xsl:template match="/family-tree">
> |>     <html>
> |>       <body>
> |>       	<embed
> |> src="/default.asp?{'person=p1'}{'&amp;tree='}{@surname}"
> |> width="600" height="300" type="image/svg+xml"/>
> |>       </body>
> |>     </html>
> |>   </xsl:template>
> |>
> |>
> |> the problem is the SRC tag.  instead of outputting a & it is
> |> outputting &amp;  so the output looks like:
> |>
> |> <html>
> |> 	<body>
> |> 		<embed
> |> src="/default.asp?person=p1&amp;tree=vitiello" width="600"
> |> height="300" type="image/svg+xml"/>
> |> 	</body>
> |> </html>
> |>
> |> I have also tried &#38;  but it also outputs &amp;
> |>
> |> I am using MSXML 3.0.
> |>
> |> any ideas?
> |>
> |> Eric Vitiello
> |> perceive designs
> |> <www.perceive.net>
> |>
> |>
> |>  XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
> |>


 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list

