Re: [xsl] Integrating a Search and Replace template with the CSV to XML converter

Subject: Re: [xsl] Integrating a Search and Replace template with the CSV to XML converter
From: Marney Cotterill <marney@xxxxxxxxxxxxxxxxxxxx>
Date: Tue, 03 Jun 2008 17:56:49 +1000
Thank you so much, Michael, for your detailed response!

I have thrown myself into XSLT and XML without any prior knowledge, and seem
to have missed quite a few of the basics!

I will look further into encoding types for my own benefit, but what you
have suggested below works absolutely perfectly.

I would like to post my XSLT stylesheet on a template exchange at
www.dnndev.com to be used in conjunction with the Dot Net Nuke add-on module
Xmod. There is a real need for this type of transform in this community.

Andrew, do you have a problem with this? I will make sure you have full
credit!

Kindest Regards,
Marney




On 3/6/08 5:43 PM, "Michael Kay" <mike@xxxxxxxxxxxx> wrote:

>> The characters that are affecting things are part of the
>> UNICODE set 'General Punctuation'. This is translating
>> through the stylesheet fine and is being displayed in the
>> resulting XML by &#146; (right hand quote) and &#150; (en
>> dash). Problem is, my dynamic website does not know how to
>> display these characters, and I am getting the little boxes instead.
> 
> It's not surprising that it doesn't know how to display them, since neither
> of these codepoints is assigned to any printable Unicode character. The
> Unicode codepoint for en dash is x2013; the code for "right single quotation
> mark" is x2019. 
> 
> What has happened is that your input uses the Microsoft-proprietary cp1252
> character encoding. There's no harm in that, provided that the software
> reading the file knows it's in this encoding, so that it can translate such
> characters to their proper Unicode values for use in the output XML.
>> 
>> I am thinking of integrating a Global Search and Replace
>> template that runs on the final XML to find all instances of
>> &#146; and replace with ' .
> 
> No, you should fix the problem at source rather than patching it up later.
> If you're reading the CSV file using unparsed-text(), and if the CSV file is
> in cp1252 encoding, then you can specify this in the encoding parameter to
> unparsed-text().
> 
> Michael Kay
> http://www.saxonica.com/
> 
> 
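
For anyone finding this thread later, the suggestion above can be sketched roughly as follows. This is only a minimal illustration, not the actual converter stylesheet: the parameter names (csv-uri, csv-encoding), the file name input.csv, the template name main and the rows/row output elements are all made up for the example, and the set of encoding names a processor accepts is implementation-dependent ('windows-1252' is assumed here as the registered name for cp1252).

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical parameters: where the CSV lives and how it is encoded. -->
  <xsl:param name="csv-uri" select="'input.csv'"/>
  <!-- Assumption: the processor recognises this name for cp1252. -->
  <xsl:param name="csv-encoding" select="'windows-1252'"/>

  <xsl:output method="xml" encoding="UTF-8" indent="yes"/>

  <!-- Hypothetical entry point, run as the initial named template. -->
  <xsl:template name="main">
    <rows>
      <!-- The second argument tells unparsed-text() how to decode the bytes,
           so cp1252 byte 0x92 arrives as U+2019 and 0x96 as U+2013
           instead of turning into &#146; and &#150;. -->
      <xsl:for-each select="tokenize(unparsed-text($csv-uri, $csv-encoding), '\r?\n')[normalize-space()]">
        <row><xsl:value-of select="."/></row>
      </xsl:for-each>
    </rows>
  </xsl:template>

</xsl:stylesheet>
```

With the encoding declared at the point where the file is read, no search-and-replace pass over the output should be needed. The real stylesheet would split each line into fields where this sketch simply copies the line.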

Marney Cotterill
graphic designer
                   
cracker//brandware

6 Bourke Street
Queens Park 
NSW 2022
Telephone 02 9387 2001
Facsimile 02 9387 2006
marney@xxxxxxxxxxxxxxxxxxxx
www.crackerbrandware.com
