[xsl] About encoding - or something I relate to it ...

Subject: [xsl] About encoding - or something I relate to it ...
From: "Karl Koch" <TheRanger@xxxxxxx>
Date: Thu, 21 Jul 2005 15:32:48 +0200 (MEST)
Hello group,

I have a content set about books which I am currently transforming. I have
solved most of my questions here with all your kind help. I also have to
clean some of the text of weird characters that appear to follow a pattern.
I am not sure what causes them, so I will show some examples in case some of
you know what this could be, or whether it is some kind of encoding problem.
Perhaps there is something I can do in general to fix the entire collection
by changing the encoding of the data (I am not an expert in this, so please
accept my apologies if I am talking rubbish here).

Effect 1: There are "q" letters in front of sentences (e.g. "...it has to
take place in the Old West. qAnd make sure that ..."). I see this effect at
thousands of places scattered all over the collection. I am not sure whether
this is an encoding problem; perhaps we can discuss that here. However, I
think I could deal with it in XSLT by recognising the character with a
regular expression and deleting it. The regular expression should have the
following structure: first a dot followed by a space (to recognise the
beginning of a sentence), then a lower-case q, and then any of the 26
capital letters. How can I do that in XSLT?
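
The pattern described above (dot, space, lower-case q, capital letter) could be sketched roughly as follows. This is only a minimal sketch under stated assumptions: it assumes an XSLT 2.0 processor, since the replace() function is not available in XSLT 1.0, and it assumes the stray "q" always sits in the same text node as the preceding ". ". The identity template is included only to make the stylesheet self-contained; it is not taken from the original transformation.

```xslt
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Identity template (assumption): copy every node unchanged by default -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Text nodes: keep the ". " (group 1) and the capital letter (group 2),
       and drop the lower-case "q" that stands between them -->
  <xsl:template match="text()">
    <xsl:value-of select="replace(., '(\. )q([A-Z])', '$1$2')"/>
  </xsl:template>

</xsl:stylesheet>
```

With this template, "...in the Old West. qAnd make sure that ..." would become "...in the Old West. And make sure that ...". A "q" at the very start of a text node, or one preceded by other punctuation such as "? " or "! ", would not be matched by this pattern and would need its own rule.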

Effect 2: I have question marks at places where I would expect to find a
special character like ' or # or $. Example: "...which was the Company???s
first drama production in ...". Second example: "??6 to ??8" in a price tag
where it should be "#6 to #8" or "$6 to $8".

What do you think about that?

Best Regards,
Karl
