Re: [xsl] Encoding issues with document() function

Subject: Re: [xsl] Encoding issues with document() function
From: "Pankaj Bishnoi" <pankaj.bishnoi@xxxxxxxxxxx>
Date: Sat, 4 Nov 2006 20:04:20 +0530
Hi,
I am having trouble removing 0xb, 0xc, 0xe and 0xf. What is the
representation of these characters in UTF-8? For 0x1 I am using
'\u0001' and it works fine, but the problem is with 0xb, 0xc, 0xe and 0xf.

thanks
pankaj
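
One way to cover 0xb, 0xc, 0xe, 0xf and the other disallowed control
characters in a single pass is a regular expression built from the XML 1.0
Char production. The following is only a sketch, assuming Java (the thread
never names the language, although replace('\u0002','') looks like it). The
class and method names are made up for illustration. Note that 0x9, 0xa and
0xd (tab, line feed, carriage return) are legal in XML 1.0, so they are left
alone.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not from the thread: removes every character that
// the XML 1.0 "Char" production does not allow.
class XmlCharFilter {

    // Allowed in XML 1.0: #x9, #xA, #xD, #x20-#xD7FF, #xE000-#xFFFD,
    // #x10000-#x10FFFF. Anything else (0x1-0x8, 0xB, 0xC, 0xE-0x1F, ...)
    // matches this negated character class.
    private static final Pattern INVALID_XML_CHARS = Pattern.compile(
        "[^\\x09\\x0A\\x0D\\x20-\\uD7FF\\uE000-\\uFFFD\\x{10000}-\\x{10FFFF}]");

    static String strip(String text) {
        return INVALID_XML_CHARS.matcher(text).replaceAll("");
    }

    public static void main(String[] args) {
        // Build the test input from numeric code points so the source file
        // itself contains no raw control characters.
        String dirty = "a" + (char) 0x02 + "b" + (char) 0x0B + (char) 0x0C
                     + "c" + (char) 0x0E + (char) 0x0F + "\td\n";
        System.out.println(strip(dirty).equals("abc\td\n"));
    }
}
```

The escapes are doubled ("\\x0B") so that the regex engine, not the Java
compiler, interprets them. The cleaned text would then have to be written
back out before document() reads the file.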
----- Original Message ----- 
From: "Joe Fawcett" <joefawcett@xxxxxxxxxxx>
To: <xsl-list@xxxxxxxxxxxxxxxxxxxxxx>
Sent: Saturday, November 04, 2006 6:00 PM
Subject: Re: [xsl] Encoding issues with document() function


> The encoding doesn't matter. XML cannot contain 0xb, 0xc, 0xe or 0xf.
> You can Base64-encode the data, if it's part of an element's content, before
> passing it to the XML parser, or replace the characters with allowed ones
> and then post-process the data later to re-insert them.
>
> Joe
>
>
> >From: "Pankaj Bishnoi" <pankaj.bishnoi@xxxxxxxxxxx>
> >Reply-To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> >To: <xsl-list@xxxxxxxxxxxxxxxxxxxxxx>
> >Subject: Re: [xsl] Encoding issues with document() function
> >Date: Sat, 4 Nov 2006 17:53:11 +0530
> >
> >Thanks for your help, Michael. I am now replacing the Unicode characters.
> >
> >The encoding is UTF-8 now.
> >
> >For 0x2 I can use replace('\u0002','')
> >
> >but what should the replace character be for the following characters?
> >
> >0xa,0xb,0xc,0xd,0xe,0xf
> >
> >
> >Thanks
> >Pankaj
> >
> >----- Original Message -----
> >From: "Michael Kay" <mike@xxxxxxxxxxxx>
> >To: <xsl-list@xxxxxxxxxxxxxxxxxxxxxx>
> >Sent: Saturday, November 04, 2006 3:08 PM
> >Subject: RE: [xsl] Encoding issues with document() function
> >
> >
> > > If the document really does contain the Unicode character with codepoint
> > > 0x02, then it's not a well-formed XML document, and you won't be able to
> > > read it from XSLT or from anything else that's designed to process XML.
> > > You need to correct the program that created the document so that it
> > > outputs well-formed XML.
> > >
> > > The other possibility is that the document contains some other character
> > > which is being misread as codepoint 0x02 because the parser is using the
> > > wrong encoding, for example because the XML declaration is incorrect.
> > >
> > > Michael Kay
> > > http://www.saxonica.com/
> > >
> > > > -----Original Message-----
> > > > From: Pankaj Bishnoi [mailto:pankaj.bishnoi@xxxxxxxxxxx]
> > > > Sent: 04 November 2006 09:24
> > > > To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> > > > Subject: [xsl] Encoding issues with document() function
> > > >
> > > > > Hi All,
> > > > >         I have an XSL stylesheet in which I use the XSLT document()
> > > > > function. The problem I am facing is that the XML file I am
> > > > > trying to read with document() contains some Unicode characters,
> > > > > and the exception thrown at transformation time is:
> > > >
> > > > SystemId Unknown; Line #133;Column #104; Can not load
> > > > requested doc: An invalid XML character(Unicode: 0x2) was
> > > > found in the element content of the document
> > > >
> > > > > The source XML file has encoding UTF-8. I searched the web
> > > > > for this issue, and one workaround suggested is to replace
> > > > > those '0x2' characters.
> > > > > Other characters, such as 0x1 or 0x13, might also turn up in
> > > > > other scenarios. My question is: is there any encoding that
> > > > > supports all these characters?
> > > > >
> > > > > Is there any way around this issue? Any help will be highly
> > > > > appreciated.
> > > >
> > > > Thanks
> > > > Pankaj
