RE: [xsl] Converting delimited text WITH <br> to string

From: <Jarno.Elovirta@xxxxxxxxx>
Date: Tue, 22 Jun 2004 08:06:05 +0300
Hi,

> Friends,
> I guess I missed the answer to this one. I have read a lot of FAQs,
> but I have not found my particular answer.
> 
> All I want to do is to compare an XML file with a text file.
> 
> My desire is to convert the text file into a string then compare the
> data in it to the XML nodes. However, the text file always gets
> parsing errors.

Because you're trying to parse something that's not XML with an XML parser.
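
As a rough illustration of why the parser complains, here is a small Python sketch using the standard library's xml.etree.ElementTree. The sample strings are made up for the example and are not taken from the actual ftscript.data file; the helper name is_well_formed is likewise hypothetical.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if text parses as a complete XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Hypothetical sample data, not copied from the real export.
sloppy = "<field>First line<br>Second line</field>"   # <br> is never closed
closed = "<field>First line<br/>Second line</field>"  # XHTML-style empty tag
plain = "title | composer | First line<br>Second line"  # no root element

print(is_well_formed(sloppy))
print(is_well_formed(closed))
print(is_well_formed(plain))
```

Only the second string is well-formed XML. An unclosed <br>, or delimited text with no root element at all, is rejected before any stylesheet gets to see the data.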

> The text file has is exported from a OLD database, but the fields do
> have <br> and other sloppy html in them.

Then you have to clean it first by removing the HTML tags, or by converting the "document" into XML (XMLized HTML or XHTML).
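
One possible way to do the first kind of cleanup (removing the tags) is sketched below in Python, using the standard library's html.parser, which tolerates sloppy HTML. It assumes you want each <br> turned into a newline and every other tag simply dropped; the names TagStripper and strip_html are made up for this example, and fetching the file and splitting it into fields is left out.

```python
from html.parser import HTMLParser


class TagStripper(HTMLParser):
    """Collect text content, dropping tags and turning <br> into newlines."""

    def __init__(self):
        # convert_charrefs=True decodes references such as &amp; into text.
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # The parser lowercases tag names, so <BR> arrives here as "br".
        # <br/> also ends up here via the default handle_startendtag.
        if tag == "br":
            self.parts.append("\n")

    def handle_data(self, data):
        self.parts.append(data)


def strip_html(text: str) -> str:
    """Return text with HTML tags removed (assumed cleanup rule, see above)."""
    parser = TagStripper()
    parser.feed(text)
    parser.close()  # flush any buffered trailing text
    return "".join(parser.parts)


# Hypothetical field value, not taken from the real file.
sample = "First line<br>Second <b>bold</b> line<BR>A &amp; B"
print(strip_html(sample))
```

The resulting plain string can then be compared with the text of the XML nodes, or escaped and wrapped in an element if you would rather feed it to the stylesheet as XML.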
 
> I would edit them, but there are over 300 of them all in different
> folders (lucky for me they are on the same server).
> 
> Here is the URL.
> 
> http://lcweb2.loc.gov/music/ftp/951201/06180001/ftscript.data

Exactly what do you want to compare, and what does the XML you want to compare it with look like?

> <?xml version="1.0" encoding="utf-8"?>
> <!DOCTYPE xsl:stylesheet [ 
> <!ENTITY lll SYSTEM
> "http://lcweb2.loc.gov/music/ftp/951201/06180001/ftscript.data">
> <!ENTITY nbsp "&#x20;">
> <!ELEMENT br (EMPTY)>
> <!ELEMENT BR (EMPTY)>

Declaring the elements will not help you with the parsing errors, because the file is not XML.

> ]>
> 
> <xsl:stylesheet 
> 	version="1.0" 
> 	xmlns:xsl="http://www.w3.org/1999/XSL/Transform" 
> 	xmlns:xs="http://www.w3.org/2001/XMLSchema" 
> 	xmlns:html="http://www.w3.org/1999/xhtml" 
> 	exclude-result-prefixes="html xs" 
>   xmlns:saxon="http://icl.com/saxon"
>   extension-element-prefixes="saxon"
> >
> <xsl:output 
> 	version="1.0" 
> 	method="html" 
> 	indent="yes" 
> 	encoding="utf-8" 
> 	omit-xml-declaration="no" 
> 	standalone="no" 
> 	media-type="text" 
> 	cdata-section-elements="br"
> />
> 
> <xsl:template match="/">
> <X>
> <xsl:copy-of
> select="document('http://lcweb2.loc.gov/music/ftp/951201/06180001/ftscript.data')//*"/>
> <xsl:apply-templates
> select="document('http://lcweb2.loc.gov/music/ftp/951201/06180001/ftscript.data')//text()
> | *"/>
> <xsl:copy-of
> select="document('http://lcweb2.loc.gov/music/ftp/951201/06180001/ftscript.data')//text()"/>

What should the above do? Or rather, what do you want the above to do?

> <xsl:text>&lll;</xsl:text>
> </X>
> </xsl:template>
> 
> </xsl:stylesheet>

Cheers,

Jarno - Cubanate: Transit
