Subject: Re: [xsl] Correcting unbound namespace prefixes
From: "G. Ken Holman" <gkholman@xxxxxxxxxxxxxxxxxxxx>
Date: Mon, 02 Aug 2010 12:22:22 -0400
I'm not sure this is the correct place to post.
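from xml.sax import make_parser

# Turn off namespace processing so an unbound prefix is reported as part
# of the element name instead of being rejected by the parser.
parser = make_parser()
parser.setFeature( "http://xml.org/sax/features/namespaces", False )
parser.setContentHandler( handler )
parser.parse( fileIn )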
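As a minimal sketch of how the snippet above might be wired up: the handler class `PrefixCollector`, the helper `collect_prefixes`, and the inline `SAMPLE` string are all hypothetical names introduced here for illustration, standing in for the `handler` and `fileIn` that the snippet leaves undefined. With namespace processing off, Python's default SAX parser should report raw qualified names such as `pp:metadata` rather than failing on the unbound prefix.

```python
import io
from xml.sax import make_parser
from xml.sax.handler import ContentHandler, feature_namespaces

# Hypothetical sample modelled on the input described in this thread.
SAMPLE = (
    '<document>'
    '<text>Four score</text>'
    '<pp:metadata publication-date="2010-07-31T12:30:00Z" />'
    '</document>'
)


class PrefixCollector(ContentHandler):
    """Illustrative handler: records the prefix of every prefixed element name."""

    def __init__(self):
        super().__init__()
        self.prefixes = set()

    def startElement(self, name, attrs):
        # With namespaces off, `name` is the raw qualified name, e.g. "pp:metadata".
        if ":" in name:
            self.prefixes.add(name.split(":", 1)[0])


def collect_prefixes(xml_text):
    """Parse without namespace processing and return the element prefixes seen."""
    handler = PrefixCollector()
    parser = make_parser()
    parser.setFeature(feature_namespaces, False)
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(xml_text.encode("utf-8")))
    return handler.prefixes
```

Note that this sketch collects every element prefix, declared or not, and ignores prefixed attributes; telling bound from unbound prefixes would also require tracking `xmlns:*` attributes.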
This may be a question about JAXP, or simply about good standard operating procedure for bad input data.
I've got some XML that I know is invalid, but I'm not in a position to get the customer to fix it. Here's what it looks like:
<document>
  <text>Four score and twenty years ago..,</text>
  <pp:metadata publication-date="2010-07-31T12:30:00Z" />
  ...
You get the idea (I hope): clearly someone began with XML in the "" namespace, extracted metadata in a post-processing step, and inserted the corresponding markup without adding the necessary namespace declarations or mapping "pp" to one. I don't know of a way to fix this through the JAXP API (i.e. interpolating the prefix mapping). Or am I better off just preprocessing this XML via Perl or Python before it's ever parsed?
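The preprocessing route asked about here could be sketched in Python roughly as follows. This is only one possible approach, not something prescribed in the thread: it parses with namespace processing off, then re-serializes with the missing declaration added to the root element. The class `DeclareOnRoot`, the function `bind_prefixes`, and the URI `urn:example:pp` are assumptions made for illustration; the real namespace URI for "pp" would have to come from whoever produced the data.

```python
import io
from xml.sax import make_parser
from xml.sax.handler import feature_namespaces
from xml.sax.saxutils import XMLGenerator

# Assumption: placeholder URI, the real one is not known from the thread.
PP_URI = "urn:example:pp"


class DeclareOnRoot(XMLGenerator):
    """Illustrative serializer: copies the input and adds xmlns:* to the root."""

    def __init__(self, out, declarations):
        super().__init__(out, encoding="utf-8")
        self._declarations = declarations
        self._seen_root = False

    def startElement(self, name, attrs):
        attrs = dict(attrs.items())
        if not self._seen_root:
            self._seen_root = True
            for prefix, uri in self._declarations.items():
                # Keep any declaration the document already carries.
                attrs.setdefault("xmlns:" + prefix, uri)
        super().startElement(name, attrs)


def bind_prefixes(xml_bytes, declarations):
    """Return the document as text with the given prefixes declared on the root."""
    out = io.StringIO()
    parser = make_parser()
    # Namespace processing stays off so the unbound prefix does not abort parsing.
    parser.setFeature(feature_namespaces, False)
    parser.setContentHandler(DeclareOnRoot(out, declarations))
    parser.parse(io.BytesIO(xml_bytes))
    return out.getvalue()
```

The repaired text from `bind_prefixes(data, {"pp": PP_URI})` should then be acceptable to a namespace-aware parser, JAXP included. Comments and the exact form of empty elements are not preserved by this sketch.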
Tony Nassar Ph.D. Palantir Technologies | Forward Deployed Engineer
--
XSLT/XQuery training: after http://XMLPrague.cz 2011-03-28/04-01
Vote for your XML training: http://www.CraneSoftwrights.com/s/i/
Crane Softwrights Ltd. http://www.CraneSoftwrights.com/s/
G. Ken Holman mailto:gkholman@xxxxxxxxxxxxxxxxxxxx
Male Cancer Awareness Nov'07 http://www.CraneSoftwrights.com/s/bc
Legal business disclaimers: http://www.CraneSoftwrights.com/legal