Re: [jats-list] validating NLM using python, any tips?

From: Rajagopal CV <cvr3@xxxxxxxxxxxxxxxx>
Date: Fri, 29 Nov 2013 13:21:18 +0530
Yet another method is to use an Ant script.

<project basedir="./" default="parse" name="crossplatform.script">
  <property name="NLM-DTD-resources"  value="NLM-DTD-resources"/>
  <xmlcatalog id="nlm.dtds">
    <dtd publicId="-//NLM//DTD Journal Publishing DTD v3.0 20080202//EN"
         location="${basedir}/${NLM-DTD-resources}/dtd/NLM/publishing/journalpublishing3.dtd"/>
  </xmlcatalog>
  <target name="parse">
    <echo>Validating the ${input}</echo>
    <xmlvalidate failonerror="yes" warn="yes" file="${input}">
      <xmlcatalog refid="nlm.dtds"/>
    </xmlvalidate>
  </target>
</project>

Save the above in a file named "build.xml" and keep the NLM DTD resources
in a folder named "NLM-DTD-resources" next to it.

ant -buildfile build.xml -Dinput=file.xml

OR

ant -Dinput=file.xml


The only advantage is that this is cross-platform :-)

I too use xmllint on a Linux box, which is very handy.
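
Since the original question asked for something in Python, here is a
minimal sketch that simply shells out to either of the two tools
mentioned in this thread. The function names (validation_command,
validate) are my own and purely illustrative. It assumes that xmllint or
ant is on the PATH, that build.xml is the file shown above, and, for
xmllint, that the DOCTYPE in the document can be resolved (via an XML
catalog or network access). Treat it as a starting point rather than a
tested solution.

```python
import shutil
import subprocess
import sys


def validation_command(xml_path, tool="xmllint", buildfile="build.xml"):
    """Return the argument list that validates xml_path with the given tool."""
    if tool == "xmllint":
        # Same flags as the xmllint command quoted in this thread.
        return ["xmllint", "--noout", "--loaddtd", "--valid", xml_path]
    if tool == "ant":
        # Ant properties are passed as -Dname=value.
        return ["ant", "-buildfile", buildfile, "-Dinput=" + xml_path]
    raise ValueError("unknown tool: %r" % (tool,))


def validate(xml_path, tool="xmllint"):
    """Run the external validator; True means it exited with status 0."""
    cmd = validation_command(xml_path, tool)
    exe = shutil.which(cmd[0])
    if exe is None:
        raise RuntimeError("%s not found on PATH" % cmd[0])
    cmd[0] = exe  # use the resolved path (helps with ant.bat on Windows)
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        # Pass the tool's own error messages through unchanged.
        sys.stderr.write(result.stdout + result.stderr)
    return result.returncode == 0
```

Calling validate("file.xml") uses xmllint, and validate("file.xml",
tool="ant") goes through the build file instead. Both rely on the tool
returning a non-zero exit status when validation fails.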

--
Rajagopal

On Thu, Nov 28, 2013 at 10:55 PM, Alf Eaton <eaton.alf@xxxxxxxxx> wrote:
> From the command line you can use xmllint (brew install libxml2):
>
> xmllint --noout --loaddtd --valid file.xml
>
> On 28 November 2013 16:55, Ian Mulvany <i.mulvany@xxxxxxxxxxxxxxxxx> wrote:
>> Hi All,
>>
>> I'm building a small script in python to generate NLM XML. I would
>> like a companion script
>> to validate, preferably also in python.
>>
>> The generating script is a work in progress, but you can review it here:
>> https://github.com/elifesciences/elife-poa-xml-generation/blob/working/generate-poa-xml.py
>>
>> My initial attempt to get the NLM DTD for validation failed, the script here:
>> https://github.com/elifesciences/elife-poa-xml-generation/blob/working/validate.py
>>
>> returned
>> Traceback (most recent call last):
>>   File "validate.py", line 9, in <module>
>>     dtd = etree.DTD(StringIO(NLM_DTD))
>>   File "dtd.pxi", line 287, in lxml.etree.DTD.__init__
>> (src/lxml/lxml.etree.c:150450)
>>   File "dtd.pxi", line 394, in lxml.etree._parseDtdFromFilelike
>> (src/lxml/lxml.etree.c:152160)
>>
>> Has anyone done this in python, if so do you have code you could share?
>>
>> If I can't get it to work in python, should I consider an alternative
>> route, what would you suggest?
>>
>> I'm developing on a mac using OSX Mavericks, but I could also build on
>> a linux box.
>>
>>
>> - Ian
>>
>> ---
>> Head of Technology - eLife
>> Submit now - http://submit.elifesciences.org/
>> twitter: @IanMulvany
