Subject: Re: [jats-list] convert PDF to JATS or BITS XML
From: "Kevin Hawkins kevin.s.hawkins@xxxxxxxxxxxxxxxxxx" <jats-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Thu, 12 Jun 2014 17:30:40 -0000

Alex,

Now that I've googled the phrase "Jailbreaking the PDF", I see that the link I suggested in May:

https://web.archive.org/web/20130921075854/http://scholrev.org/hackathon

can now be found here:

http://pdfjailbreak.com/tools

Still, I'm confused about which "results were not encouraging". Are you referring to GROBID in particular (the tool for which I resurrected this thread), or to all tools except Crocodoc? Is there a reason that Crocodoc is not listed at
http://pdfjailbreak.com/tools ?


--Kevin

On 6/11/14 8:39 AM, Alexander Garcia Castro alexgarciac@xxxxxxxxx wrote:
For academic papers, given the heterogeneity in formats and in the ways
the final PDF is produced, the one tool that will give you clean, usable
output is Crocodoc. I run Jailbreaking the PDF, a workshop aiming to
get usable text from PDF. Here, by usable I mean clean, no mistakes,
with bold, italics, footnotes, bibliographic references, tables,
figures, etc., ready to be used for whatever purpose. Results were not
encouraging. Crocodoc gives you HTML5, clean and reusable.

On Wed, May 7, 2014 at 7:52 AM, Wei Zhao w.zhao@xxxxxxxxxxx
<jats-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:
Has anybody had experience converting PDF to JATS or BITS XML? Any suggestions
for conversion tools other than pdfx?

Thanks,

Wei

--
Wei Zhao
Metadata Librarian
OCUL/Scholars Portal
Phone: 416 946-0951
Fax: 416 978-1668
w.zhao@xxxxxxxxxxx
