Subject: Re: [jats-list] convert PDF to JATS or BITS XML
From: "Alexander Garcia Castro alexgarciac@xxxxxxxxx" <jats-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Thu, 12 Jun 2014 18:57:14 -0000

Hi Kevin,

Crocodoc was not tested at the time of the pdfjailbreak event; I started working with Crocodoc in January. By "not encouraging" I mean that, for what I needed, all the tools we tested somehow fell short. I needed a perfect extraction of the layout as well as of the text, with no mistakes. Of everything I have tried, Crocodoc is the only one that met that need. Although it is a commercial product, it is fairly easy to use for testing purposes. There are some issues with Crocodoc, but so far so good.

On Thu, Jun 12, 2014 at 12:30 PM, Kevin Hawkins kevin.s.hawkins@xxxxxxxxxxxxxxxxxx <jats-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:
> Alex,
>
> Now that I've googled the phrase "Jailbreaking the PDF", I see that the link
> I suggested in May:
>
> https://web.archive.org/web/20130921075854/http://scholrev.org/hackathon
>
> can now be found here:
>
> http://pdfjailbreak.com/tools
>
> Still, I'm confused about which "results were not encouraging". Are you
> speaking about GROBID in particular (for which I resurrected this thread),
> or all tools except Crocodoc? Is there a reason that Crocodoc is not listed
> at
> http://pdfjailbreak.com/tools ?
>
> --Kevin
>
>
> On 6/11/14 8:39 AM, Alexander Garcia Castro alexgarciac@xxxxxxxxx wrote:
>>
>> for academic papers, due to the heterogeneity in formats and ways to
>> produce the final pdf, the one tool that will give u a clean usable
>> output is crocodoc. I run jailbreaking the pdf, a workshop aiming to
>> get usable text from PDF. Here, by usable I mean clean, no mistakes,
>> with bold, italics, footnotes, bibliographic references, tables,
>> figures, etc ready to be used for whatever purpose. results were not
>> encouraging. crocodoc gives u HTML5, clean and reusable.
>>
>> On Wed, May 7, 2014 at 7:52 AM, Wei Zhao w.zhao@xxxxxxxxxxx
>> <jats-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:
>>>
>>> Any body had experience to convert PDF to JATS or BITS XML? Any
>>> suggestions
>>> for the conversion tools other than pdfx?
>>>
>>> Thanks,
>>>
>>> Wei
>>>
>>> --
>>> Wei Zhao
>>> Metadata Librarian
>>> OCUL/Scholars Portal
>>> Phone: 416 946-0951
>>> Fax: 416 978-1668
>>> w.zhao@xxxxxxxxxxx
>>>
>>
>

--
Alexander Garcia
http://www.alexandergarcia.name/
http://www.usefilm.com/photographer/75943.html
http://www.linkedin.com/in/alexgarciac