Re: [xsl] citation processing

At 11:32 AM 10/20/2006, Andrew wrote:

If you think it's not really feasible to parse a plain text citation
into a marked up version then that's good feedback - it could well be
that a percentage need to be done by hand.

Scale is a real issue here. Real-world citation formats include variations like "use 'pp.' on page ranges for articles in books, but not for articles in journals." Even if your process does the correct thing with 85 of 100 citations (a very optimistic rate), the remaining 15 percent can add up to scores of incorrect citations in a large collection. And if your upconversion can't recognize where it's failing, you have to find the errors before you can fix them.

David is right: it's ultimately an NLP problem (though a very interesting subset of NLP). As he also says, success depends both on handling the rules properly and on the input actually following those rules. (There are dozens of citation formats around, too.) "Never say never" is good to keep in mind, but when I'm asked to look at citations I immediately start asking questions about the scope of the input, its validation, and acceptable strategies for exception handling. When I'm told there won't be any exceptions, it's usually pretty easy to find a bunch.
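
To make the exception-handling point concrete, here is a minimal sketch in Python (not XSLT, and not anyone's actual upconversion) of a parser that reports the citations it cannot handle instead of guessing. Everything in it is an assumption for illustration: the single journal-article pattern, the field names, and the helpers parse_citation and triage are all hypothetical, and a real input set would need many more rules.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical pattern for ONE journal-article style, e.g.
#   "Smith, J. (2004). Parsing references. Journal of Markup, 12, 34-56."
# Real citation formats vary far more than this.
JOURNAL_ARTICLE = re.compile(
    r"^(?P<authors>[^()]+?)\s*\((?P<year>\d{4})\)\.\s*"
    r"(?P<title>[^.]+)\.\s*"
    r"(?P<journal>[^,]+),\s*(?P<volume>\d+),\s*"
    r"(?P<first>\d+)-(?P<last>\d+)\.$"
)


@dataclass
class Result:
    raw: str
    fields: Optional[dict]  # None when the citation needs manual review
    reason: str = ""        # why it was rejected, if it was


def parse_citation(raw: str) -> Result:
    """Parse one citation, or say explicitly why it could not be parsed."""
    match = JOURNAL_ARTICLE.match(raw.strip())
    if match is None:
        return Result(raw, None, "no known pattern matched")
    fields = match.groupdict()
    # A cheap sanity check: a match is not the same as a correct parse.
    if int(fields["first"]) > int(fields["last"]):
        return Result(raw, None, "page range runs backwards")
    return Result(raw, fields)


def triage(citations):
    """Split the input into parsed results and ones flagged for review."""
    parsed, review = [], []
    for citation in citations:
        result = parse_citation(citation)
        (parsed if result.fields is not None else review).append(result)
    return parsed, review
```

The design choice is that anything the rules do not recognize lands in the review list with a reason attached, so the failures are found by the process itself rather than by a proofreader later. It does nothing about citations that match a rule but are parsed wrongly, which is the harder part of the problem.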

Cheers,
Wendell

Current Thread
Re: [xsl] citation processing, (continued)
- Andrew Welch - Fri, 20 Oct 2006 16:32:00 +0100
- Michael Kay - Fri, 20 Oct 2006 16:46:12 +0100
- David Carlisle - Fri, 20 Oct 2006 16:55:05 +0100
- Andrew Welch - Fri, 20 Oct 2006 17:07:54 +0100
- Message not available
- Wendell Piez - Fri, 20 Oct 2006 12:37:16 -0400 <=
- Wendell Piez - Fri, 20 Oct 2006 12:32:28 -0400
- David Carlisle - Fri, 20 Oct 2006 22:20:21 +0100
- Wendell Piez - Mon, 23 Oct 2006 11:19:43 -0400
- Waters, Michael, Springer US - Fri, 20 Oct 2006 12:10:48 -0400
