Subject: Re: [jats-list] JPub3 Preview Stylesheets generating invalid XHTML
From: Wendell Piez <wapiez@xxxxxxxxxxxxxxx>
Date: Fri, 7 Dec 2012 11:49:17 -0500

Hi Gerry,

You're correct: my article wouldn't have given you the solution.

Valid XHTML was never stipulated as a requirement for the preview stylesheets. These are preview stylesheets, so "validating in the application", i.e. opening and looking reasonable in the major browsers, was considered to be sufficient; we didn't have a use case requiring external validation of HTML outputs against an XHTML schema. (If the docs don't say this, they should. But who reads the documentation?)

In part, this is because (to be honest) meeting this requirement would also have impeded other goals. We wanted the preview stylesheets (at least the basic standalone HTML stylesheet) to be in XSLT 1.0 so that they would work in the major browsers out of the box. Some requirements for valid XHTML -- in particular, that paragraphs not include divs or lists -- were not really achievable in XSLT 1.0 that was intended to be extended and maintained without guru-level XSLT skills. (JATS of course allows lists and some other block-level structures to appear inside paragraphs, while valid XHTML doesn't. So a simple mapping of element to element generates invalid results. The best XSLT 1.0 technique to get around this, sibling recursion, is something of a bear. :-)

Another thing to bear in mind is that there could be a big difference between addressing this problem for a particular system (making it work for the data you have, then maintaining it for the new cases that come along), and addressing it in the general case (accounting comprehensively for all possible invalid XHTML outputs from the preview stylesheet). Without doing the analysis I can't assess how difficult the latter will actually be.

All this having been said, the requirement is certainly important for many uses.
If you're not happy with using Tidy (which is a good expedient but also introduces new variables -- after all, the requirement is actually "generate valid XHTML without butchering the data", not just validity as such), I'd suggest applying an XHTML remediation stylesheet in your pipeline, a post-process that will take the invalid XHTML emitted by the basic preview stylesheet and fix it. Use XSLT 2.0 to make the hard stuff more tractable. It would of course do things like get rid of the old-fashioned @name attributes (these were simply inherited from earlier versions of the stylesheet and would have been fixed, had valid XHTML been a design goal), split paragraphs around divs and lists, and so forth. (I'm actually curious as to what Tidy will do about the latter issue -- I guess I should try it and see.)

I hope this helps. Feel free to write me off list for any discussion you don't feel would be welcome here.

Best regards,
Wendell

On Thu, Dec 6, 2012 at 11:17 AM, Gerry King <g.king@xxxxxxxxxxxxxxxxxxxxxxxxxx> wrote:
> <newbie alert /> I hadn't seen Wendell Piez's article on Fitting the Journal
> Publishing 3.0 Preview Stylesheets to your needs
> (http://www.ncbi.nlm.nih.gov/books/NBK47104/#piez-pipelining-methods) when I
> started my descent into hell however I don't think it would have provided a
> solution...
>
> Using the preview XSLT pipeline and then adding one last transform to create
> the desired xhtml fragment seemed easy enough. I thought I had it working.
>
> Unfortunately I had forgotten about the warnings from TextMate+HTMLTidy and
> spent the past week chasing my tail trying to work out why my XSL worked on
> a sample XHTML that had been tidied but failed to generate the desired
> output when I tried a batch <sigh>
>
> I was surprised that the output from jpub3-PMCcit-xhtml.xsl is invalid (my
> original sample has 104 errors according to http://validator.w3.org/check).
> The source of my woes are <a href>s that have name attributes but not id's;
> my XSL uses these in key()s.
>
> <xsl:key name="figslist" match="//div[@class='fig panel']" use="concat('#',
> a[1]/@id)"/>
> <xsl:key name="tableslist" match="//div[starts-with(@class, 'table-wrap')]"
> use="concat('#', a[1]/@id)"/>
>
> Fixing the problem in the jpub3-PMCcit-xhtml.xsl pipeline is daunting so I
> guess I will use a Python script and pass the xhtml through HTMLTidy before
> running my XSL for now.
>
> I am surprised nobody else has had issues with the xhtml output before
> (searching this list before posting didn't find any hits). Are there any
> plans to make the tool generate valid xhtml?
>
> Gerry King
> Spandidos Publications
>

-- 
Wendell Piez | http://www.wendellpiez.com
XML | XSLT | electronic publishing
Eat Your Vegetables
_____oo_________o_o___ooooo____ooooooo_^
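
The remediation pass suggested in the reply (a post-process over the preview stylesheet's output, written in XSLT 2.0) might look roughly like the sketch below. It is not part of the preview stylesheets and has not been run against their real output. The function name f:is-block, its namespace URI, and the list of block-level element names are all hypothetical and would need adjusting to the data. Elements are matched by local name (*:p, *:a) because the thread does not say whether the output is in the XHTML namespace.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:f="urn:example:xhtml-fixup"
    exclude-result-prefixes="xs f">

  <xsl:output method="xml" indent="no"/>

  <!-- Hypothetical helper: true for elements assumed not to be allowed
       inside p. The name list is illustrative, not exhaustive. -->
  <xsl:function name="f:is-block" as="xs:boolean">
    <xsl:param name="n" as="node()"/>
    <xsl:sequence select="$n instance of element() and
        local-name($n) = ('div', 'ul', 'ol', 'dl', 'table',
                          'blockquote', 'pre', 'p')"/>
  </xsl:function>

  <!-- Identity transform: copy everything not handled below. -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Anchor has @name but no @id: carry the value over as @id.
       Assumes the value is a legal, unique ID; this is not checked. -->
  <xsl:template match="*:a[not(@id)]/@name">
    <xsl:attribute name="id" select="."/>
  </xsl:template>

  <!-- Anchor already has @id: just drop the @name. -->
  <xsl:template match="*:a[@id]/@name"/>

  <!-- Paragraph with block-level children: split it around them. -->
  <xsl:template match="*:p[*[f:is-block(.)]]">
    <xsl:variable name="p" select="."/>
    <!-- Inline children other than whitespace-only text nodes. -->
    <xsl:variable name="content"
        select="$p/node()[not(f:is-block(.))]
                         [not(self::text()) or normalize-space(.)]"/>
    <xsl:for-each-group select="node()" group-adjacent="f:is-block(.)">
      <xsl:choose>
        <!-- A run of block elements: emit them as siblings. -->
        <xsl:when test="current-grouping-key()">
          <xsl:apply-templates select="current-group()"/>
        </xsl:when>
        <!-- A run holding only whitespace: emit nothing. -->
        <xsl:when test="empty(current-group() intersect $content)"/>
        <!-- A run of inline content: wrap it in a new paragraph. -->
        <xsl:otherwise>
          <xsl:element name="{name($p)}" namespace="{namespace-uri($p)}">
            <!-- Only the first piece keeps the original attributes,
                 so that an @id on the paragraph is not duplicated. -->
            <xsl:if test="current-group() intersect $content[1]">
              <xsl:apply-templates select="$p/@*"/>
            </xsl:if>
            <xsl:apply-templates select="current-group()"/>
          </xsl:element>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:for-each-group>
  </xsl:template>

</xsl:stylesheet>
```

A stylesheet like this would run in any XSLT 2.0 processor, after the preview stylesheet and before the transform that defines the figslist and tableslist keys, so that a[1]/@id has something to find. It handles only blocks that are direct children of a paragraph. A block nested inside an inline element is left alone, and a paragraph containing nothing but blocks loses its own attributes. Those cases are where the general problem mentioned in the reply starts to get harder.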