Subject: [xsl] An observation on the performance of fn:transform
From: "Norman Tovey-Walsh ndw@xxxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Fri, 3 Jul 2020 08:44:49 -0000

Hello world,

This isn't a complaint, or explicitly a request for advice (though I'm always happy for helpful suggestions), just an observation.

The workflow for processing DocBook documents is roughly this pipeline:

1. Fixup the logical structure of the document (expand entities and replace entityref attributes with the corresponding fileref attributes).
2. Perform XInclude
3. Convert DocBook 4.x markup to 5.x markup if the source document appears to be DocBook 4.x (i.e., if its root element is in no namespace)
4. Perform transclusion[1]
5. Profile
6. Resolve annotations
7. Resolve XLinks (including external link bases)

These are all relatively small stylesheets and they're currently run with fn:transform. (This will, as I've said before, all be driven by XProc in the medium term, but I have short term requirements.)

The last two or three steps are: transform the result of step 7 from DocBook to HTML and then do a little cleanup on that output and, if "chunking" has been requested, break it into chunks. Doing a little post-conversion cleanup improves the output and greatly simplifies the chunking tasks.

Because I'm old school, and because I initially had a "I can't do this as a pipeline because I don't have XProc" mindset, I wrote up the conversion to HTML, the cleanup, and the chunking as modes in the same stylesheet.

Then this morning I thought, hang on, I could use fn:transform for those steps too and get all the benefits of pipelines there (easier to maintain, separately testable, etc.)

So I coded that up. I now have an *eight* stage pipeline where the last stage does the transformation to HTML, cleanup of that HTML, and possible chunking. It's all still in one stylesheet with modes because I haven't teased it apart yet, it's just being run with fn:transform instead of with a mode in the same stylesheet.

The performance difference is interesting.

Running 1,426 tests through the 8 stage pipeline: 4m19s.
Running 1,542 tests through the original 7 stages: 50s.
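
As a rough back-of-the-envelope check on those two timings, the per-test cost of each run can be worked out directly. This is only a sketch: it assumes each test drives the pipeline about once and that the two test sets are comparable, neither of which is stated in the message. The variable names are illustrative.

```python
# Timings and test counts as quoted in the message.
eight_stage_seconds = 4 * 60 + 19   # 4m19s for the 8 stage pipeline
seven_stage_seconds = 50            # 50s for the original 7 stages
eight_stage_tests = 1426
seven_stage_tests = 1542

# Average wall-clock time per test for each run.
per_test_eight = eight_stage_seconds / eight_stage_tests
per_test_seven = seven_stage_seconds / seven_stage_tests

# Assumption: the difference approximates the added cost per test
# of running the final stage through fn:transform.
extra_per_test = per_test_eight - per_test_seven

print(f"8 stage: {per_test_eight:.3f} s/test")   # 0.182
print(f"7 stage: {per_test_seven:.3f} s/test")   # 0.032
print(f"extra:   {extra_per_test:.3f} s/test")   # 0.149
```

Under those assumptions the extra stage costs roughly 0.15 seconds per test on average.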

There are fewer tests in the former case because some of my XSpec tests just can't work against the new driver; I'll have to run two sets of tests which is kind of a drag, but I should be running separate tests for all the stages anyway so I guess that's just the way it is.

The performance difference is presumably because it takes ~0.15s to compile the main stylesheet each time. Which is, you know, pretty damned fast, but adds up if you're going to do it thousands of times in a row.

I don't expect this to be an issue in real world use cases for the stylesheets, but I thought it was interesting. I'm not surprised, but it wasn't a consequence that had occurred to me before I started.

Be seeing you,
  norm

[1] https://docbook.org/docs/transclusion/transclusion.html

--
Norman Tovey-Walsh <ndw@xxxxxxxxxx>
https://nwalsh.com/

> I think it's much more interesting to live not knowing than have
> answers which might be wrong.--Richard Feynman