Re: [xsl] Running XSLT from Python

From: "Peter Flynn peter@xxxxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Fri, 17 Jan 2025 23:54:44 -0000

On 17/01/2025 23:16, dvint@xxxxxxxxx wrote:
> First off, is anyone aware of a good way to merge a bunch of HTML
> techdoc pages into a single HTML so a PDF file can be generated
> with something like Prince or Weasyprint?

If I understand you right, you want to catenate the contents of the <body> elements and create a new HTML file. Assuming the HTML files share a common <head> (i.e. you only want it once):


1. Make sure they are all well-formed XHTML/HTML5 (use Tidy)

2. Copy the Document Type Declaration, the <html> start-tag, the whole of the <head> element, and the <body> start tag into your target.html

3. for f in *.html; do lxgrep 'body/*' "$f" >>target.html; done
   (keep target.html in a different directory, or the *.html glob will pick it up too)

4. Append </body></html> to target.html

lxgrep is part of the LTxml2 utilities from https://www.ltg.ed.ac.uk/software/ltxml2/
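
Since the thread is about driving this from Python, the steps above can also be sketched with nothing but the standard library. This is only a rough illustration, not a tested pipeline: merge_bodies and merge_files are made-up names, and it assumes every input already parses as well-formed XML (step 1). Named entities such as &nbsp; may need converting to numeric references first, and the Document Type Declaration is not carried over, so prepend it yourself if your PDF tool wants one. Like 'body/*', it copies only the element children of each <body>.

```python
import glob
import os
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"


def _child(parent, name):
    """Return the first direct child with this local name, or None."""
    for child in parent:
        # Tags of namespaced elements look like '{uri}name'; compare the local part.
        if child.tag.rsplit("}", 1)[-1] == name:
            return child
    return None


def merge_bodies(documents):
    """Merge well-formed (X)HTML documents (str or bytes) into one string.

    The first document supplies <html> and <head>; the element children
    of every later <body> are appended to the first <body>, in order.
    """
    # Serialise XHTML elements with a default namespace, not ns0: prefixes.
    ET.register_namespace("", XHTML_NS)
    merged_root = None
    merged_body = None
    for text in documents:
        root = ET.fromstring(text)
        body = _child(root, "body")
        if body is None:
            raise ValueError("document has no <body> element")
        if merged_root is None:
            merged_root, merged_body = root, body
            continue
        for node in list(body):
            merged_body.append(node)
    if merged_root is None:
        raise ValueError("no documents given")
    return ET.tostring(merged_root, encoding="unicode")


def merge_files(pattern, target):
    """Hypothetical helper: merge files matching pattern and write target."""
    skip = os.path.abspath(target)
    paths = sorted(p for p in glob.glob(pattern) if os.path.abspath(p) != skip)
    docs = []
    for path in paths:
        # Read bytes so any encoding declaration in the file is honoured.
        with open(path, "rb") as fh:
            docs.append(fh.read())
    with open(target, "w", encoding="utf-8") as out:
        out.write(merge_bodies(docs))
```

Something like merge_files("*.html", "target.html") would then stand in for steps 2 to 4; the files are merged in sorted filename order, which may or may not be the reading order you want.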

Peter
