Subject: Re: [xsl] Running XSLT from Python
From: "dvint dvint@xxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Sat, 18 Jan 2025 00:28:40 -0000

Tidy was failing because of some unknown elements; I had tried that early on. The HTML has to be modified, though, to get links to work properly. I also add the navigation file so the PDF has a TOC.

-------- Original message --------
From: "Peter Flynn peter@xxxxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: 1/17/25 3:54 PM (GMT-08:00)
To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
Subject: Re: [xsl] Running XSLT from Python

On 17/01/2025 23:16, dvint@xxxxxxxxx wrote:
> First off, is anyone aware of a good way to merge a bunch of HTML
> techdoc pages into a single HTML so a PDF file can be generated with
> something like Prince or Weasyprint?

If I understand you right, you want to catenate the contents of the <body> elements and create a new HTML file. Assuming the HTML files share a common <head> (i.e. you only want it once):

1. Make sure they are all well-formed XHTML/HTML5 (use Tidy)
2. Copy the Document Type Declaration, the <html> start-tag, the whole of the <head> element, and the <body> start-tag into your target.html
3. for f in *.html; do lxgrep 'body/*' $f >>target.html; done
4. Append </body></html> to target.html

lxgrep is part of the LTxml2 utilities from https://www.ltg.ed.ac.uk/software/ltxml2/

Peter
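
Since the thread is about driving this from Python, the four steps Peter describes could be sketched roughly as below, using only the standard library. This is an illustration under assumptions, not what either poster actually ran: the function name merge_html and the regular expression are made up for the example, each input is assumed to contain exactly one <body>...</body> pair (no stray "<body" text inside comments or scripts), and the files are assumed to be UTF-8. It does not repair ill-formed markup, and it does not rewrite cross-page links or add a navigation file.

```python
import re
from pathlib import Path

# Hypothetical helper for illustration only. Matches the first
# <body ...> ... </body> pair, case-insensitively, across newlines.
BODY_RE = re.compile(r"<body\b[^>]*>(.*?)</body\s*>", re.IGNORECASE | re.DOTALL)


def merge_html(paths, target):
    """Concatenate the <body> contents of `paths` into one file, `target`.

    The doctype, <html> start-tag, <head>, and <body> start-tag are
    taken from the first input file only (step 2 in the message).
    """
    paths = [Path(p) for p in paths]
    if not paths:
        raise ValueError("no input files given")

    first = paths[0].read_text(encoding="utf-8")
    match = BODY_RE.search(first)
    if match is None:
        raise ValueError(f"no <body> element found in {paths[0]}")

    # Everything up to and including the <body ...> start-tag.
    parts = [first[: match.start(1)]]

    # Step 3: append the contents of each file's <body>, in order.
    for path in paths:
        text = path.read_text(encoding="utf-8")
        match = BODY_RE.search(text)
        if match is None:
            raise ValueError(f"no <body> element found in {path}")
        parts.append(match.group(1))

    # Step 4: close the document.
    parts.append("</body>\n</html>\n")
    Path(target).write_text("".join(parts), encoding="utf-8")
```

A call might look like merge_html(sorted(Path("site").glob("*.html")), "target.html"), where "site" is a placeholder directory name. Writing the target outside the input directory (or filtering it out of the list) avoids merging a previous target.html into itself on a second run.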