Re: [xsl] Looking for a cleaner way of auditing table cell data than this

Subject: Re: [xsl] Looking for a cleaner way of auditing table cell data than this
From: "Steven D. Majewski steve.majewski@xxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Thu, 9 Mar 2023 23:29:56 -0000
If you have a substantial library of documents you want to report on,
I would suggest you use an XQuery database like BaseX or eXist that
indexes the documents, and work with your XPath selector there.

If I understand your question, this should select each td that has
significant (i.e. non-whitespace) text and a child element on the
list (and you can make the list a variable):
//table/td[normalize-space(.) != ''][*[local-name() = ('para',
'note', 'cnote', 'critical', 'headline', ...)]]
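
A minimal XSLT 2.0 sketch of the same idea with the list held in a variable, offered as an illustration rather than a tested solution. It assumes the elements are in no namespace and that td is not necessarily a direct child of table (the quoted example shows a tb wrapper), so it selects //td. It also tests the cell's text-node children rather than its whole string value, so a cell holding only a para should not be reported. The variable name block-names and the shortened list are hypothetical; the names themselves come from the quoted message.

```xslt
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Hypothetical variable: block element names, incomplete on purpose.
       Extend it to match the real schema. -->
  <xsl:variable name="block-names"
      select="('para', 'note', 'cnote', 'critical', 'headline', 'table')"/>

  <xsl:template match="/">
    <!-- First predicate: at least one non-whitespace text-node child.
         Second predicate: at least one child element named in the list. -->
    <xsl:for-each select="//td[text()[normalize-space()]]
                              [*[local-name() = $block-names]]">
      <xsl:text>Table cell with mixed content: </xsl:text>
      <xsl:value-of select="normalize-space(.)"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

Because //td selects each cell node once, a cell inside a nested table is not visited once per ancestor table, which is what happens when iterating over //table and then descendant::td.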

  On Aug 29, 2022, at 10:37 AM, Trevor Nicholls
  trevor@xxxxxxxxxxxxxxxxxx <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
  wrote:

Hi

I have a substantial library of XML documents which include a great
number of tables. As it happens the content model for table cells is
promiscuous; a table cell may contain "block" data:

  <td> <para>blah blah.</para></td>

even to the extent of nested tables:

  <td> <para>..</para> <table> <tb> .. </tb> </table></td>

or, in the case of very many simple tables, just simple text content:

  <td>Y</td><td>N</td>

I would like to identify cases where table cells have exploited the
promiscuous schema and mixed both text and block content, for example:

  <td>For example:<para>This is a bad table cell.</para></td>

I can't construct the schema so that this is illegal while the earlier
examples are valid. At least I don't think I can. But I would like to
identify these cells (and correct them, but at the moment just reporting
them is sufficient).

This is the XSL fragment I have come up with (using XSL 2), but I imagine
there is a much cleaner way of doing it and I might learn a useful
technique if I ask.

<xsl:template name="mixed-cells">
  <xsl:for-each select="//table">
    <xsl:for-each select="descendant::td[child::text()[normalize-space() != '']]">
      <xsl:if test="count(*[self::para | self::note | self::cnote | self::critical |
          self::headline | self::error | self::define | self::qanda | self::inset |
          self::ihead | self::steps | self::list | self::ol | self::inlist |
          self::syntax | self::fragment | self::table]) &gt; 0">
        <xsl:text>Table cell with mixed content: </xsl:text>
        <xsl:call-template name="get-source" />
        <xsl:value-of select="$nl" />
        <xsl:text> content=</xsl:text>
        <xsl:value-of select="normalize-space(.)" />
        <xsl:value-of select="$nl" />
      </xsl:if>
    </xsl:for-each>
  </xsl:for-each>
</xsl:template>

The normalize-space() in the third line is necessary because otherwise it
picks up newlines in a sequence of block children.

The list of "block" elements in the fourth line above is incomplete, and
should probably be sourced from a variable rather than given as a literal
condition the way I have done it here.

The get-source template outputs the input document name and current line
number, and $nl is what you would expect it to be.

As it stands this template is going to report nested table cells multiple
times; there might be a clever fix for this but at the moment my focus is
on the best way to identify these troublesome cells in the first place.

cheers
T
