Re: [xsl] Re: Creating sequence/range text from position

Subject: Re: [xsl] Re: Creating sequence/range text from position
From: "Dimitre Novatchev dnovatchev@xxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Sun, 29 Jan 2023 02:47:35 -0000
Here is a pure XPath 3.1 solution (which means it is also usable from XQuery).

The bulk of the logic is in the function $compactInner, which is just 24
lines.

let $doc :=
<article>
    <p>Case 1 is <xref ref-type="bibr" rid="r2"/></p>
    <p>Case 2 is <xref ref-type="bibr" rid="r1 r2 r3 r5 r6 r8 r9 r10"/></p>
    <p>Case 3 is <xref ref-type="bibr" rid="r1 r2 r3 r4"/></p>
    <p>Case 4 is <xref ref-type="bibr" rid="r1 r3 r5 r7"/></p>
    <back>
        <ref-list>
            <ref id="r1">...</ref>
            <ref id="r2">...</ref>
            <ref id="r3">...</ref>
            <ref id="r4">...</ref>
            <ref id="r5">...</ref>
            <ref id="r6">...</ref>
            <ref id="r7">...</ref>
            <ref id="r8">...</ref>
            <ref id="r9">...</ref>
            <ref id="r10">...</ref>
        </ref-list>
    </back>
</article>,

$compactInner := function($ar as array(xs:integer), $self as function(*))
{
     let $arSize := array:size($ar),
         $arExists := exists($ar) and $arSize gt 0
      return
       if(not($arExists)) then ()
         else
           let $start := $ar(1),
             $next := (for $i in 2 to $arSize
                         return
                           if($ar($i) ge $start + $i)
                              then $i
                              else ()
                        )[1],
             $nextEnd := if(exists($next)) then $next -1 else $arSize,
             $upper := $ar($nextEnd)
             return ($start || (if($arSize gt 1 and $upper ne $start)
                                  then '-' || $upper else ()),
                     if(exists($next))
                       then $self(array:subarray($ar, $next), $self)
                       else if($arSize gt $nextEnd)
                               then $self(array:subarray($ar, $nextEnd + 1), $self)
                               else ()
                     )
},

 $compact := function($ar as array(xs:integer))
 {
    $compactInner($ar, $compactInner)
 },
 $summary := function($pIn as xs:string)
             {
               let $arr := array{ tokenize(replace($pIn, 'r', ''), ' ') ! xs:integer(.) }
                 return $compact($arr)
             }
 return
   ($summary($doc/p[1]/xref[1]/@rid),
    "------------",
    $summary($doc/p[2]/xref[1]/@rid),
    "------------",
    $summary($doc/p[3]/xref[1]/@rid),
    "------------",
    $summary($doc/p[4]/xref[1]/@rid)
)

When the above XPath expression is evaluated, the expected, correct results
are produced:

2
------------
1-3
5-6
8-10
------------
1-4
------------
1
3
5
7

Note: This compactor uses the "-" notation for every run of consecutive
increasing numbers, including runs of just two (hence 5-6 above). If you need
the "-" notation only for runs of length at least 3, then this is left as an
exercise for the reader :)
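
For what it is worth, here is one possible sketch of that exercise. It does
not reuse $compactInner; instead it groups the numbers into runs with the
standard fold-left() function and uses the "-" notation only for runs of
three or more. The variable names ($nums, $runs, $parts) and the sample input
are made up for illustration, and the input is assumed to be already sorted
and free of duplicates:

```xquery
let $nums := (1, 2, 3, 5, 6, 8, 9, 10),   (: illustrative input, assumed sorted :)

    (: Group consecutive values into runs; each run is an array [first, last]. :)
    $runs := fold-left($nums, (),
               function($acc, $n)
               {
                 if (empty($acc)) then [$n, $n]
                 else
                   let $last := $acc[last()]
                   return
                     if ($last(2) + 1 eq $n)
                       (: $n extends the current run: replace its upper bound :)
                       then ($acc[position() lt last()], [$last(1), $n])
                       (: otherwise start a new run :)
                       else ($acc, [$n, $n])
               }),

    (: Runs of 3 or more become "a-b"; shorter runs are listed item by item. :)
    $parts := for $r in $runs
              return
                if ($r(2) - $r(1) ge 2) then $r(1) || '-' || $r(2)
                else if ($r(2) gt $r(1)) then ($r(1), $r(2)) ! string(.)
                else string($r(1))
return string-join($parts, ',')
```

For the sample sequence this should evaluate to the string 1-3,5,6,8-10,
which is the form asked for in the original question.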

The same goes for data where the ref. numbers are not in increasing order
(hint: use the standard XPath 3.1 sort() function as a start).
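
A minimal sketch of that starting point, assuming every id has the form 'r'
followed by digits (the $rid value and the variable names here are
illustrative only):

```xquery
let $rid := 'r9 r1 r3 r10 r2 r8',          (: illustrative, unsorted @rid value :)
    (: strip the 'r' prefix, convert to integers, drop duplicates, then sort :)
    $nums := sort(distinct-values(
               tokenize($rid, ' ') ! xs:integer(substring-after(., 'r'))))
return $nums
```

This yields the sequence 1, 2, 3, 8, 9, 10; wrapping it as array{ $nums }
gives a value that can be passed to $compact above.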

Thanks,
Dimitre


On Sat, Jan 28, 2023 at 8:50 AM Charles O'Connor coconnor@xxxxxxxxxxxx <
xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:

> Thanks Vincent, and great to see you at NISO meetings!
>
>
>
> This works great! I will now spend a couple hours figuring out how you did
> it (or trying to), because that's how I've learned the little XSLT that I
> know.
>
>
>
> Best,
>
> Charles
>
>
>
> *From:* Lizzi, Vincent vincent.lizzi@xxxxxxxxxxxxxxxxxxxx <
> xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
> *Sent:* Saturday, January 28, 2023 12:50 AM
> *To:* xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> *Subject:* [xsl] Re: Creating sequence/range text from position
>
>
>
> Hi Charles,
>
>
>
> Here is a slightly improved version that produces the expected output from
> your sample input and also copes with a case of the id's listed in @rid not
> being in sequential order.
>
>
>
> <?xml version="1.0" encoding="UTF-8"?>
> <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
>     xmlns:xs="http://www.w3.org/2001/XMLSchema"
>     exclude-result-prefixes="xs"
>     version="3.0">
>
>     <xsl:mode on-no-match="shallow-copy"/>
>
>     <xsl:key name="element-by-id" match="*[@id]" use="@id"/>
>
>     <xsl:template name="label" as="element(label)">
>         <xsl:param name="id" as="xs:string"/>
>         <xsl:param name="context" as="node()" select="root(.)"/>
>         <label><xsl:value-of select="1 + count(key('element-by-id', $id, $context)/preceding::ref)"></xsl:value-of></label>
>     </xsl:template>
>
>     <xsl:template match="ref">
>         <xsl:copy>
>             <xsl:apply-templates select="@*"/>
>             <xsl:call-template name="label">
>                 <xsl:with-param name="id" select="@id"/>
>             </xsl:call-template>
>             <xsl:apply-templates select="node()"/>
>         </xsl:copy>
>     </xsl:template>
>
>     <xsl:template match="xref[not(node())]">
>         <xsl:copy>
>             <xsl:apply-templates select="@*"/>
>             <xsl:variable name="context" select="root(.)"/>
>             <xsl:variable name="labels" as="element(label)*">
>                 <xsl:for-each select="tokenize(@rid)">
>                     <xsl:call-template name="label">
>                         <xsl:with-param name="id" select="."/>
>                         <xsl:with-param name="context" select="$context"/>
>                     </xsl:call-template>
>                 </xsl:for-each>
>             </xsl:variable>
>             <xsl:variable name="sorted" as="element(labels)">
>                 <labels>
>                     <xsl:for-each select="$labels">
>                         <xsl:sort select="number()"/>
>                         <xsl:sequence select="."/>
>                     </xsl:for-each>
>                 </labels>
>             </xsl:variable>
>             <xsl:variable name="formatted" as="xs:string*">
>                 <xsl:for-each select="$sorted/label">
>                     <xsl:choose>
>                         <xsl:when test="not(preceding::label)">
>                             <xsl:sequence select="string()"/>
>                         </xsl:when>
>                         <xsl:when test="number() - 1 eq number(preceding::label[1])
>                             and number() + 1 eq number(following::label[1])"/>
>                         <xsl:when test="number() - 1 eq number(preceding::label[1])
>                             and number() - 2 eq number(preceding::label[2])">
>                             <xsl:sequence select="'-' || string()"/>
>                         </xsl:when>
>                         <xsl:when test="preceding::label">
>                             <xsl:sequence select="',' || string()"/>
>                         </xsl:when>
>                     </xsl:choose>
>                 </xsl:for-each>
>             </xsl:variable>
>             <xsl:value-of select="string-join($formatted, '')"/>
>         </xsl:copy>
>     </xsl:template>
> </xsl:stylesheet>
>
>
>
> Best wishes,
>
> Vincent
>
>
>
> _____________________________________________
>
> *Vincent M. Lizzi*
>
> Head of Information Standards | Taylor & Francis Group
>
> vincent.lizzi@xxxxxxxxxxxxxxxxxxxx
>
> *From:* Charles O'Connor coconnor@xxxxxxxxxxxx <
> xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
> *Sent:* Friday, January 27, 2023 7:05 PM
> *To:* xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> *Subject:* [xsl] Creating sequence/range text from position
>
>
>
> Hi all,
>
> We use a word processor-like XML editor that presents users with generated
> text for numbered bibliographic references and their in-text citations.
> However, downstream systems require actual text to be placed. Question is,
> how do we get it?
>
> (Using SaxonEE 10.x)
>
> Given
>
> <article>
> <p>Case 1 is <xref ref-type="bibr" rid="r2"/></p>
> <p>Case 2 is <xref ref-type="bibr" rid="r1 r2 r3 r5 r6 r8 r9 r10"/></p>
> <back>
> <ref-list>
> <ref id="r1">...</ref>
> <ref id="r2">...</ref>
> <ref id="r3">...</ref>
> <ref id="r4">...</ref>
> <ref id="r5">...</ref>
> <ref id="r6">...</ref>
> <ref id="r7">...</ref>
> <ref id="r8">...</ref>
> <ref id="r9">...</ref>
> <ref id="r10">...</ref>
> </ref-list>
> </back>
> </article>
>
> We want
>
> <article>
> <p>Case 1 is <xref ref-type="bibr" rid="r2">2</xref></p>
> <p>Case 2 is <xref ref-type="bibr" rid="r1 r2 r3 r5 r6 r8 r9 r10">1-3,5,6,8-10</xref></p>
> <back>
> <ref-list>
> <ref id="r1">...</ref>
> <ref id="r2">...</ref>
> <ref id="r3">...</ref>
> <ref id="r4">...</ref>
> <ref id="r5">...</ref>
> <ref id="r6">...</ref>
> <ref id="r7">...</ref>
> <ref id="r8">...</ref>
> <ref id="r9">...</ref>
> <ref id="r10">...</ref>
> </ref-list>
> </back>
> </article>
>
> (only the xrefs have changed)
>
> Case 1 is pretty easily handled with:
>
> <xsl:template match="xref[@ref-type='bibr']">
> <xsl:variable name="rid" select="@rid" />
> <xsl:copy>
> <xsl:copy-of select="@*" />
> <xsl:value-of select="count(/article/back/ref-list/ref[@id=$rid]/preceding-sibling::ref) + 1" />
> </xsl:copy>
> </xsl:template>
>
> But Case 2 is not even close. I tried adapting the solution here, to no
> avail:
>
> https://stackoverflow.com/questions/47559712/xslt-sequence-of-numbers-to-range
>
> Whatever I did, I just got the first number and none of the rest. The
> extra twist here is that numbers should only be expressed as a range when 3
> or more occur sequentially.
>
> Thanks,
> Charles
>
> Charles O'Connor l Lead Product Manager
> Pronouns: He/Him
> Aries Systems Corporation l www.ariessys.com
> 50 High Street, Suite 21 l North Andover, MA l 01845 l USA
>
>
> Main: +1 (978) 975-7570
> Cell: +1 (802) 585-5655
>
>
>
> XSL-List info and archive <http://www.mulberrytech.com/xsl/xsl-list>
>
> EasyUnsubscribe <http://lists.mulberrytech.com/unsub/xsl-list/2963104> (by
> email)
