[xsl] shuffling words in text content

Subject: [xsl] shuffling words in text content
From: "Chris Papademetrious christopher.papademetrious@xxxxxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Tue, 7 Sep 2021 19:20:00 -0000
Hi everyone,

I recently needed to write a transformation to shuffle words in text content,
but still keep the overall element structure intact. For example, I might want
to transform

<p>Hey, here is some text!</p>

into

<p>is, text Hey some here!</p>

I didn't see anything exactly like this in the list archives or on
Stack Overflow, so I thought I'd share what I came up with:


<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
	xmlns:xs="http://www.w3.org/2001/XMLSchema"
	exclude-result-prefixes="#all"
	version="2.0">
  <xsl:output indent="yes"/>


  <!-- regex that defines what a "word" is -->
  <xsl:param name="word-pattern" select="'(\w+)'"/>


  <!-- identity transformation -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>


  <!-- shuffle words in each text() node, except inside pre elements -->
  <xsl:template match="text()[not(ancestor::pre)]">
    <!-- get the list of words in this block of text -->
    <xsl:variable name="words" as="node()*">
      <xsl:analyze-string select="." regex="{$word-pattern}">
        <xsl:matching-substring>
          <word><xsl:value-of select="."/></word>
        </xsl:matching-substring>
      </xsl:analyze-string>
    </xsl:variable>

    <!-- perturb the word order -->
    <xsl:variable name="shuffled-words" as="xs:string*">
      <xsl:call-template name="pick-random-item">
        <xsl:with-param name="items" select="$words"/>
      </xsl:call-template>
    </xsl:variable>

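    <!-- reform the string with the reordered words; position() counts matching
         and non-matching substrings together, and because they alternate, the
         k-th word sits at position 2k-1 or 2k, so floor((position() + 1) div 2)
         recovers k (this assumes two matches are never adjacent, which holds
         for \w+) -->
    <xsl:analyze-string select="." regex="{$word-pattern}">
      <xsl:matching-substring>
        <xsl:variable name="this-position" select="position()"/>
        <xsl:value-of select="$shuffled-words[floor(($this-position + 1) div 2)]"/>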
      </xsl:matching-substring>
      <xsl:non-matching-substring>
        <xsl:value-of select="."/>
      </xsl:non-matching-substring>
    </xsl:analyze-string>
  </xsl:template>


  <!-- XSLT item shuffler, borrowed from
       https://stackoverflow.com/questions/21953336/randomize-node-order-xslt -->
  <xsl:param name="initial-seed" select="123"/>
  <xsl:template name="pick-random-item">
    <xsl:param name="items" />
    <xsl:param name="seed" select="$initial-seed"/>
    <xsl:if test="$items">
      <!-- generate a random number using the "linear congruential generator" algorithm -->
      <xsl:variable name="a" select="1664525"/>
      <xsl:variable name="c" select="1013904223"/>
      <xsl:variable name="m" select="4294967296"/>
      <xsl:variable name="random" select="($a * $seed + $c) mod $m"/>
      <!-- scale random to integer 1..n -->
      <xsl:variable name="i" select="floor($random div $m * count($items)) + 1"/>
      <!-- write out the corresponding item -->
      <xsl:copy-of select="$items[$i]"/>
      <!-- recursive call with the remaining items -->
      <xsl:call-template name="pick-random-item">
        <xsl:with-param name="items" select="$items[position()!=$i]"/>
        <xsl:with-param name="seed" select="$random"/>
      </xsl:call-template>
    </xsl:if>
  </xsl:template>

</xsl:stylesheet>


Link to XSLT Fiddle here:
https://xsltfiddle.liberty-development.net/nbiE1aZ/1
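
For anyone who wants to check the shuffler's behavior outside an XSLT
processor, here is a minimal Python sketch of what the pick-random-item
template does. The function name lcg_shuffle is made up for illustration, and
it uses exact integer arithmetic where the stylesheet uses decimal division, so
treat it as an approximation of the template rather than a guaranteed match.

```python
def lcg_shuffle(items, seed=123):
    """Pick items one at a time, without replacement, using an LCG.

    Hypothetical helper; constants and default seed are copied from the
    stylesheet's pick-random-item template and initial-seed parameter.
    """
    a, c, m = 1664525, 1013904223, 4294967296
    remaining = list(items)
    picked = []
    while remaining:
        # Advance the linear congruential generator.
        seed = (a * seed + c) % m
        # Scale to a 0-based index: floor(seed / m * n), computed exactly.
        i = seed * len(remaining) // m
        # Emit the chosen item and continue with the rest.
        picked.append(remaining.pop(i))
    return picked


if __name__ == "__main__":
    print(lcg_shuffle(["Hey", "here", "is", "some", "text"]))
```

One consequence worth knowing: the template starts from the same initial-seed
for every text node, so any two text nodes with the same number of words are
reordered by the same permutation.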

The approach is:

1. Call <xsl:analyze-string> to extract the words from a text() node.
2. Call a template that shuffles the words.
3. Call <xsl:analyze-string> (again) to substitute the shuffled words in place
of the original words.
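
The three steps above are not specific to XSLT. As a rough illustration, here
they are sketched in Python. This is not a port of the stylesheet:
shuffle_words is a hypothetical name, random.Random stands in for the LCG
template, and Python's \w is only similar in spirit to the XSLT regex \w.

```python
import random
import re

# Assumed word definition, analogous to the stylesheet's word-pattern parameter.
WORD = re.compile(r"\w+")


def shuffle_words(text, seed=123):
    # Step 1: extract the words from the text.
    words = WORD.findall(text)
    # Step 2: shuffle them (a seeded generator keeps the result repeatable).
    random.Random(seed).shuffle(words)
    # Step 3: walk the text again, replacing each word with the next
    # shuffled word and leaving everything between words untouched.
    replacements = iter(words)
    return WORD.sub(lambda match: next(replacements), text)


if __name__ == "__main__":
    print(shuffle_words("Hey, here is some text!"))
```

Because step 3 consumes the shuffled words in order, no position arithmetic is
needed here; the stylesheet needs it only because position() inside
<xsl:analyze-string> counts the non-matching substrings too.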

Hopefully this is helpful if someone needs to solve a similar problem in the
future!

-----
Chris Papademetrious
Tech Writer, Synopsys, Inc.
