Re: [xsl] Algorithm for splitting a string at a space

From: "Graydon graydon@xxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Mon, 23 Nov 2015 19:48:05 -0000
On Mon, Nov 23, 2015 at 07:04:32PM -0000, Rick Quatro rick@xxxxxxxxxxxxxx scripsit:
> I have a series of strings that I need to split if they are longer than a
> particular length, say 30 characters. But I need to make the split at the
> previous space. Here is an example string:
> This is a long line that I want to split at a space.
> The 30th character is in the middle of a word, so I need to do the split at
> the previous space. I am using XSLT/XPath 2.0. I am having trouble
> developing a good algorithm for this. Any pointers would be appreciated.

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs" version="2.0">
    <xsl:output method="text" />
    <xsl:variable name="input">I am a large string which needs to be broken at the last space on or before character thirty-one</xsl:variable>
    <xsl:template match="/">
        <xsl:variable name="cutLength" select="30" />
        <xsl:variable name="tokens" select="tokenize($input, '\p{Zs}')" />
        <!-- \p{Zs} because someone might have provided an unusual space -->
        <xsl:variable as="element(bucket)" name="candidates">
            <!-- we can't use one sequence for this and 2.0 hasn't got maps or arrays -->
            <bucket>
                <!-- one candidate per prefix of the token list: tokens 1..1, 1..2, 1..3, ... -->
                <xsl:for-each select="1 to count($tokens)">
                    <candidate>
                        <xsl:value-of select="string-join($tokens[position() le current()], '&#x0020;')" />
                    </candidate>
                </xsl:for-each>
            </bucket>
        </xsl:variable>
        <!-- the longest candidate that still fits within $cutLength -->
        <xsl:value-of select="$candidates/candidate[string-length() le $cutLength][last()]" />
    </xsl:template>
</xsl:stylesheet>

Result:

"I am a large string which"

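It's not as compact as the regexp solution from David Carlisle, and it's asking a lot of the optimizer if the input line is really, really big: every prefix of the token list gets joined into its own candidate, so the work grows roughly quadratically with the number of tokens. The pattern does generalize fairly well to making substrings from rules rather than from character positions.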
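
For comparison, a regex-flavoured version of the same cut might look roughly like the sketch below. It is not David Carlisle's actual solution, only an illustration of the general idea under a few assumptions: the function name f:head and the namespace urn:x-example:split are made up for the example, and a string with no space within the first $max + 1 characters is simply returned whole (the candidate approach would return nothing in that case).

```xslt
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:f="urn:x-example:split"
    exclude-result-prefixes="xs f" version="2.0">
    <xsl:output method="text" />

    <!-- f:head is a hypothetical helper: it returns the part of $s before
         the last space that leaves at most $max characters in front of it -->
    <xsl:function name="f:head" as="xs:string">
        <xsl:param name="s" as="xs:string" />
        <xsl:param name="max" as="xs:integer" />
        <!-- short strings pass through untouched; otherwise the greedy
             .{1,max} backtracks until the next character is a space.
             The 's' flag lets '.' match newlines as well.
             If nothing matches, replace() returns $s unchanged. -->
        <xsl:sequence select="
            if (string-length($s) le $max) then $s
            else replace($s, concat('^(.{1,', $max, '})\p{Zs}.*$'), '$1', 's')" />
    </xsl:function>

    <xsl:template match="/">
        <xsl:value-of select="f:head('I am a large string which needs to be broken at the last space on or before character thirty-one', 30)" />
    </xsl:template>
</xsl:stylesheet>
```

Run against any input document, this should print the same "I am a large string which" as the stylesheet above. The trade-off is that the rule for where to cut is buried in the regexp, whereas the candidate list keeps it in an ordinary predicate that is easy to swap for something else.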

-- Graydon
