Subject: Re: [xsl] Split camel-case strings into words?
From: "Chris Papademetrious christopher.papademetrious@xxxxxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Wed, 10 May 2023 19:38:51 -0000
Hi Eliot,
A positive lookbehind is (?<=PATTERN) and a positive lookahead is (?=PATTERN).
I had to escape the "<" as "&lt;". For the lookbehind pattern I used "\p{Ll}"
which matches any Unicode lowercase letter, and for the lookahead pattern I
similarly used "\p{Lu}" which matches any Unicode uppercase letter. Because
lookbehinds and lookaheads do not consume any content, they match the point
between the letters - but not the letters themselves - for determining where
to tokenize.
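
To illustrate the zero-width split outside XSLT, here is a minimal Python sketch of the same idea. It is only an approximation: Python's built-in `re` module does not support `\p{Ll}` or `\p{Lu}`, so the ASCII classes `[a-z]` and `[A-Z]` stand in for them, and the function name `split_camel_case` is made up for this example.

```python
import re

# ASCII-only stand-in for the pattern described above: Python's built-in
# `re` module has no \p{Ll} / \p{Lu}, so [a-z] and [A-Z] are used instead
# (an assumption made for this sketch, not the pattern from the message).
BOUNDARY = re.compile(r"(?<=[a-z])(?=[A-Z])")


def split_camel_case(text):
    """Split at each zero-width point between a lowercase and an uppercase letter."""
    # Requires Python 3.7+, where re.split() also splits on empty matches.
    return BOUNDARY.split(text)


print(split_camel_case("MicrosoftExchangeOnline"))
# ['Microsoft', 'Exchange', 'Online']
```

Because both assertions are zero-width, no letters are lost at the split points, and a string with no lowercase-to-uppercase transition comes back as a single token.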
- Chris
From: Eliot Kimber eliot.kimber@xxxxxxxxxxxxxx
<xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Sent: Wednesday, May 10, 2023 3:33 PM
To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
Subject: Re: [xsl] Split camel-case strings into words?
This is the answer ChatGPT gave me:
In XQuery, you can use regular expressions and the tokenize() function to
split a camel case string into words. Here's an example query that does this:
let $input := "MicrosoftExchangeOnline"
let $words := tokenize($input, "(?=[A-Z])")
return $words
In this query, the tokenize() function splits the input string $input into
words using a regular expression that matches any position in the string where
the next character is an uppercase letter. The regular expression (?=[A-Z])
uses a positive lookahead to match the position before an uppercase letter
without actually consuming the letter itself. This ensures that the split
occurs at the correct boundaries.
The resulting sequence $words contains the individual words as separate
strings. In this case, the value of $words would be ("Microsoft", "Exchange",
"Online").
But its regular expression is wrong (though close to Chris's solution). I
pasted it into the BaseX query panel, which reported the regular expression
as invalid, and it is (or rather, I trust BaseX to correctly report bad
regexes).
While I was waiting I reread the XSD spec's definition of regular expressions
and could not determine from that how to do what Chris showed.
I still don't know why Chris or Martin's regex works, but at least they
provide explainable solutions.
I'm glad to know there's still a role for humans here...
Cheers,
E.
_____________________________________________
Eliot Kimber
Sr Staff Content Engineer
O: 512 554 9368
M: 512 554 9368