Re: [xsl] Function converting RFC 2822 date to xsd:dateTime

From: "Michael Kay mike@xxxxxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Mon, 8 Apr 2019 22:16:36 -0000
Are you aware that XPath 3.1 has the function parse-ietf-date() for this?

https://www.w3.org/TR/xpath-functions-31/#func-parse-ietf-date

The Saxon implementation is in Java; I haven't attempted an XPath
implementation. But you might find the spec (and the associated notes)
useful in itself; and of course the QT3 test suite has test cases.
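
As a minimal sketch of what calling it from a stylesheet could look like,
assuming an XSLT 3.0 processor that supports the XPath 3.1 function library
(the ex: prefix, its namespace URI and the function name ex:ietf-to-dateTime
are made up for illustration):

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:ex="http://example.com/ns"
    exclude-result-prefixes="xs ex">

    <!-- Hypothetical wrapper: all parsing is delegated to fn:parse-ietf-date(). -->
    <xsl:function name="ex:ietf-to-dateTime" as="xs:dateTime?">
        <xsl:param name="date-time" as="xs:string?"/>
        <xsl:sequence select="parse-ietf-date($date-time)"/>
    </xsl:function>

    <!-- Named entry point so the sketch can be run without a source document. -->
    <xsl:template name="xsl:initial-template">
        <result>
            <xsl:value-of select="ex:ietf-to-dateTime('Wed, 06 Jun 1994 07:29:35 GMT')"/>
        </result>
    </xsl:template>

</xsl:stylesheet>
```

The input string is one of the examples in the spec, which gives the result
as xs:dateTime("1994-06-06T07:29:35Z"). The grammar accepted by the function
is not identical to RFC 2822, so it is worth checking your real headers
(trailing comments in particular) against the spec before relying on it.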

I don't know how date/times in RFC 2822 relate to all the other miscellaneous
RFCs referenced in the spec. Liam Quin did most of the research for this.

What are your requirements for handling invalid values?
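
One possible policy, for example, is to log the bad value and return an
empty sequence. A minimal sketch, assuming an XSLT 3.0 processor (xsl:try
is not available in 2.0); the ex: prefix, its namespace URI and the function
name are placeholders:

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:err="http://www.w3.org/2005/xqt-errors"
    xmlns:ex="http://example.com/ns"
    exclude-result-prefixes="xs err ex">

    <!-- Hypothetical function: returns () instead of failing on bad input. -->
    <xsl:function name="ex:ietf-to-dateTime-or-empty" as="xs:dateTime?">
        <xsl:param name="date-time" as="xs:string"/>
        <xsl:try>
            <xsl:sequence select="parse-ietf-date($date-time)"/>
            <!-- FORG0010 is the error parse-ietf-date() raises for invalid input. -->
            <xsl:catch errors="err:FORG0010">
                <xsl:message>Invalid date: <xsl:value-of select="$date-time"/></xsl:message>
                <xsl:sequence select="()"/>
            </xsl:catch>
        </xsl:try>
    </xsl:function>

</xsl:stylesheet>
```

Note the return type here is xs:dateTime? rather than xs:dateTime. A function
declared as="xs:dateTime" that only emits a message on bad input produces
an empty sequence, which fails the declared return type.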

Michael Kay
Saxonica

> On 8 Apr 2019, at 22:58, Martynas Jusevičius martynas@xxxxxxxxxxxxx
> <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:
>
> Hi,
>
> I have an XSLT 2.0 task where I'm parsing email Date headers defined
> in RFC 2822 and converting them to xsd:dateTime.
>
> Below is a function that converts between the two. I wanted to hear if
> there are improvements that could be made?
>
>    <xsl:function name="aex:rfc2822dateTime-to-dateTime" as="xs:dateTime">
>        <xsl:param name="date-time" as="xs:string"/> <!-- Tue, 9 Apr 2019 00:07:24 +1200 (NZST) -->
>        <xsl:variable name="months"
>            select="'Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
>                    'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec'"
>            as="xs:string*"/>
>        <xsl:analyze-string select="$date-time"
>            regex="^(?:(Sun|Mon|Tue|Wed|Thu|Fri|Sat),\s+)?(0[1-9]|[1-2]?[0-9]|3[01])\s+(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)\s+(19[0-9]{{2}}|[2-9][0-9]{{3}})\s+(2[0-3]|[0-1][0-9]):([0-5][0-9])(?::(60|[0-5][0-9]))?\s+([-\+][0-9]{{2}}[0-5][0-9]|(?:UT|GMT|(?:E|C|M|P)(?:ST|DT)|[A-IK-Z]))(\s+|\(([^\(\)]+|\\\(|\\\))*\))*$">
>            <xsl:matching-substring>
>                <xsl:sequence select="xs:dateTime(concat(
>                    format-number(xs:integer(regex-group(4)), '0001'), '-',
>                    format-number(index-of($months, regex-group(3)), '01'), '-',
>                    format-number(xs:integer(regex-group(2)), '01'), 'T',
>                    format-number(xs:integer(regex-group(5)), '01'), ':',
>                    format-number(xs:integer(regex-group(6)), '01'), ':',
>                    format-number(xs:integer(regex-group(7)), '01'),
>                    substring(regex-group(8), 1, 3), ':',
>                    substring(regex-group(8), 4, 2)))"/>
>            </xsl:matching-substring>
>            <xsl:non-matching-substring>
>                <xsl:message>Invalid RFC 2822 datetime: <xsl:value-of select="$date-time"/></xsl:message>
>            </xsl:non-matching-substring>
>        </xsl:analyze-string>
>    </xsl:function>
>
> The regex pattern is taken from
> https://stackoverflow.com/questions/9352003/rfc-2822-date-regex
>
> Martynas
> atomgraph.com
