Re: [xsl] constraining values with a pattern facet in relax ng?

From: Liam R E Quin <liam@xxxxxx>
Date: Sat, 09 Jul 2011 01:25:16 -0400
On Fri, 2011-07-08 at 11:26 -0400, Birnbaum, David J wrote:

> Is it even possible to express these constraints with a regex?
Yes... you can enumerate the set of possible values and use "or" between
them.  The expression will not be small.  You can optimise somewhat
using ranges.

For example, for single digit numbers, \d-\d, allowed values are
  1-2, 1-3, ....
  2-3, 2-4, ....

so, (1-[2-9])|(2-[3-9])|(3-[4-9])|....|(8-9)
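
Rather than typing the alternation by hand, it can be generated. The following is only a sketch in Python (not something a schema can run); the names `pattern` and `is_valid` are made up for illustration. It uses `fullmatch` because a pattern facet is implicitly anchored at both ends.

```python
import re

# Build (1-[2-9])|(2-[3-9])|...|(8-9) programmatically.
alternatives = []
for start in range(1, 9):
    low = start + 1
    # For start == 8 the only allowed end is 9, so no character class is needed.
    tail = str(low) if low == 9 else f"[{low}-9]"
    alternatives.append(f"({start}-{tail})")

pattern = "|".join(alternatives)
single_digit_range = re.compile(pattern)


def is_valid(span):
    """Return True if span is a strictly ascending single-digit range."""
    return single_digit_range.fullmatch(span) is not None


print(pattern)
print(is_valid("2-7"), is_valid("7-2"))
```

Checking the generated pattern against every pair of digits 1 to 9 is a cheap way to catch a slip such as a missing bracket.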

For two digits, you get a copy of the above with a 1 in front, for the
same-leading digit case:
    10-1[1-9] | 11-1[2-9] ...
then for, e.g. 12-35,
    (1\d-[2-9]\d)
then for 13-109 you get
    (1\d-[1-9]\d\d)
and so on and so forth.
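
To give a feel for how quickly this grows, here is a rough Python sketch that generates the alternatives for a two-digit start page and a two- or three-digit end page, following the three cases above. The function names are hypothetical, and the last alternative generalises the 13-109 example to any two-digit start.

```python
import re


def two_digit_start_pattern():
    """Build a pattern for ascending ranges whose start page has two digits."""
    alts = []
    for tens in range(1, 10):
        # Same leading digit: 10-1[1-9], 11-1[2-9], ..., 18-19
        for units in range(0, 9):
            low = units + 1
            tail = str(low) if low == 9 else f"[{low}-9]"
            alts.append(f"{tens}{units}-{tens}{tail}")
        # Larger leading digit on the right, e.g. 1\d-[2-9]\d
        if tens < 9:
            low = tens + 1
            lead = str(low) if low == 9 else f"[{low}-9]"
            alts.append(rf"{tens}\d-{lead}\d")
    # Two-digit start with a three-digit end, e.g. 13-109
    alts.append(r"[1-9]\d-[1-9]\d\d")
    return "(" + "|".join(alts) + ")"


PATTERN = re.compile(two_digit_start_pattern())


def is_ascending(span):
    return PATTERN.fullmatch(span) is not None


print(is_ascending("12-35"), is_ascending("13-109"), is_ascending("35-12"))
```

That is 90 alternatives for the two-digit starts alone, before three-digit starts or any per-book maximum are considered.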

Better would be to use XSD 1.1 and two elements or attributes, and
generate the resulting span in XSLT.
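
The idea of keeping the two numbers separate and deriving the span can be sketched as follows. This is Python rather than XSD 1.1 or XSLT, purely for illustration, and the element name `pages` and the attributes `start` and `end` are assumptions, not part of the original markup.

```python
import xml.etree.ElementTree as ET


def page_span(elem):
    """Derive a 'start-end' span from separate start and end attributes."""
    start = int(elem.get("start"))
    end = int(elem.get("end"))
    # With two integers the ordering check is a plain comparison, no regex needed.
    if not 1 <= start < end:
        raise ValueError(f"not an ascending page range: {start}-{end}")
    return f"{start}-{end}"


pages = ET.fromstring('<pages start="123" end="234"/>')
print(page_span(pages))  # prints 123-234
```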

> 1. I know the number of pages in the books in question and would like
> to specify maximum values, so that users couldn't try to enter a range
> like 456-98 for a 300-page book.
You could do this with XSLT or XQuery.

> 2. One set of entries consists of a three-volume series, so the page
> ranges are actually something like "II: 123-34", meaning "volume 2, pp
> 123-234". Each volume begins numbering the pages at 1 and I know the
> last page number for each volume. If I'm going to constrain the
> maximum page value, I'd like to do that in a way that is sensitive to
> the different lengths of each volume.
What about references to front matter? iii-xcv

Or to pre-foliation books, f.36r / f. 36v
> 
> 3. The three-volume series is a numbered set of texts, where, say
> volume 1 contains texts 1-84, volume 2 contains texts 85-147, and
> volume 3 contains texts 148-210. The xml contains the text number as
> well as the page range, and I'd like to constrain the page values to
> be credible. That is, I don't want a user to be able to try to assign
> pages for text #99 to volume 3 because I know that that text is in
> volume 2. (Or should I handle this by not having the user enter a
> volume number, and just inferring that myself from the text number?)

It depends on whether you want to catch errors (want redundancy) or
whether your users are perfect and don't make errors (then why bother
checking at all).

Given <vol>II</vol> <page>149</page>, it sounds to me like you're trying
to do too much with one schema.  Use XSLT or XQuery or COBOL on COGS or
some other mechanism to express some of these application-level
constraints.
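
As a rough idea of what such an application-level check might look like, here is a sketch in Python (XSLT or XQuery would do equally well). The text ranges per volume come from the question; the last-page numbers are invented placeholders, and `check_entry` is a hypothetical helper.

```python
ROMAN = {"I": 1, "II": 2, "III": 3}

# Text numbers per volume, as described in the question.
TEXTS = {1: range(1, 85), 2: range(85, 148), 3: range(148, 211)}

# Hypothetical last page of each volume; substitute the real values.
LAST_PAGE = {1: 300, 2: 280, 3: 320}


def check_entry(volume, text_no, start, end):
    """Return a list of problems found in one entry (empty if it looks credible)."""
    vol = ROMAN.get(volume)
    if vol is None:
        return [f"unknown volume {volume!r}"]
    problems = []
    if text_no not in TEXTS[vol]:
        problems.append(f"text {text_no} is not in volume {volume}")
    if not 1 <= start < end:
        problems.append(f"page range {start}-{end} is not ascending")
    if end > LAST_PAGE[vol]:
        problems.append(f"page {end} is past the end of volume {volume}")
    return problems


print(check_entry("II", 99, 123, 234))
print(check_entry("III", 99, 10, 20))
```

Keeping the volume number in the data and cross-checking it against the text number is the redundancy that lets the check catch a mistake in either field.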

Liam

-- 
Liam Quin - XML Activity Lead, W3C, http://www.w3.org/People/Quin/
Pictures from old books: http://fromoldbooks.org/
