Re: [jats-list] Redaction markup in BITS 2.0

From: "Gareth Oakes goakes@xxxxxxx" <jats-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Mon, 4 May 2020 05:28:12 -0000
Hi Mike,

I didn't see any other responses yet, so I'll dive in to what is a thorny
area.

Redaction marks are best represented if they all follow a standardised
format, e.g. all black boxes over suppressed text, or perhaps something
similar to MS Word change tracking, where you have a defined scheme of
underlines, strikethroughs and colours that has a consistent meaning across
the documentation set.

The next step is to assess the range of possible redaction marks. Across the
documentation set the redactions may be more complex than they appear at
first glance: for example, additional margin notes, marks overlaid on images,
special marks for tables, and so on.

Once you have these cases all figured out and systematic you may find
situations where the redaction marks are at odds with the underlying XML
structure. For example where you are generating list labels (1,2,3; a,b,c;
etc.) and these same labels are somehow redacted.

My guidance would be to use milestone-start/end wherever possible and assign
each redaction mark a unique ID. For example:
	<milestone-start id="mark1"/>redacted text<milestone-end rid="mark1"/>

Milestones work well for redactions that need to span over structural XML
elements. You are quite likely to hit a brick wall with the single container
element approach of <named-content>. You should try and tag all redactions
with the same markup scheme where possible. For edge cases you may need to
explore attributes or other mechanisms to hold the marks (e.g. to handle my
generated label example).
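
As a rough sketch of how paired milestones could be consumed downstream, the
Python below walks a document in reading order and totals the characters that
fall inside each open range, even when the range crosses paragraph
boundaries. It assumes the end milestone points back at its start through
rid (XML IDs must be unique, so the two cannot share an id value), and the
sample content and the function name redacted_lengths are invented for
illustration only.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: one redaction that starts in one <p> and ends in the next.
SAMPLE = (
    '<sec>'
    '<p>Visible <milestone-start id="mark1" content-type="redacted"/>secret end</p>'
    '<p>secret start<milestone-end rid="mark1"/> visible again</p>'
    '</sec>'
)


def redacted_lengths(root):
    """Return {start-milestone id: characters of text inside that range}."""
    open_ids = set()   # ids of ranges that have started but not yet ended
    lengths = {}

    def add_text(text):
        # Skip whitespace-only text so pretty-printing is not counted.
        if text and text.strip():
            for mark in open_ids:
                lengths[mark] += len(text)

    def walk(elem):
        if elem.tag == "milestone-start":
            mark = elem.get("id")
            open_ids.add(mark)
            lengths.setdefault(mark, 0)
        elif elem.tag == "milestone-end":
            open_ids.discard(elem.get("rid"))
        add_text(elem.text)
        for child in elem:
            walk(child)
            add_text(child.tail)   # text that follows the child, in reading order

    walk(root)
    return lengths


if __name__ == "__main__":
    print(redacted_lengths(ET.fromstring(SAMPLE)))
```

A per-mark character count like this is one possible input for sizing the
redaction box in a PDF, though the real width will depend on the font and
layout.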

For publishing you may need to somehow normalise or adjust the markup as a
pre-publish transform in order to make it easy on your XSLT or publishing
tool. Especially if you are publishing to multiple output types, this
pre-publish layer will save duplication in your other downstream
transformation and publishing logic.
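
One possible shape for such a pre-publish step, sketched in Python rather
than XSLT: it drops the milestone pairs and wraps each run of text inside a
range in its own container element, so downstream stylesheets only ever see
simple wrappers. The choice of <named-content content-type="redacted"> as
the wrapper, the rid pairing, the sample content and the function name
wrap_redactions are all assumptions for illustration; a real transform would
also need to check that the wrapper is valid wherever it lands and to deal
with redacted non-text content such as images.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: one redaction spanning two paragraphs.
SAMPLE = (
    '<sec>'
    '<p>Visible <milestone-start id="mark1"/>secret end</p>'
    '<p>secret start<milestone-end rid="mark1"/> visible again</p>'
    '</sec>'
)


def wrap_redactions(root):
    """Rewrite root in place: milestones removed, redacted text runs wrapped."""
    open_ids = set()   # ids of ranges currently open, tracked in document order

    def rebuild(elem):
        # Flatten the element's content into ordered pieces: text, child, tail, ...
        pieces = [elem.text]
        for child in list(elem):
            pieces.extend([child, child.tail])
            elem.remove(child)
        elem.text = None
        last = None   # last element re-appended, so later text becomes its tail

        for piece in pieces:
            if piece is None:
                continue
            if isinstance(piece, str):
                if open_ids and piece.strip():
                    wrapper = ET.Element("named-content",
                                         {"content-type": "redacted"})
                    wrapper.text = piece
                    elem.append(wrapper)
                    last = wrapper
                elif last is None:
                    elem.text = (elem.text or "") + piece
                else:
                    last.tail = (last.tail or "") + piece
                continue
            piece.tail = None   # the tail was already captured as its own piece
            if piece.tag == "milestone-start":
                open_ids.add(piece.get("id"))        # milestone itself is dropped
            elif piece.tag == "milestone-end":
                open_ids.discard(piece.get("rid"))   # milestone itself is dropped
            else:
                rebuild(piece)
                elem.append(piece)
                last = piece

    rebuild(root)
    return root


if __name__ == "__main__":
    print(ET.tostring(wrap_redactions(ET.fromstring(SAMPLE)), encoding="unicode"))
```

Each output format can then style or suppress the wrapper in its own way
without having to re-implement the milestone pairing logic.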

I hope that helps.

// Gareth Oakes
// Chief Architect, GPSL
// www.gpsl.co

On 2/5/20, 01:29, "Mike Dean mdean@xxxxxxxxx"
<jats-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:

    Hi all,

    I have some content with redacted text and I'm wondering what's the
    best way to represent redactions in BITS 2.0 XML. In the content, I've
    seen anything from single word redactions to large portions of
    sections redacted.

    Downstream, I'll need to be able to represent the redacted content in
    an accessible way (for example, so that a screen reader's reading
    order is correct and that it can note that some content has been
    redacted).

    Also, the redactions will need to be represented visually in a PDF,
    which will be generated from the XML. So, I'd like to be able to note
    the approximate length of the redaction (maybe by character count?).
    Then the length in the output could appear about the same length as
    the actual content would have if it was included in the PDF.

    We've considered using milestones to wrap redacted content, which
    could span multiple elements:
    <milestone-start content-type="redacted"> … <milestone-end>

    Or, we could note redacted content with a <named-content> element or
    an @specific-use attribute on an existing element.

    Any thoughts on the best way to mark this up in BITS?

    --
    Mike Dean
    Solution Architect
    Inera Inc.
    19 Flett Road, Belmont, MA 02478
    tel: 617-932-1932
    fax: 617-932-1458
    email: mdean@xxxxxxxxx
    web: www.inera.com | www.edifix.com  twitter: @eXtyles | @edifix