(replying to the list, with Nikos' consent)
Hi Nikos,
Thanks for your comments. The changes in HoBoTS with respect to BITS
reflect some of the (rather minuscule) pain points that Hogrefe and I
had with BITS, as expressed in my message to this list on Jan. 16:
http://www.biglist.com/lists/lists.mulberrytech.com/jats-list/archives/201301/msg00029.html
The changes particularly address the following points:
- allow nested tables (this is more a side effect of the following point)
- allow block-level content ('para-level-minus-x') in td, as an
alternative to inline content (I discussed this in the message cited above)
- support semi-generated ToCs (allow ToCs that only have a title and a
depth attribute, see the other message)
Besides that, we allow formatting (CSSa) and semantic (RDFa) markup in
attributes.
RDFa is intended for marking up multiple-choice tests and the like. We'd
prefer not to use JATS-style content-type attributes for that because
RDFa is more expressive and we can use the same vocabulary in HoBoTS and
in HTML (where we'll probably have JavaScript widgets that turn
RDFa-enriched books into interactive test applications).
We use CSSa for conveying the original InDesign style information and
local formatting overrides, after translating them into CSS properties
and attaching them as XML attributes. One reason for using CSSa is that
we want to be able to express formatting that should appear in the same
way in every rendering. For example, we can just pass thru the
properties for handwriting fonts (css:font-family="cursive"), extra
spacing after paragraphs (css:margin-bottom="12pt"), font color
(css:color="device-cmyk(0,1,1,0)"), or table cell backgrounds
(css:background-color="device-cmyk(0,0,0,0.2)"). They will find their way
to the HTML and EPUB renderings almost unaltered, except that they'll
translate into CSS rules or HTML style attributes, and except that color
values will be converted to RGB.
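As a rough illustration of the color conversion mentioned above, here is a minimal Python sketch. It assumes the naive, profile-free CMYK formula (r = 255 * (1 - c) * (1 - k), and likewise for g and b); the function name is hypothetical, and an actual rendering pipeline may well use ICC profiles instead.

```python
import re


def device_cmyk_to_rgb(value: str) -> str:
    """Turn 'device-cmyk(c,m,y,k)' into 'rgb(r,g,b)' with the naive formula.

    Assumption: components are numbers between 0 and 1 and no color
    profile is applied. Any other value is returned unchanged.
    """
    match = re.fullmatch(r"\s*device-cmyk\(([^)]*)\)\s*", value)
    if match is None:
        return value  # not a device-cmyk value, pass it thru
    c, m, y, k = (float(part) for part in match.group(1).split(","))
    r, g, b = (round(255 * (1 - x) * (1 - k)) for x in (c, m, y))
    return f"rgb({r},{g},{b})"
```

With this formula, the font color example from above, device-cmyk(0,1,1,0), becomes rgb(255,0,0), and the 20% black cell background becomes rgb(204,204,204).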
The good thing about CSSa is that you can start converting your typeset
or manuscript data to BITS quickly, without the need to define mappings
for every kind of formatting that may occur, and without the need to
define mappings from your content-type or style-type attributes to some
CSS for rendering. You just pass it thru.
You can later refine the conversion by recognizing some formatting as
semantically significant and then up-converting the XML that you already
have within the same schema (for example, styled-content with css
attributes → named-content without css attributes).
In most cases, a HoBoTS document can easily be transformed into a BITS
document:
If you have paragraphs in table cells, you can unwrap their contents and
place a break in between. You won't be able to preserve indentation and
vertical spacing, but this is acceptable.
If you have tables in table cells, you can wrap them in named-content.
If you have a toc like this (which is permitted in HoBoTS):
<toc depth="3">
  <title-group>
    <title>Inhaltsverzeichnis</title>
  </title-group>
</toc>
you can render the headings to a full-blown BITS toc, or you can remove it.
You can remove all CSSa and RDFa attributes (maybe after mapping them to
appropriate *-type attributes).
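The attribute removal in the last step could look roughly like the following Python sketch (a real conversion would more likely be XSLT). The CSSa namespace URI and the set of RDFa attribute names are assumptions made for illustration, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumption: CSSa attributes live in this namespace.
CSS_NS = "http://www.w3.org/1996/css"
# Assumption: an illustrative subset of RDFa attribute names.
RDFA_ATTRS = {"about", "property", "typeof", "resource",
              "datatype", "vocab", "prefix"}


def strip_cssa_and_rdfa(root):
    """Remove CSSa and RDFa attributes from root and all descendants."""
    for elem in root.iter():
        for name in list(elem.attrib):
            # ElementTree spells namespaced attributes as '{uri}local'.
            if name.startswith("{" + CSS_NS + "}") or name in RDFA_ATTRS:
                del elem.attrib[name]
    return root


sample = ET.fromstring(
    '<td xmlns:css="http://www.w3.org/1996/css" '
    'css:background-color="device-cmyk(0,0,0,0.2)" align="left">'
    '<p css:margin-bottom="12pt" property="ex:answer">Text</p></td>'
)
strip_cssa_and_rdfa(sample)
```

After the call, only the plain align attribute is left on the td in the sample. Mapping selected properties to *-type attributes before deleting them would be an additional step that is not shown here.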
So there will be some rather simple XSLT that will transform HoBoTS into
BITS, should the need arise. A high degree of BITS compatibility was one
of HoBoTS' design aims (without sacrificing compatibility with the
content structure of Hogrefe's books and with their initial strategy of
combining conventional typesetting with a sophisticated
checking/conversion infrastructure).
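To give an idea of how simple such a down-conversion step can be, here is a sketch of the first one (unwrapping paragraphs in table cells and placing a break in between), written in Python instead of XSLT. It assumes that the cell contains only p children and that element names are not namespaced; the function name is hypothetical.

```python
import xml.etree.ElementTree as ET


def unwrap_paragraphs(td):
    """Replace the p children of a cell by their contents, separated by break."""
    paras = list(td)
    if not paras or any(p.tag != "p" for p in paras):
        return td  # inline content or other block content: leave untouched
    for p in paras:
        td.remove(p)
    td.text = None  # drop whitespace that stood between the paragraphs
    last = None     # most recently appended child; following text is its tail
    for index, p in enumerate(paras):
        if index > 0:
            last = ET.SubElement(td, "break")
        if p.text:
            if last is None:
                td.text = (td.text or "") + p.text
            else:
                last.tail = (last.tail or "") + p.text
        for child in list(p):
            td.append(child)  # the child keeps its own tail text
            last = child
    return td


cell = ET.fromstring(
    "<td><p>First <italic>line</italic>.</p><p>Second line.</p></td>"
)
unwrap_paragraphs(cell)
```

In the sample, the cell ends up with the mixed content of the first paragraph, a break element, and then the text of the second paragraph. Indentation and vertical spacing are lost, as noted above.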
Let me finally make another remark regarding consistent naming and
structure. When writing XSLT that converts BITS to HTML, I found it
(unnecessarily?) cumbersome that I frequently had to distinguish cases: Is it a
content division whose title is in a title group or just a plain title?
How many variants of title groups are there (book-title-group,
book-part-meta/title-group, ...)? How many different body elements are
there (named-book-part-body, book-part/body, book-body)? How many type
attribute names are there (book-part-type, style-type, content-type, ...)?
I think book-parts, prefaces, etc. may structurally be the same as
sections (sec may also carry metadata, alt-titles, etc.). I like the
DocBook 5 approach of allowing a metadata block with the uniform name
info on every document-structure element (and also on paragraphs). The
only thing that isn't that straightforward in DocBook is that an
element's title is allowed either standalone or within an info block.
And people who develop XSLT conversions, who explore documents via XPath
or who select from a collection using XQuery will all benefit if the
number of *-type attribute names is radically reduced.
Why did the schema designers opt for this kind of redundancy? If you're
in a styled-content element, there is only one permitted *-type
attribute. Why not call it type instead of style-type?
It just came to my mind that some of these naming and content model
decisions may be due to limitations of the original schema language
(DTD). But I think DTD allows more uniform metadata modeling and naming
than currently found in BITS.
After this final remark, another one regarding schema languages: I chose
RNG for extending, restricting, and redefining parts of the original
content model. This was particularly convenient when I discovered that
there are no hooks for allowing RDFa and CSSa. I wrote an XSLT that
enhanced the attlists of all elements that previously were allowed to
carry xml:lang, abbr, or display-as attributes with something like this:
<define name="th-attlist" combine="interleave">
  <ref name="css_attributes"/>
  <ref name="Rdfa.attrib"/>
</define>
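For readers who don't use XSLT, the generation of such define patterns can be sketched in Python as follows. This is not the stylesheet I mentioned: the function names are hypothetical, and the list of attlist names is assumed to be given rather than derived from the trang output.

```python
import xml.etree.ElementTree as ET

RNG_NS = "http://relaxng.org/ns/structure/1.0"


def extension_define(attlist_name):
    """Build one define that interleaves the CSSa and RDFa attribute patterns."""
    define = ET.Element(
        f"{{{RNG_NS}}}define",
        {"name": attlist_name, "combine": "interleave"},
    )
    for pattern in ("css_attributes", "Rdfa.attrib"):
        ET.SubElement(define, f"{{{RNG_NS}}}ref", {"name": pattern})
    return define


def extension_grammar(attlist_names):
    """Collect the generated defines in a grammar element."""
    grammar = ET.Element(f"{{{RNG_NS}}}grammar")
    for name in attlist_names:
        grammar.append(extension_define(name))
    return grammar


grammar = extension_grammar(["th-attlist", "td-attlist"])
```

Because of combine="interleave", each generated define adds to the pattern of the same name in the converted schema instead of replacing it.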
Of course I could have patched the DTD itself by inserting a placeholder
parameter entity for additional global attributes, or I could have gone
to the committee and asked them to include such a global attributes
parameter entity in the first place.
But one of the beauties of RNG is that I didn't have to do string
processing of the DTD or some kind of DTD→XML transformation first. I
could use XSLT (and it's simpler than patching XSD, btw). Another beauty
of RNG is that, apart from an automatic trang DTD→RNG conversion, I
didn't have to touch the original schema in any way, even though it
lacked the extension hooks that I needed.
But what was intended as a brief message that mostly refers to my other
post has become quite lengthy. To the readers who bore with me until
here: thanks for your patience.
Gerrit
On 08.03.2013 18:07, Nikos Markantonatos wrote:
Hi Gerrit,
Thanks for this useful pointer. Have you considered submitting some or
even all of your suggested extensions to the BITS reviewing committee?
What is typically required is a brief description of each extension, the
reason that prompted you to adopt it, in what way it makes your book
encoding better and a description of what content or application may
benefit from each extension.
If you think you have a use case which others may benefit from, you
should probably suggest it, and it is possible that some of these extensions
may find their way, in one form or another, into the BITS standard soon.
A review of the extensions suggested over the past five months is taking
place later this month. This is a good opportunity to contribute your
extensions, should you wish to do so.
Best regards,
Nikos Markantonatos
Atypon
On 03/08/2013 06:13 PM, Imsieke, Gerrit, le-tex wrote:
Dear List,
We've developed a BITS customization for the Hogrefe group of
publishers. Hogrefe agreed that we make this customization publicly
available (the main ingredients are free and open anyway).
We converted BITS 0.2 to Relax NG and enriched it with RDFa and CSSa
(CSS as XML attributes), see
http://archive.xmlprague.cz/2013/presentations/Conveying_Layout_Information_with_CSSa/CSSa_xmlprague_gimsieke.html#/step-1
There's some documentation in the schema,
http://hobots.hogrefe.com/schema/hobots.rng (just view it in a browser).
There's a small sample document that somehow gives a hint as to how CSSa
comes into play: http://hobots.hogrefe.com/schema/hobots_sample.xml
You may open it in oXygen and should immediately see a validation error
against CSSa and against an embedded Schematron rule of hobots.rng.
The schema files and the sample files are included in a zip file,
http://hobots.hogrefe.com/schema/hobots.zip
We might eventually move the schema to a part of Hogrefe's svn repo that
is publicly readable, or move it to github.
I'm looking forward to your feedback.
Gerrit
--
Gerrit Imsieke
Geschäftsführer / Managing Director
le-tex publishing services GmbH
Weissenfelser Str. 84, 04229 Leipzig, Germany
Phone +49 341 355356 110, Fax +49 341 355356 510
gerrit.imsieke@xxxxxxxxx, http://www.le-tex.de
Registergericht / Commercial Register: Amtsgericht Leipzig
Registernummer / Registration Number: HRB 24930
Geschäftsführer: Gerrit Imsieke, Svea Jelonek,
Thomas Schmidt, Dr. Reinhard Vöckler