Re: [xsl] XSLT (2) namespace safe i18n patterns

Subject: Re: [xsl] XSLT (2) namespace safe i18n patterns
From: ac <ac@xxxxxxxxxxxxx>
Date: Mon, 23 Nov 2009 09:39:43 -0500
Hi Wolfgang,

I thank you for your reply and realize that this may be a challenge.

We already have an xml:lang based translation dictionary system built into the product, and we have reference tables to avoid repetition where possible. Putting everything in a translation dictionary is not feasible, and we are looking for complementary, practical options to provide more flexible i18n. For one thing, there is no limit to the size of the data to be handled, and it can come from an unbounded number of organizations. Flexibility is a requirement and uniformity a dream.

It seems to us that adding attribute i18n through namespaces can be naturally logical, flexible, appropriate for the task, and relatively simple. The only partly missing piece is namespace and/or prefix safety at matching time, in the matching function, in XSLT. The solution could use something with prefix reference and/or indexing, and/or localization code namespaces, and/or something else, but forcing everyone to restructure their XML documents is not an option. Providing them with more flexibility by supporting attribute i18n is a requirement, taking nothing away from dictionaries and reference tables.

The user knows his attributes and his translations and could provide those at will, as required. This does not prevent him from also proposing additions to the dictionary and reference tables, but he requires more flexibility, which we can easily provide with namespaces if we can ensure prefix stability, at least for namespaces corresponding to localization codes, whether statically through XML (not available) or dynamically with XSLT (the question here). Somehow, I also doubt that we will be the only ones with similar requirements.

Before we question XML's ability to be the universal information language, can we still try to use XSLT to meet these "real world" requirements? Do you already think that it is impossible?

Thank you.

Cheers,
ac




The primary purpose of namespaces is to provide distinction between
the vocabulary of different application domains. Language can be
viewed as a facet of the presentation layer and should be treated
separately.

Having all translations within a document, presumably with many
repetitions, is not attractive. A translation service might very well
be defined on top of the essential XML structure, with XPath
expressions being used to identify the items that are subject to
translation, either content or attribute values. Another approach is
to add markup to the basic document (e.g.
http://wiki.zope.org/zope3/ZPTInternationalizationSupport).

Perhaps an XML centered user list is a better forum for this discussion.

-W


On 11/23/09, ac <ac@xxxxxxxxxxxxx> wrote:
Hi Syd,

Thank you for your quick response. Yes, xml:lang was defined for i18n.
Aren't "natural" languages a reference model for namespaces?

I use xml:lang for translation dictionary applications, but it does not
seem to apply well here.  I was looking for something more optimal for
this context.  Restricting i18n to xml:lang seems to also mean that
(sizable) applications are either initially designed for i18n or not at
all, and that attributes cannot be internationalized.  I think and hope
that we can do better.

If you have 1M elements, for example with 2 different display attributes
each, the xml:lang approach would imply 2M more elements for each
supported language (e.g. 5 languages would mean 10M additional nodes, 10
times more), more elaborate processing (over, let's say 25K lines of
XSLT), changing document structure and content as well as most attribute
processing to element processing, a design "made for i18n" (e.g.
translation dictionary) rather than a design optimized for the
application at stake.

Comparatively, managing namespaces with discipline and maintaining a
stable environment so that prefixes do not effectively change, does not
seem so bad, yet I still hope to do better.  Any help appreciated.
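
One way to back that discipline with a check is sketched below. It is only an illustration, assuming that each localization code is expected to be bound to its own URI of the form http://www.somedomain.com/ followed by the code; the parameter names lang-base and langs are hypothetical. It relies on the standard XPath 2.0 function namespace-uri-for-prefix() and inspects only the bindings in scope on the document element, so a redeclaration further down the tree would go unnoticed.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumed convention: namespace URI = base + localization code -->
  <xsl:param name="lang-base" select="'http://www.somedomain.com/'"/>
  <!-- Hypothetical list of supported localization codes -->
  <xsl:param name="langs" select="('fr', 'en', 'de')"/>

  <xsl:template match="/*">
    <xsl:variable name="root" select="."/>
    <xsl:for-each select="$langs">
      <xsl:variable name="code" select="."/>
      <!-- URI the prefix is bound to on the document element, if any -->
      <xsl:variable name="bound"
          select="namespace-uri-for-prefix($code, $root)"/>
      <xsl:if test="not(string($bound) = concat($lang-base, $code))">
        <xsl:message>
          <xsl:text>Prefix '</xsl:text>
          <xsl:value-of select="$code"/>
          <xsl:text>' is not bound to </xsl:text>
          <xsl:value-of select="concat($lang-base, $code)"/>
        </xsl:message>
      </xsl:if>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

A check like this only reports drift in the prefixes; it does not by itself make the matching independent of them.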

Thank you.

Cheers,
ac


My gut instinct is that it is a less than optimal solution to try to
use namespaces to differentiate natural languages. That's what
xml:lang= is for, after all.

  <z>
    <canonical>MD</canonical>
    <name xml:lang="en">medical doctor</name>
    <name xml:lang="fr">médecin</name>
    <name xml:lang="zh-TW">...</name>
  </z>
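
For illustration, selecting the right name from a structure like this could look roughly as follows in XSLT 2.0. This is a sketch, not a tested part of any product: the stylesheet parameter lang is an assumption, and falling back to the canonical element when no translation exists is a design choice made here for the example.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <xsl:output method="text"/>

  <!-- Assumed parameter: the language requested for display -->
  <xsl:param name="lang" as="xs:string" select="'fr'"/>

  <xsl:template match="z">
    <!-- lang() tests the xml:lang in effect on each name element;
         the canonical form is used when no translation matches -->
    <xsl:value-of select="(name[lang($lang)], canonical)[1]"/>
  </xsl:template>

</xsl:stylesheet>
```

Note that lang() also accepts sublanguages, so a request for 'zh' would match the zh-TW entry.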


namespace, (XML) namespaces seemed designed to support localization
(e.g. i18n).  Namespace safety seems to dampen that somewhat, and I am
looking for an optimal pattern.  Many list members here have worked
extensively with internationalization and namespaces; can anyone help me
find an optimal pattern to handle this:

In a large XSLT2 project with lots of rich display vocabulary and
languages, we have (many different) elements that can include
display attributes like <z name="Displayed Name" .../>

To support i18n for those names, it seems natural to define
namespaces for each supported language, using the 2-letter
localization codes, as:

<global-element
    xmlns:fr="http://www.somedomain.com/fr"
    xmlns:en="http://www.somedomain.com/en"
    xmlns:de="http://www.somedomain.com/de"
    more-attributes=". . ."
 >

<!-- . . . -->

<!-- and creating corresponding attributes in the displayed elements,
like: -->
     <z name="MD" fr:name="Médecin" en:name="Medical Doctor"
more-attributes-and-content=". . ." />

<!-- . . . -->

<!-- as well as having other "context setting" elements that can define
locale, like: -->
<person lang="fr" more-attributes-and-content=". . ." />

<!-- and at display time, using the @lang attribute from the context
element (e.g. person) to match and select the "name" attribute from the
displayed element (e.g. z), in the proper namespace (e.g. fr), for
example.  Directly matching localization codes with namespace prefixes,
could provide great i18n flexibility and simplicity. -->

</global-element>

Localization codes are stable, but namespace prefixes may not be.
Changing prefixes can seriously break this scheme. What could/should be
the best way/pattern to manage this in a (namespace) safe way?
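
One possible direction, sketched here under stated assumptions, is to match on the namespace URI instead of the prefix, since the URI is what identifies an attribute regardless of how the prefix is spelled. The sketch assumes that every language has its own URI built as http://www.somedomain.com/ plus the localization code. The function loc:attr, its namespace urn:example:loc, and the parameters lang-base and lang are hypothetical names introduced for the example; in practice the language would presumably come from the @lang of the context-setting element rather than from a stylesheet parameter.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:loc="urn:example:loc"
    exclude-result-prefixes="xs loc">

  <xsl:output method="text"/>

  <!-- Assumed convention: namespace URI = base + localization code -->
  <xsl:param name="lang-base" as="xs:string"
      select="'http://www.somedomain.com/'"/>
  <!-- Assumed parameter standing in for the context element's @lang -->
  <xsl:param name="lang" as="xs:string" select="'fr'"/>

  <!-- Hypothetical helper: value of attribute $name on $e in the
       namespace for $code, else the unqualified attribute -->
  <xsl:function name="loc:attr" as="xs:string?">
    <xsl:param name="e" as="element()"/>
    <xsl:param name="name" as="xs:string"/>
    <xsl:param name="code" as="xs:string"/>
    <xsl:variable name="ns" select="concat($lang-base, $code)"/>
    <xsl:sequence select="
        ($e/@*[local-name() = $name and namespace-uri() = $ns],
         $e/@*[local-name() = $name and namespace-uri() = ''])[1]/string()"/>
  </xsl:function>

  <xsl:template match="z">
    <xsl:value-of select="loc:attr(., 'name', $lang)"/>
  </xsl:template>

</xsl:stylesheet>
```

Because the comparison uses namespace-uri() and local-name(), a document that binds the same URI to a different prefix would be handled the same way. The cost is that the localization code must be recoverable from the URI, not from the prefix.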
