Dear Michael,
You are pointing in a direction that I'd also recommend: XML catalogs to
map the schema locations to local files and then an XProc pipeline to
validate against the corresponding schema and to transform the documents
into your format. If the renderer from your format to, say, XHTML is
also XSLT-based, you could include it in the pipeline.
XML catalogs by themselves won't help you in mapping namespaces to schema
locations. Rather, they map (public, canonical, vendor-defined, …) URLs
to where these schema documents reside on your hard disk.
Associating a schema with a document by an element's namespace is more in
the domain of NVDL. You can use vendor-specific extensions [1] to
perform NVDL validation in XProc. Alternatively, you can maintain a
lookup table (or a lookup XML document) yourself and use XSLT to read
your document, identify the top-level element's namespace URI and load
the appropriate XML schema. Documents with an XSD schema will often
tell you their schema location in an @xsi:schemaLocation attribute, so
you might be able to skip the lookup for these documents.
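Since your team knows JAXP, here is a rough sketch of the same
lookup-table idea in plain Java, just to illustrate the steps: read the
top-level element's namespace URI, look up the schema and stylesheet,
then validate and transform. The class name, the table entries, the
urn:example:... namespace and the file paths are all made up for
illustration; they are not taken from your setup.

```java
import java.io.File;
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;

final class NamespaceLookup {

    /** Hypothetical table entry: local schema and stylesheet for one namespace. */
    static final class Resources {
        final String schemaPath;
        final String stylesheetPath;

        Resources(String schemaPath, String stylesheetPath) {
            this.schemaPath = schemaPath;
            this.stylesheetPath = stylesheetPath;
        }
    }

    /** Returns the namespace URI of the top-level element ("" if it has none). */
    static String rootNamespace(String xml) {
        XMLInputFactory factory = XMLInputFactory.newFactory();
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, Boolean.FALSE);
        try {
            XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
            try {
                while (reader.hasNext()) {
                    // Stop at the first start tag; the rest of the document is not read.
                    if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                        String ns = reader.getNamespaceURI();
                        return ns == null ? "" : ns;
                    }
                }
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("not well-formed XML", e);
        }
        throw new IllegalArgumentException("no root element found");
    }

    /** Looks up the resources for a namespace; unknown namespaces are rejected. */
    static Resources lookup(Map<String, Resources> table, String namespace) {
        Resources r = table.get(namespace);
        if (r == null) {
            throw new IllegalArgumentException("no entry for namespace: " + namespace);
        }
        return r;
    }

    /** Validates doc against the XSD, then transforms it (validate() throws if invalid). */
    static void validateAndTransform(File doc, Resources r, File out) throws Exception {
        SchemaFactory sf = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        Schema schema = sf.newSchema(new File(r.schemaPath));
        schema.newValidator().validate(new StreamSource(doc));
        // The JDK's built-in processor is XSLT 1.0 only; for XSLT 2 put Saxon on the classpath.
        Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new File(r.stylesheetPath)));
        t.transform(new StreamSource(doc), new StreamResult(out));
    }

    public static void main(String[] args) {
        Map<String, Resources> table = new HashMap<>();
        // Made-up entry; a real table would be loaded from configuration.
        table.put("urn:example:authority-a",
                new Resources("schemas/a.xsd", "xslt/a-to-generic.xsl"));
        String doc = "<r:report xmlns:r=\"urn:example:authority-a\"/>";
        Resources r = lookup(table, rootNamespace(doc));
        System.out.println(r.schemaPath + " / " + r.stylesheetPath);
    }
}
```

The table could just as well live in an XML document read by XSLT or
XProc, as described above; the Java version only shows that the lookup
itself is a small piece of code.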
Of course, tasks like these might be solved with other technologies, such
as custom Java programs, shell scripts, etc. But XProc is a nice
orchestrating technology to keep everything maintainable, just as
XPath/XSLT 2 is a nice technology to perform the lookups.
Although you mention XSLT in your post and XSLT might be part of the
solution, your question at its core might be slightly off-topic for this
list, and neither you nor I would want to raise Tommie Usdin's ire for
discussing off-topic issues here. [2]
So I'm suggesting that we take this discussion off-list, look at some of
your input documents and at your neutral format and sketch a solution
(that might or might not include XProc, NVDL, XSLT 2 – but probably will!).
I'm not trying to sell our services through this list, but I think that
we might actually help you with setting up the XProc/XSLT part.
Your developers can invoke Calabash or Saxon from your Java server apps.
Btw, Norman Walsh is currently writing a RESTful Calabash service [3],
and there's also Florent Georges' Servlex to turn the lookup/conversion
tasks into Web apps.
And I'm confident that if you go the XProc route, your Java programmers'
expertise will not be lost. They may for example become prolific
Calabash extension programmers besides integrating XProc into the server
apps.
Gerrit
[1] http://xmlcalabash.com/docs/reference/pxp-nvdl.html
[2] http://www.mhonarc.org/archive/html/xsl-list/2008-04/msg00478.html
[3] https://twitter.com/ndw/status/394668472741146625
On 28.10.2013 19:49, Michael Schäfer wrote:
Dear all,
We (public statistics) are operating a generic XML-based data
collection process. Now we are starting to receive namespaced XML
documents from a number of authorities. Those documents must be
validated and then transformed into our generic format. In order
to achieve this, we would like to set up a configurable system
that looks up the respective XML Schema and XSLT stylesheet by
namespace, and performs both validation and transform.
Given that we have seasoned Java programmers with JAXP experience,
we are thinking in the direction of a Java server app, but this
could be a case for trying out something different such as XProc
and extending our tool set. Also, we're wondering if XML catalogs
could be useful.
We'd be very grateful for any suggestion and helpful information.
Regards,
Michael
--
Gerrit Imsieke
Geschäftsführer / Managing Director
le-tex publishing services GmbH
Weissenfelser Str. 84, 04229 Leipzig, Germany
Phone +49 341 355356 110, Fax +49 341 355356 510
gerrit.imsieke@xxxxxxxxx, http://www.le-tex.de
Registergericht / Commercial Register: Amtsgericht Leipzig
Registernummer / Registration Number: HRB 24930
Geschäftsführer: Gerrit Imsieke, Svea Jelonek,
Thomas Schmidt, Dr. Reinhard Vöckler