[xsl] Is XML/XSL the right direction and if so where next.

Subject: [xsl] Is XML/XSL the right direction and if so where next.
From: Matt Gushee <mgushee@xxxxxxxxxxxxx>
Date: Wed, 14 Mar 2001 11:12:13 -0700 (MST)
Sara Christie writes:

 > Can I have an XML file which can be administered from MS Word (that
 > being the tool of choice of the administrator) but which is also used
 > with stylesheets (XSL) to display the content on the web. Any helpful
 > suggestions on how to achieve this and alternative methods would be
 > greatly appreciated. I am very keen to not re-invent the wheel nor to
 > attempt the impossible.

Do you have the resources to create a custom Visual Basic application?
If so, recent versions of Word (since 97, I guess) provide a VB object 
model that allows you to manipulate everything in a document. So you
could use this object model together with MSXML (be sure to get the
most recent version from msdn.microsoft.com/xml/) to:

 1) Output a "dumb" XML representation of the object tree
    (e.g., a VB Paragraph object would become a
    <Paragraph></Paragraph> element).

 2) Use XSLT to transform the above to "smart" XML (i.e., XML that
    describes the structure and semantics of your data).
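
A minimal sketch of steps 1 and 2, under loud assumptions: Python's
standard library stands in for the VB object model and MSXML, the
paragraph list, style names, and element names (Document, report,
title, para) are all hypothetical, and step 2 is written as plain
Python rather than XSLT only to keep the example dependency-free.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for what a VB macro might read from Word's object
# model: one (style name, text) pair per paragraph.
PARAGRAPHS = [
    ("Heading 1", "Quarterly Report"),
    ("Normal", "Sales rose in the first quarter."),
    ("Normal", "Costs stayed flat."),
]

# Hypothetical mapping from Word style names to semantic element names.
STYLE_TO_ELEMENT = {"Heading 1": "title", "Normal": "para"}


def to_dumb_xml(paragraphs):
    """Step 1: mirror the object tree, one <Paragraph> per paragraph."""
    root = ET.Element("Document")
    for style, text in paragraphs:
        p = ET.SubElement(root, "Paragraph", {"style": style})
        p.text = text
    return root


def to_smart_xml(dumb_root):
    """Step 2: rename elements by style so the markup carries meaning.

    In the approach described here this step would be an XSLT stylesheet;
    plain Python is an illustrative substitute.
    """
    root = ET.Element("report")
    for p in dumb_root.findall("Paragraph"):
        name = STYLE_TO_ELEMENT.get(p.get("style"), "para")
        ET.SubElement(root, name).text = p.text
    return root


dumb = to_dumb_xml(PARAGRAPHS)
smart = to_smart_xml(dumb)
print(ET.tostring(dumb, encoding="unicode"))
print(ET.tostring(smart, encoding="unicode"))
```

The point of the split is that step 1 knows nothing about your data,
while everything specific to your documents lives in the mapping used
by step 2.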

Then of course you would need to be able to reverse the process to
output Word docs from XML.
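
The reverse direction might look roughly like this. Again a sketch
only: the semantic element names and the mapping back to Word style
names are hypothetical, and the final step of building a Word
document from the resulting elements (which would go through Word's
object model) is not shown.

```python
import xml.etree.ElementTree as ET

# Hypothetical semantic ("smart") document; element names are illustrative.
SMART = (
    "<report>"
    "<title>Quarterly Report</title>"
    "<para>Sales rose in the first quarter.</para>"
    "</report>"
)

# Hypothetical mapping from semantic element names back to Word style names.
ELEMENT_TO_STYLE = {"title": "Heading 1", "para": "Normal"}


def to_dumb_xml(smart_root):
    """Turn each semantic element back into a styled <Paragraph>."""
    root = ET.Element("Document")
    for child in smart_root:
        # Unknown elements fall back to the "Normal" style (an assumption).
        style = ELEMENT_TO_STYLE.get(child.tag, "Normal")
        p = ET.SubElement(root, "Paragraph", {"style": style})
        p.text = child.text
    return root


dumb_again = to_dumb_xml(ET.fromstring(SMART))
print(ET.tostring(dumb_again, encoding="unicode"))
```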

I've tried this with a very simple PowerPoint presentation. It was
surprisingly easy. Word would be a bit more challenging, since Word
documents can include just about anything, and their high-level
structure is much less predictable than PowerPoint's.

There are a few open questions with this approach:

* Is there a need to store formatting information?

  As you may have realized, the Microsoft HTML format that you didn't
  like is designed to do exactly that -- i.e., it is designed to
  produce a document that can be opened on anyone's desktop anywhere
  (or in Internet Explorer) and look exactly like it did to the person
  who created it.

  If your organization uses standardized templates and styles (or in
  the unlikely event that no one cares how the output documents are
  formatted), then you should be able to strip out all the formatting
  and store only the content.

  If formatting needs to be preserved on a per-document basis, then
  you're probably better off using Microsoft's HTML/XML output, ugly
  and convoluted though it is. You could run the resulting HTML files
  through HTML Tidy to make them into XHTML, then run an XSLT
  transform to create something more sensible from the XHTML.

* How do you deal with graphics, especially embedded bitmap graphics?

* How do you enforce the workflow:
      Word -> dumb XML -> smart XML -> dumb XML -> Word
  ... i.e., how do you ensure that all data is stored in smart XML,
  and that the dumb XML and Word documents -- especially the latter
  -- are transient, used only for user input and output?

Matt Gushee
Englewood, CO, USA

 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list

