Omprakash,
Over on XML-Dev an entirely different thread is running (again -- it's a
permathread) that bears on the same issue you asked about.
The context is XML binarization. Steve DeRose (whose name you will find on
the XPath 1.0 Rec) writes:
I think most people would consider a format "lossless" if you could export
from it back into XML syntax, and when you parsed the resulting XML you
got the same DOM as for the original document. If that's enough, it's not
hard to make a lossless binary format (and mine was lossless, except I
think it discarded comments and PIs). HOWEVER, this is not completely
lossless. You still lose (among other things):
* the entity structure
* being able to get a matching DSIG
* all sorts of really ugly whitespace normalization details (including
within tags)
* single- versus double-quoting of attributes
* namespace prefix usage
* order of attributes
* <br /> versus <br></br>
So, until you define "lossless", there's no point in comparing whether two
products are lossless or not.
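To make this concrete, here is a minimal Python sketch (the language, the
sample strings, and the helper same_tree are my own assumptions, not anything
from the thread). It uses the standard library's ElementTree as a stand-in
for a DOM: two serializations that differ in attribute quoting, attribute
order, whitespace inside a tag, and empty-element form parse to the same
tree, so the parsed model cannot tell you which one the author typed.

```python
import xml.etree.ElementTree as ET


def same_tree(a, b):
    """Compare two elements by tag, attributes, text, tail, and children."""
    return (
        a.tag == b.tag
        and a.attrib == b.attrib              # dict comparison ignores order
        and (a.text or "") == (b.text or "")
        and (a.tail or "") == (b.tail or "")
        and len(a) == len(b)
        and all(same_tree(x, y) for x, y in zip(a, b))
    )


# Hypothetical sample documents: same content, different surface syntax.
DOC_A = '<p class="x" id="1">line<br />end</p>'
DOC_B = "<p id='1'   class='x'>line<br></br>end</p>"

tree_a = ET.fromstring(DOC_A)
tree_b = ET.fromstring(DOC_B)

trees_match = same_tree(tree_a, tree_b)    # the parsed models agree
bytes_match = DOC_A == DOC_B               # the source texts do not

# Writing the tree back out does not recover the original spelling of DOC_B.
reserialized = ET.tostring(tree_b, encoding="unicode")
```

An editor that works only on the parsed tree will therefore rewrite such
details whenever it saves, which is exactly the kind of "loss" in question.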
These are all aspects you may find problematic in developing an XML editor.
(Well, some of them you really might not think about.) I should think it
would be quite a trick to determine how close to the code the user should
be, in any given case. This is one reason why we see several different
varieties of XML editors in the market, ranging from tree editors all the
way to souped-up plain text editors -- they all take different approaches
to this problem.
An XSLT transform will have "lossiness" issues with all of the above, and
most XML editors I know of use XSLT, if at all, only to go one way (from
source to presentation), not back upstream.
Online, in the web environment, it looks like the wiki approach (deploy a
lightweight, task-specific non-XML markup language and parse that in back)
is as favored as any these days. XSLT (especially armed with 2.0 regexps)
might actually do well in that scenario. But it assumes the user will allow
a significant gap between WYS (what you see) and WYG (what you get).
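As a rough illustration of the wiki approach, here is a small sketch in
Python rather than XSLT 2.0 (a swap made only to keep the example short and
runnable). The markup rules ("= " for a title, *asterisks* for emphasis,
blank lines between paragraphs) and the element names doc, title, para, and
emph are all invented for the example.

```python
import re
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def wiki_to_xml(source):
    """Convert a tiny, made-up wiki syntax into an XML string."""
    # Blocks are separated by one or more blank lines.
    blocks = [b.strip() for b in re.split(r"\n\s*\n", source) if b.strip()]
    parts = []
    for block in blocks:
        # Normalize whitespace, then escape &, <, > before adding any tags.
        text = escape(" ".join(block.split()))
        if text.startswith("= "):
            parts.append("<title>%s</title>" % text[2:])
        else:
            text = re.sub(r"\*([^*]+)\*", r"<emph>\1</emph>", text)
            parts.append("<para>%s</para>" % text)
    return "<doc>%s</doc>" % "".join(parts)


sample = "= Hello\n\nSome *bold* text & more."
result = wiki_to_xml(sample)
root = ET.fromstring(result)   # parsing confirms the output is well-formed
```

Note how much the user has to take on trust here: what they type (asterisks)
is not what ends up stored (an emph element), which is the WYS/WYG gap.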
Allowing users to edit HTML and then converting that back into XML -- I
think this would be possible in theory only if:
* you enhance HTML in order to capture info you need (your enhancements
might be disguised as 'class' attributes and what not)
* you have serious rules set up constraining what this enhanced HTML is
(both less and more than arbitrary HTML)
* you have some way of validating to those rules
* you have some way to represent, communicate, and teach those rules and
encourage/help/force users to conform to them
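A minimal sketch of what the first three conditions could look like, again
in Python and under loud assumptions: the RULES table, the class values, and
the target element names are hypothetical, and the input is assumed to be
well-formed XHTML so that a plain XML parser can read it. Each allowed
(tag, class) pair maps to one XML element; anything else is rejected, which
is the "less than arbitrary HTML" part.

```python
import xml.etree.ElementTree as ET

# Hypothetical rule set: (HTML tag, class attribute) -> XML element name.
RULES = {
    ("div", "section"): "section",
    ("h2", "title"): "title",
    ("p", "para"): "para",
    ("span", "term"): "term",
}


def html_to_xml(node):
    """Validate one enhanced-HTML element and convert it, recursively."""
    key = (node.tag, node.get("class"))
    if key not in RULES:
        raise ValueError("not allowed: <%s class=%r>" % key)
    out = ET.Element(RULES[key])
    out.text = node.text
    out.tail = node.tail
    for child in node:
        out.append(html_to_xml(child))
    return out


src = ('<div class="section"><h2 class="title">T</h2>'
       '<p class="para">A <span class="term">x</span> b</p></div>')
converted = ET.tostring(html_to_xml(ET.fromstring(src)), encoding="unicode")
```

The fourth condition (teaching the rules to users) is of course not something
code like this addresses; all it can do is refuse input that breaks them.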
As you can see these are daunting tasks, to say nothing of their impact on
the design and maintenance of the XML format in back. (If your writers like
HTML, you might be tempted to go with valid XHTML, perhaps with some local
semantic conventions, and try to live with that.)
Another approach I'm a bit surprised not to see more of is the pure Java
editor set up in quasi-'WYSIWYG' fashion for a single tag set -- a
dedicated DocBook or mini-DocBook editor, say, or TEI ultra-lite. We may
see more of this kind of thing in time.
Wrapping up data delivered by forms in tags one way or another has been
done as long as XML has been around (actually longer).
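For completeness, the form-wrapping approach can be sketched in a few lines
of Python. The field names, the record element, and the function form_to_xml
are hypothetical; a real form handler would supply the dictionary.

```python
import xml.etree.ElementTree as ET

# Hypothetical list of fields the form may deliver, in output order.
ALLOWED_FIELDS = ("title", "author", "body")


def form_to_xml(form):
    """Wrap submitted form fields in elements; unknown fields are ignored."""
    record = ET.Element("record")
    for name in ALLOWED_FIELDS:
        if name in form:
            ET.SubElement(record, name).text = form[name]
    # The serializer escapes &, <, and > in the field values.
    return ET.tostring(record, encoding="unicode")


wrapped = form_to_xml({"title": "A & B", "author": "W", "junk": "x"})
```

This works well as long as each field is flat text; mixed content inside a
field brings back all the problems above.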
As you can see, this opens entire landscapes of issues that are broader
than XSL. IMO, XML authorship will remain an issue as long as people want
perfect transparency in their media (which they do, or at least think they
do). (The true artist knows that transparency is achieved through
arrangements of the opaque; but that's a different issue.)
Cheers,
Wendell
======================================================================
Wendell Piez mailto:wapiez@xxxxxxxxxxxxxxxx
Mulberry Technologies, Inc. http://www.mulberrytech.com
17 West Jefferson Street Direct Phone: 301/315-9635
Suite 207 Phone: 301/315-9631
Rockville, MD 20850 Fax: 301/315-8285
----------------------------------------------------------------------
Mulberry Technologies: A Consultancy Specializing in SGML and XML
======================================================================