Subject: RE: Future of DSSSL: What about PDF?
From: "Didier PH Martin" <martind@xxxxxxxxxxxxx>
Date: Sat, 6 Mar 1999 15:15:08 -0500

Hi Avi

<YourComment>
On Saturday, March 06, 1999 06:56, Carlos Villegas
[SMTP:cav@xxxxxxxxxxxxxx] wrote:
>
> As I understand the page-sequence FOs are not currently implemented by
> Jade. Though, James has said, implementing the front end part of it
> should be relatively easy, however no backend can currently support
> the DSSSL page model.
>
> Where can I get some examples on using the page-sequence FOs? It's
> difficult for me to understand how they should work from the spec.
> And since there are no tools to play with...
>

If you have any specifics in mind, post, though I believe there is only
one person on this mailing list who can give authoritative answers. It
would be interesting to see how others interpret the scriptures.

> I thought about attempting to design such a formatter some time ago.
>
> However, it's been only that, just an attempt!

Was it an actual attempt? or just a thought? :-)

> One of the key points, of course, is to implement the DSSSL
> page model and the synchronization flow objects from the start,
> then a simple-page-sequence would be implemented in terms of the
> more general page-sequence. There's no gain on not implementing
> the page model, we already have that.
> However, after reading the spec several times about the page
> model, I need some practical examples, to make sure I have
> the correct interpretation.
>
</YourComment>

<Reply>
In fact, my own opinion on the subject (shaped partly by having to live
with Jade code maintenance and modifications) is that what is missing
is truly versatile grove objects. Look closely at what a DSSSL engine
does:

a) it parses the SGML/XML document and constructs a grove;
b) from this grove it (theoretically) constructs a new one.

The latter is a formatting object (FO) grove, as opposed to the former,
which is a document grove. So, basically, we have to deal with groves.

Most grove interfaces that I have seen equate "concept object" with
"code object" instead of building a general mechanism on good software
practice, such as implementing the grove with the Composite pattern
(Gamma et al.). The result is that we have to multiply entities. Let me
explain.

If you look at a document's hierarchy, or at the concept of a grove, it
is simply a hierarchy of objects, and a property set is associated with
each object. The Composite design pattern simply means that all objects
inherit from a basic object used to manipulate a collection, or
implement an interface for manipulating a collection. So imagine an
object that has methods to manipulate a collection of objects; you can
then say that this object is implicitly a collection of objects. The
usual collection methods could be implemented in this object:

- add an element (the element is another object, and therefore we can
  create a tree)
- remove an element
- delete an element
- update an element
- find or get an element

You then define this object as an interface, either with the
constructs a language offers for that (interfaces in Java, virtual
members in C++) or with an interface definition language (CORBA, ILU or
DCOM). If you define it with a CORBA, ILU or DCOM IDL, for example, you
can map the interface to several languages and implement it in
whichever language you prefer. Any client that needs to work with the
grove does so through this interface. With the right object middleware,
the client could be implemented in any language. Thus DCOM, ILU and
CORBA are all good candidates for such an interface definition.

OK, now we have a tree of objects and methods to manipulate the grove.
For the document grove each object corresponds to a markup element, and
for the formatting object grove each object obviously corresponds to a
formatting object :-) However, something is missing for both groves:
the property set.

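As a rough illustration of the collection methods listed above, here is a minimal Python sketch of a composite grove node. The class name GroveObject and its method names are hypothetical and are not taken from any existing grove implementation; the point is only that one generic class can stand for a node of either grove.

```python
class GroveObject:
    """Hypothetical generic grove node: every node is also a collection of nodes."""

    def __init__(self, name):
        self.name = name
        self.parent = None
        self._children = {}  # key -> GroveObject, in insertion order

    def add(self, key, child):
        """Attach a child under the given key; this is what builds the tree."""
        child.parent = self
        self._children[key] = child
        return child

    def remove(self, key):
        """Detach and return the child stored under the given key."""
        child = self._children.pop(key)
        child.parent = None
        return child

    def get(self, key):
        """Return the child stored under the given key."""
        return self._children[key]

    def count(self):
        """Return the number of direct children."""
        return len(self._children)


# The same class serves a document grove...
doc = GroveObject("MyDocument")
chapter = doc.add("chapter-1", GroveObject("Chapter"))
chapter.add("para-1", GroveObject("Paragraph"))

# ...and a formatting object grove, with no new class per concept.
fo_root = GroveObject("simple-page-sequence")
fo_root.add("p1", GroveObject("paragraph"))
```
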
In both the document grove and the formatting object grove, each object
is associated with a property set. So we need a new kind of object: a
property set, with methods like:

- add a property
- remove a property
- delete a property
- update a property
- find or get a property

We have the same pattern here: a collection pattern. However, this is
not the Composite pattern, because the members of the collection are
not themselves collections. Property sets are not hierarchies. The main
difference is that a grove object contains other grove objects and is
therefore structured as a hierarchy, whereas a property set contains
property values and therefore does not form a tree.

We then have something like this (let's try a graphical representation
here :-)

object ........................ property set
  |____ object ................ property set
  |____ object ................ property set
          |___ object ......... property set

Thus a grove object has methods to manipulate the object hierarchy, and
a property set has methods to manipulate the collection of properties
and to get and set property values. The property set interface above
did not include general get/set methods for property values, so let's
correct this:

- add a property
- remove a property
- delete a property
- update a property
- find or get a property
- get a property value
- set a property value

Here is the real advantage of such an interface: with a minimal set of
methods you can get or set any property that is a member of a property
set. As an example, let's say that the grove object is a formatting
object and that we want to set the "Font-Size" property. We would make
a call like

    PropertySet.Put("Font-Size", Font_size_value)

and get the value with:

    Font_size_value = PropertySet.Get("Font-Size")

Of course, the mapping of the interface could differ for any particular
language. The real advantage of this kind of interface is what I would
call the Occam's razor concept (don't create entities ad infinitum!).

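To make the Put/Get idea concrete, here is a small Python sketch of a flat property set attached to a grove node. The PropertySet and GroveObject classes, their method names and the "12pt" value are illustrative assumptions only.

```python
class PropertySet:
    """Hypothetical flat collection of named property values (not a tree)."""

    def __init__(self):
        self._values = {}

    def add(self, name, value=None):
        """Add a new property; adding an existing name is treated as an error."""
        if name in self._values:
            raise ValueError("property already present: " + name)
        self._values[name] = value

    def remove(self, name):
        """Remove a property from the set."""
        del self._values[name]

    def get(self, name):
        """Return the value of a property."""
        return self._values[name]

    def put(self, name, value):
        """Set the value of a property, creating it if needed."""
        self._values[name] = value

    def names(self):
        """Return the names of all properties in the set."""
        return list(self._values)


class GroveObject:
    """Hypothetical grove node: holds child nodes plus exactly one property set."""

    def __init__(self, name):
        self.name = name
        self.children = []
        self.properties = PropertySet()


paragraph = GroveObject("paragraph")
paragraph.properties.put("Font-Size", "12pt")  # "12pt" is an illustrative value
font_size = paragraph.properties.get("Font-Size")
```

No FontSize class or font-specific accessor is needed: any new property is just another key in the set.
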
You don't have to create a new object for each new kind of conceptual
object; the interface is general enough that any conceptual object can
be mapped to objects implemented in a language. You can also provide
collection enumerators to browse a particular collection.

In fact, we created this kind of grove object, and I'll introduce it
here. It can be used to manipulate any "grove concept", such as a
directory service or a structured document (both share the same
structure, and both can be mapped to the grove concept). We first
mapped the object to a DCOM interface. I should recall that DCOM is not
solely Microsoft's property: it is also available on Unix platforms
from The Open Group (http://www.opengroup.org). The Open Group is a
consortium with rules similar to those of the W3C, and membership is
based on yearly fees, as it is for the W3C. Workgroups can be formed,
and specs are published by the workgroups. Like the W3C, it is not as
open as the IETF groups, and the elaboration of its specs is restricted
to members.

We are also working on implementing the "general grove" interface in
Java, ILU and CORBA, and I'll post a document explaining this in the
near future. Unlike the W3C DOM, it is not restricted to documents but
can also be used for other hierarchical constructs such as system
directories (e.g. LDAP, NDS).

So, first we have the "Grove Object", which supports the IObject
interface. Because it is an interface, it can be implemented in any
language. Concretely, with DCOM it could easily be implemented in C++,
VB, Delphi (Pascal) and Java. Mapping it to other languages would
require defining the interface with ILU, which can be mapped to
languages like Scheme.

ILU is made by Xerox and is freely available (including source code).

    interface IObject : IUnknown
    {
        HRESULT ParseDisplayName([in] BSTR bstrPath, [out] ULONG* pchEaten,
                                 [in] REFIID riid, [out] LPVOID* ppvObj);
        HRESULT AddComponent([in] BSTR bstrKey, [out, retval] IObject** ppObject);
        HRESULT RemoveComponent([in] BSTR bstrKey);
        HRESULT GetComponent([in] BSTR bstrKey, [out, retval] IObject** ppObject);
        HRESULT get_Count([out, retval] ULONG* puCount);
        HRESULT get_Name([out, retval] BSTR* pbstrName);
        HRESULT put_Name([in] BSTR bstrName);
        HRESULT get_Parent([in] REFIID riid, [out, retval] LPVOID* ppObject);
        HRESULT put_Parent([in] LPOBJECT parent);
    };

ParseDisplayName takes as input a display name that designates an
element of the hierarchy and returns that element. Each object is part
of a particular name space. For example, we created a hierarchical name
space, named TNS, based on the URN and URC schemes. So to get a
particular object you would make a call like:

    GroveObject = Document.ParseDisplayName("urn:tns:MyDocument/Chapter(1)/Paragraph(2)")

and get the second paragraph of the document's first chapter. You can
use IDs instead of numbers (if the markup includes IDs). Someone could
choose to implement the interface with the XPointer name space instead
of this one. The idea is that requests are made with strings, as we do
with URLs. The main difference is that in this case the string is a
Uniform Resource Name, which is location independent (URLs are not).
With each URN is associated a set of Uniform Resource Characteristics
(URCs), each of which is equivalent to a property contained in a
property set. A URL is, in this case, just one particular URC or
property.

Thus each grove object is uniquely identified by a URN (Uniform
Resource Name, RFC 2141), and a grove object has a single property set
object associated with it. Each property in the property set is also
called a URC (Uniform Resource Characteristic).

We have just merged the IETF name space concept with grove object
queries.

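Here is a minimal Python sketch of how such a display name might be resolved. It assumes, based only on the example above, that each step has the form Name(n) and selects the n-th child (counting from 1) with that name; the function parse_display_name and the GroveObject class are hypothetical and do not reproduce the actual TNS implementation.

```python
import re


class GroveObject:
    """Hypothetical grove node with a name and an ordered list of children."""

    def __init__(self, name, children=None):
        self.name = name
        self.children = list(children or [])


# One path step such as "Chapter(1)": a name followed by a 1-based index.
_STEP = re.compile(r"^(?P<name>[^()]+)\((?P<index>\d+)\)$")


def parse_display_name(root, display_name, prefix="urn:tns:"):
    """Resolve e.g. 'urn:tns:MyDocument/Chapter(1)/Paragraph(2)' to a node."""
    if not display_name.startswith(prefix):
        raise ValueError("unsupported name space: " + display_name)
    head, *steps = display_name[len(prefix):].split("/")
    if head != root.name:
        raise LookupError("unknown document: " + head)
    node = root
    for step in steps:
        match = _STEP.match(step)
        if match is None:
            raise ValueError("malformed step: " + step)
        wanted = match.group("name")
        index = int(match.group("index"))
        same_name = [child for child in node.children if child.name == wanted]
        if not 1 <= index <= len(same_name):
            raise LookupError("no such element: " + step)
        node = same_name[index - 1]
    return node


first = GroveObject("Paragraph")
second = GroveObject("Paragraph")
chapter = GroveObject("Chapter", [first, second])
document = GroveObject("MyDocument", [chapter])

found = parse_display_name(document, "urn:tns:MyDocument/Chapter(1)/Paragraph(2)")
```

A different name space (XPointer, LDAP) would only need a different parser behind the same call.
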
We can also merge it with the directory services name space concept. It
is plausible to envision a grove mapped onto an LDAP or NDS name space.
A grove object that is part of an LDAP name space would then be
designated with a display name such as:

    GroveObject = Document.ParseDisplayName("LDAP://MyDomain.com/paragraph=1,chapter=2,document=MyDocument")

Of course, current LDAP implementations do not have the concept of a
non-typed object and require a specific schema for each contained
object (the same goes for NDS). This leads to two choices: a) create a
schema for each DTD object, or b) create an LDAP implementation that
supports non-typed objects. Anyway, that is not the point here. What we
should retain is that a grove object is also included in a name space,
more particularly a hierarchical name space, and that each object is
thus uniquely identified by a URN and each property is a URC.

The other members are self-explanatory:

- AddComponent = add a grove object to the collection (as a sub grove
  object, see the previous figure)
- RemoveComponent = remove a grove object from the collection
- get_Count = return the number of grove objects contained in the
  collection
- get_Name = return the grove object's name (or the name space's
  particular context)
- put_Name = set the grove object's name
- get_Parent = get the grove object's parent, which is also a grove
  object
- put_Parent = set the grove object's parent, which should also be a
  grove object

When you create a grove object you also create its associated property
set object. DCOM allows you to define multiple interfaces on one object
(today this is particular to DCOM, but CORBA 3 will allow multiple
interfaces too, and the Mozilla group's XPCOM also supports the notion
of multiple interfaces).

So to obtain the property set interface you do a QueryInterface on the
grove object, and the latter returns an IPropertySet interface, which
is defined as:

    interface IPropertySet : IDispatch
    {
        HRESULT AddProperty([in] BSTR bstrKey, [in] VARIANT Value);
        HRESULT RemoveProperty([in] BSTR bstrKey);
        HRESULT GetProperty([in] BSTR bstrKey, [out, retval] VARIANT* Value);
        HRESULT ModifyProperty([in] BSTR bstrKey, [in] VARIANT Value);
    };

If I map this to EcmaScript, getting or setting a property would be:

    FontSize = groveObject.GetProperty("Font-Size");
    groveObject.ModifyProperty("Font-Size", FontSize);

- AddProperty = add a new property to the property set
  (ex: groveObject.AddProperty("Font-Size", FontSize) )
- RemoveProperty = remove a property from the property set
  (ex: groveObject.RemoveProperty("Font-Size") )
- GetProperty = get a particular property from the property set
  (ex: FontSize = groveObject.GetProperty("Font-Size") )
- ModifyProperty = set a particular property in the property set
  (ex: groveObject.ModifyProperty("Font-Size", FontSize) )

To enumerate both collections, enumeration interfaces are provided:

    interface IEnumObject : IUnknown
    {
        HRESULT Next([in] ULONG celt, [out] LPOBJECT* rgelt,
                     [out] ULONG* pceltFetched);
        HRESULT Skip([in] ULONG celt);
        HRESULT Reset(void);
        HRESULT Clone([out, retval] IEnumObject** ppenum);
    };

    interface IEnumProperty : IUnknown
    {
        HRESULT Next([in] ULONG celt, [out] BSTR* rgelt,
                     [out] ULONG* pceltFetched);
        HRESULT Skip([in] ULONG celt);
        HRESULT Reset(void);
        HRESULT Clone([out, retval] IEnumProperty** ppenum);
    };

Thus in VBScript you could enumerate a grove object collection (i.e.
the equivalent of children-object). Sorry for using VBScript instead of
ECMAScript; it is only that this is easier with the former.

    For Each groveObject In ParentGroveObject
        ' do something
    Next

Here it is.

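The Next/Skip/Reset/Clone contract of the enumeration interfaces can be sketched in a few lines of Python. The Enumerator class below is a hypothetical stand-in for illustration, not a binding to the DCOM interfaces above; it walks a plain list of keys.

```python
class Enumerator:
    """Hypothetical cursor over a collection, mirroring Next/Skip/Reset/Clone."""

    def __init__(self, items, position=0):
        self._items = items
        self._position = position

    def next(self, count):
        """Return up to `count` items and advance past them."""
        fetched = self._items[self._position:self._position + count]
        self._position += len(fetched)
        return fetched

    def skip(self, count):
        """Advance past `count` items without returning them."""
        self._position = min(self._position + count, len(self._items))

    def reset(self):
        """Move the cursor back to the first item."""
        self._position = 0

    def clone(self):
        """Return an independent cursor at the same position."""
        return Enumerator(self._items, self._position)


children = ["chapter-1", "chapter-2", "chapter-3"]  # illustrative keys

cursor = Enumerator(children)
first = cursor.next(1)   # fetch one item
copy = cursor.clone()    # independent cursor, same position
cursor.skip(1)           # step over "chapter-2"
rest = cursor.next(5)    # asks for 5, gets what is left
```

A scripting language's "for each" construct is essentially a loop over next(1) until nothing more is fetched.
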
When I was still in the research center (before having no life in our
company startup :-), I noticed the conceptual similarities between:

a) the Composite pattern, which is used to manipulate whole-part
   structures
b) directory services like LDAP or NDS, which are hierarchical
   structures
c) structured documents, which are also hierarchical structures

In fact, it is a common reflex for humans to chunk things (probably
because of the constraints of our short-term memory; ref: "the magic
seven", also known as the Miller principle (1956): our short-term
memory can process 7 +/- 2 elements simultaneously). Thus the
whole-part structure is a basic reflex for us: we chunk things into
hierarchies or whole-part constructs.

Up to now, most grove or DOM implementations have had the reflex of
mapping each conceptual object to a language object, and have not used
the generic Composite design pattern. A design pattern is simply a
"best design practice", and this best practice seems to be ignored by
most grove or whole-part constructs. This is also because most of these
designs are based on a particular language instead of an interface, and
because Occam is not there to remind us not to create entities ad
infinitum :-). However, the intent was also to let the language
validate the types. The cons: you have to create an object for each
conceptual object, and so you multiply entities. Because of this, you
cannot use the "processingInstruction" object for a directory service
"organizationalUnit" object. We gain the benefit of type validation but
lose the advantage of simplicity and of a simple interface that could
be used to browse diverse objects. For example, our grove object
interface could be used to browse directories from a high-level object
such as Country down to a particular paragraph in a specific document.
Doing so with strongly typed objects requires too much complexity (and
remember, Occam doesn't like us to create entities ad infinitum!).

If you use a Composite pattern type of interface, you soon discover
that it is easy to remember, easy to use and quite powerful. DCOM and,
in the near future, CORBA allow you to create generic interfaces for
the Composite pattern interface definition. In particular,
QueryInterface as used in DCOM and XPCOM removes the constraints of
strict inheritance and allows an object to present certain interfaces
or not, depending on a particular context (no, you cannot do that with
strict inheritance). This mechanism allows you to create objects with
multiple personalities or facets. Inheritance forces an IS_A
relationship for the inherited characteristics; the multiple interface
mechanism allows a COULD_BE relationship based on the client type. As
an example, a client may have the right to access certain interfaces
only in a particular context, such as grove object enumeration if
there are sub-objects, or none if there are no sub-objects.

Thus the object you are talking about is in fact a formatting object
grove. What is missing is good interfaces for this grove object,
preferably language-independent and platform-independent ones. DCOM,
ILU and CORBA make good candidates for this.

I hope I gave you a good response and spoke with enough authority :-)

Regards
Didier PH Martin
mailto:martind@xxxxxxxxxxxxx
http://www.netfolder.com

DSSSList info and archive: http://www.mulberrytech.com/dsssl/dssslist