Subject: Re: [xsl] grouping + global variable (?) (was re: regexs, grouping (?) and XSLT2?)
From: Deirdre Saoirse Moen <deirdre@xxxxxxxxxxx>
Date: Fri, 13 Aug 2004 20:05:26 -0700 (PDT)
On Fri, 13 Aug 2004, Wendell Piez wrote:

> Your project sounds very ambitious. Up-conversion is a challenging and
> fascinating business, which we're all going to learn much more about.
> You have several conference papers' worth of material here, I bet.

I'm hoping so. Quite frankly, I hadn't realized we were so cutting edge. :)

Ultimately, my goal is to provide an application that integrates with the text file (written using the user's text processor of choice). When the user wants to submit a manuscript, the application performs all the necessary generation of the document (including cover letter), using user-specific information about how they want the document to appear, including any market- or genre-specific styles. Press a button, out pops the PDF or RTF.

For now, I'll settle for PDF. :)

I'd already written the submission manager and am trying to integrate the work of another person into the project. Thus my struggle to understand.

> At 08:15 PM 8/12/2004, you wrote:
>> But I've been thinking, based on the comments from the list, that a
>> better process might be eliminating the perl script entirely.
>
> Maybe: but you'll need something at least as good to do the work it's
> doing, and Perl is really good at regular-expressions and string processing
> generally.
>
> (Personally I might have tried it in Python, but that's mainly because I
> can count the lines of Perl I've written in my life on one hand. Of course,
> I can count in binary on my hands, which gets me higher than five.)

I didn't write the Perl script, thus my frustration (as a Python person). My partner-in-crime and I have come at the problem from entirely different directions.

> Now it has some regexp support, XSLT 2.0 should be at least a credible
> option here, but its features have yet to be stress-tested TMK and
> tools support is still somewhat up in the air. (I believe Mike Kay is
> speaking on this very topic at XML 2004 this November in Washington
> DC.)
OK, that's what I'd been beginning to understand based on list comments. I wasn't aware of the tool support problem.

> A split-down-the-middle option could be to write a little function
> library in the language of your choice to do the upconversion
> string-processing, and call out to it from your XSLT using extension
> functions. (This is what I kind of imagined would happen five years
> ago, but it turns out processor-dependent extension functions are
> unfashionable these days.)

This is an intriguing option. 99% of the problem comes from documents saved in the native platform that aren't correctly tagged. I'm not quite certain what to do about this so that the editing is transparent. Yet.

I feel moderately confident that this might make it a more contiguous process, which would also require fewer installed pieces in order to work.

> >I'm not sure I'd
> >want to eliminate the intermediate XML file, though.
>
> I think having the intermediate format will prove to be good design in
> any case.

OK.

> >Option 3 seems to be ruled out based on my current toolchain
> >(apache-FOP), which probably eliminates #2 as well. (I could easily be
> >wrong on this)
>
> Apache Xalan-J has support for a node-set function, so you could use
> option 2 if you wanted. It will even recognize it in the exslt.org
> namespace, which is nice.

Neat.

> >So, my question (you knew there was one): can someone give me a
> >description of how to accomplish #4, given the workflow I've got, using
> >something like Saxon? I see that it's an XSLT processor, but I don't get
> >the map of how all the pieces fit together. Right now, I know (after
> >having looked) that I'm using xalan for the simple reason that it came
> >with my apache-fop install.
>
> Saxon is well-liked by developers (it runs well, it's conformant, and
> it has good error messages), and can be switched in for Xalan in your
> toolchain if you prefer it. Saxon also supports exslt:node-set, so you
> can use option #2 with it as well.
Well, I can see if it offers me more options. I know enough to figure out how to wrest it into the toolchain.

> As I mentioned, it has an extension attribute, saxon:next-in-chain, that
> can be invoked for pipelining. IIRC it passes SAX events between processor
> invocations (Mike?), so it's much faster than writing a file and reparsing,
> though perhaps not quite as fast as passing unserialized trees, as options
> 2 and 3 would do.

Right now, I'm running a script daily that re-generates XML files from any changed text files in a given directory tree. The generation of a PDF is on request, with re-generation of the XML if it's needed. So part A (txt->xml) doesn't necessarily happen when part B (xml->pdf) does.

Nevertheless, you've given me another idea, which I'll try over this weekend.

> I am reasonably sure Xalan offers similar features, however, or the Cocoon
> framework does.

Cocoon seems very interesting, but I don't quite get where it fits into the overall picture of things, though I am reading up on it.

> >I'd also eventually like to get a decent RTF output. Standard manuscript
> >prose is not terribly complex, so something that supported basic features
> >should suffice for that. Unfortunately, the commercial options are too
> >expensive for the intended audience. Is jfor likely to be my best
> >available option?
>
> I'd be interested to hear myself from the list on this question. I haven't
> yet myself seen a really nice route to RTF. I think two passes to this
> (analogous to the way IBM deployed a "TeXML" which could be targeted as a
> route to TeX) might be the best way to do it: have yet another tag set that
> describes only the formatting primitives supported by RTF and a utility
> stylesheet to make RTF out of that. Or use XSL-FO, if any of the formatters
> can make decent RTF yet.

jfor hasn't been updated at all in over a year, so it seems like a dead project. And jfor.org is down.
I should add that I *do* need API access rather than a standalone application.

-- 
_Deirdre
web: http://deirdre.net
blog: http://deirdre.org/blog/
yarn: http://fuzzyorange.com
cat's blog: http://fuzzyorange.com/vsd/
"Memes are a hoax! Pass it on!"
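
For readers following the workflow described in this message (a daily job that re-generates XML only for changed text files, with PDF generation on request), the staleness check might be sketched roughly as below. This is only an illustration under assumptions that are not in the message: each `.txt` file is assumed to have a sibling `.xml` file with the same base name, "changed" is assumed to mean a newer modification time, and the function names `needs_regeneration` and `stale_sources` are hypothetical. The actual txt->xml conversion (the Perl script in the thread) is not shown.

```python
from __future__ import annotations

from pathlib import Path


def needs_regeneration(txt_path: Path, xml_path: Path) -> bool:
    """Return True when the XML is missing or older than its text source.

    Assumption: modification time is a good enough signal of "changed".
    """
    if not xml_path.exists():
        return True
    return txt_path.stat().st_mtime > xml_path.stat().st_mtime


def stale_sources(root: Path) -> list[Path]:
    """List every .txt under root whose sibling .xml is missing or out of date.

    Assumption: the XML for foo.txt lives next to it as foo.xml.
    """
    return sorted(
        txt
        for txt in root.rglob("*.txt")
        if needs_regeneration(txt, txt.with_suffix(".xml"))
    )
```

The same `needs_regeneration` check could be reused by the on-request PDF step to decide whether part A (txt->xml) has to run before part B (xml->pdf).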