Subject: Re: [xsl] Grouping of text input file lines
From: Michael Kay <mike@xxxxxxxxxxxx>
Date: Sun, 11 Aug 2013 18:44:52 +0100

I've generally done this using your second approach: convert each line to an element and then use group-starting-with to group them. In XSLT 3.0 we're allowing patterns to match atomic values, so you can do group-starting-with on a sequence of strings.

Michael Kay
Saxonica

On 11 Aug 2013, at 15:46, Wolfgang Laun wrote:

> I'll briefly describe the problem and outline two approaches to a
> solution. I'd be pleased to receive a comment or two.
>
> The task is to convert a plain text file to XML using XSLT 2.0. The
> text file contains lines, all according to
>    tag: value
> and these lines are grouped at three levels: "database", "relation"
> and "field", where each entity has some options and one or more
> children of the lower level (except for field, of course).
>
> Example, indentation according to nesting level:
>
> node: abc # a DB option
> key: CMOS # a DB option
> rel: rlo_one
> com: a relation # a relation option
> alg: direct # a relation option
> ele: fa int
> com: blurb # element (field) options
> def: 0
> acc: px
> acc: py
> ele: fb chars
> com: bla bla
> def: "----"
> alg: permute
> num: 100 # a relation option
> rel: rlo_two
> com: another relation # a relation option
> com: more comment
> com: yet more comment
> ele: fx int
> com: blurb
> def: 0
> acc: px
> ele: fy int
> com: bla bla
> def: 42
> num: 50 # a relation option
>
> The expected XML structure is obvious, I think: a sequence of DB
> options and relation elements; these contain relation options and
> field elements, which contain field options. Field order must not be
> changed. "com" entries should be joined while observing line breaks,
> and "acc" entries too, but joined with a space.
>
> The first basic idea I used throughout is to maintain another string
> sequence in parallel to the one containing the text lines. That
> sequence contains just the tags, so that index-of can be used to
> compute "interesting" line numbers. This way, subsequences of lines
> for all or individual relations and fields can be conveniently
> extracted.
>
> The second idea is to use grouping. The sequence of lines is converted
> to a sequence of nodes <tag>value</tag> and a nested
> group-starting-with separates relations and fields - almost. As you
> can see, there's some leading lines defining DB options, and each
> relation contains option lines before and after the element groups.
> Most likely, cherry-picking lines and line groups prior to the
> glorious for-each-group has to be done using the technique described
> above.
>
> Any better ideas?
> Thanks
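
For illustration, the second approach (one element per line, then a nested group-starting-with) could be sketched in XSLT 2.0 roughly as follows. This is only a sketch under stated assumptions, not the stylesheet either poster actually used. The parameter name input-uri, the file name input.txt, the named template main and the output element names database, relation and field are all hypothetical. It also assumes that the "#" remarks in the example are annotations in the mail rather than part of the file, and that every tag is purely alphabetic so it can serve as an element name.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <!-- Hypothetical parameter: URI of the plain text input file. -->
  <xsl:param name="input-uri" as="xs:string" select="'input.txt'"/>

  <xsl:output method="xml" indent="yes"/>

  <!-- Named entry point, since there is no XML source document. -->
  <xsl:template name="main">

    <!-- Step 1: turn each "tag: value" line into <tag>value</tag>.
         Lines that do not match the regex (e.g. blank lines) are dropped. -->
    <xsl:variable name="lines" as="element()*">
      <xsl:for-each select="tokenize(unparsed-text($input-uri), '\r?\n')">
        <xsl:analyze-string select="." regex="^\s*([A-Za-z]+):\s*(.*)$">
          <xsl:matching-substring>
            <xsl:element name="{regex-group(1)}">
              <xsl:value-of select="regex-group(2)"/>
            </xsl:element>
          </xsl:matching-substring>
        </xsl:analyze-string>
      </xsl:for-each>
    </xsl:variable>

    <!-- Step 2: outer grouping starts a new group at every <rel>. -->
    <database>
      <xsl:for-each-group select="$lines" group-starting-with="rel">
        <xsl:choose>
          <xsl:when test="self::rel">
            <relation name="{.}">
              <!-- Step 3: inner grouping starts a new group at every <ele>. -->
              <xsl:for-each-group select="current-group()[position() gt 1]"
                                  group-starting-with="ele">
                <xsl:choose>
                  <xsl:when test="self::ele">
                    <field name="{.}">
                      <xsl:copy-of select="current-group()[position() gt 1]"/>
                    </field>
                  </xsl:when>
                  <xsl:otherwise>
                    <!-- Relation options that precede the first <ele>. -->
                    <xsl:copy-of select="current-group()"/>
                  </xsl:otherwise>
                </xsl:choose>
              </xsl:for-each-group>
            </relation>
          </xsl:when>
          <xsl:otherwise>
            <!-- Leading DB options that precede the first <rel>. -->
            <xsl:copy-of select="current-group()"/>
          </xsl:otherwise>
        </xsl:choose>
      </xsl:for-each-group>
    </database>
  </xsl:template>

</xsl:stylesheet>
```

The sketch deliberately stops at the "almost" noted in the quoted message: relation options that follow the last "ele" group (such as "num: 100") end up inside the final field element, because nothing in the grouping pattern marks where a field ends. Separating them would need extra knowledge about which tags are field options, which the example does not fully settle. Joining the "com" and "acc" entries is also left out.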