Re: [xsl] Applying XSL transformation to non-xml (but fixed structure) file

Subject: Re: [xsl] Applying XSL transformation to non-xml (but fixed structure) file
From: Dimitre Novatchev <dnovatchev@xxxxxxxxx>
Date: Wed, 2 Jun 2010 06:00:41 -0700
On Wed, Jun 2, 2010 at 3:32 AM, Christian Schouten
<C.Schouten@xxxxxxxxxx> wrote:
> Hi all,
>
> I need to apply an XSL transformation to a non-xml file that has a fixed
> structure.
> The goal is to read in the file, add/edit/delete a record and write it
> back.

The FXSL library provides a complete, generic LR(1) parser and a tool
(a modified YACC) that takes the BNF rules for an LR(1) language and
produces a set of rules and tables in XML format for the generic parser
to use.

I have used these to create a range of parsers, from toy arithmetic
expressions to JSON to XPath 2.0. The effort involved is comparable to
that of writing a parser with YACC.

I have provided an example of using the function f:lrParse() to parse
a JSON instance in my blog:

  http://dnovatchev.spaces.live.com/blog/cns!44B0A32C2CCF7488!367.entry

f:lrParse can be downloaded with FXSL or can be viewed here:

  http://fxsl.cvs.sourceforge.net/fxsl/fxsl-xslt2/f/func-lrParse.xsl?view=markup&sortby=date

The parser for JSON can be downloaded with FXSL or can be viewed here:

  http://fxsl.cvs.sourceforge.net/viewvc/fxsl/fxsl-xslt2/f/func-json-document.xsl?revision=1.11&view=markup&sortby=date

The parsing tables for JSON, generated by YACCX, can be downloaded with
FXSL or can be viewed here:

  http://fxsl.cvs.sourceforge.net/viewvc/fxsl/fxsl-xslt2/f/parseTables-Jason.xml?revision=1.1&view=markup&sortby=date

YACCX can be downloaded as part of FXSL or can be viewed here:

  http://fxsl.cvs.sourceforge.net/viewvc/fxsl/fxsl-xslt2/Tools/YACCX/?sortby=date


--
Cheers,
Dimitre Novatchev
---------------------------------------
Truly great madness cannot be achieved without significant intelligence.
---------------------------------------
To invent, you need a good imagination and a pile of junk
-------------------------------------
Never fight an inanimate object
-------------------------------------
You've achieved success in your field when you don't know whether what
you're doing is work or play



>
> A sample file (start to finish) is as below:
> ===
> package Endpoints;
> #generated from Decision Table
> import bre.Endpoint;
> rule "Endpoints #1: (Endpoint.urlEndpoint =='\"https://a.b.c/d\";')"
>
>
>     salience 0
>     when
>         endpoint:Endpoint(urlEndpoint==
> "https://a.b.c/d";)
>     then
>         endpoint.setStatus("OK");
> end
>
> rule "Endpoints #2: (Endpoint.urlEndpoint =='\"https://w.x.y/z\";')"
>
>
>     salience 0
>     when
>         endpoint:Endpoint(urlEndpoint==
> "https://w.x.y/z";)
>     then
>         endpoint.setStatus("OK");
> end
> ===
>
> How would I best approach this? My thoughts were:
> * Open file (inside a jar)
> * Skip three-line header
> * Use analyze-string/matching-substring to split into records defined as
> something like "^rule \"Endpoints #[A-Za-z0-9:;/]*end$"
> * Use string analysis functions to split into fields urlEndpoint and
> Status
> * Magically end up with
> <Endpoints>
>   <Endpoint><urlEndpoint>https://a.b.c/d</urlEndpoint><Status>OK</Status></Endpoint>
>   <Endpoint><urlEndpoint>https://w.x.y/z</urlEndpoint><Status>OK</Status></Endpoint>
> </Endpoints>
> * Perform requested operation (remove item from tree, add item to tree
> etc.)
> * Write back changed file (inside the jar)
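
The steps listed above (skip the header, split into records, split each
record into fields, build the XML tree) can be sketched as follows. This is
an illustration in Python rather than XSLT, and only a minimal sketch: the
names SAMPLE, PACKAGE, RECORD, FIELDS and parse_rules are hypothetical, the
sample assumes tab indentation and the record layout from the template
quoted in this message, and reading the file from inside a jar is left out.
A similar regular expression could probably be handed to xsl:analyze-string,
but the XPath regex dialect differs from Python's and would need checking.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical sample following the quoted record template
# (tab indentation, no stray ";" after the URL).
SAMPLE = '''package Endpoints;
#generated from Decision Table
import bre.Endpoint;
rule "Endpoints #1: (Endpoint.urlEndpoint =='\\"https://a.b.c/d\\"')"
\t
\tsalience 0
\twhen
\t\tendpoint:Endpoint(urlEndpoint== "https://a.b.c/d")
\tthen
\t\tendpoint.setStatus("OK");
end

rule "Endpoints #2: (Endpoint.urlEndpoint =='\\"https://w.x.y/z\\"')"
\t
\tsalience 0
\twhen
\t\tendpoint:Endpoint(urlEndpoint== "https://w.x.y/z")
\tthen
\t\tendpoint.setStatus("OK");
end
'''

# The table name comes from the "package" line of the header.
PACKAGE = re.compile(r'^package\s+(\w+);', re.M)

# A record runs from a line starting with "rule" to a line that is just "end".
RECORD = re.compile(r'^rule\b.*?^end[ \t]*$', re.M | re.S)

# Inside a record: class name, condition name/value, action name/value.
FIELDS = re.compile(
    r'when\s+\w+:(\w+)\(\s*(\w+)==\s*"([^"]*)"\)'
    r'\s*then\s+\w+\.set(\w+)\("([^"]*)"\);'
)


def parse_rules(text):
    """Turn the rule file text into an element tree, one child per record."""
    root = ET.Element(PACKAGE.search(text).group(1))
    for record in RECORD.finditer(text):
        match = FIELDS.search(record.group(0))
        if match is None:
            continue  # skip records that do not fit the assumed layout
        cls, cond, value, action, result = match.groups()
        item = ET.SubElement(root, cls)
        ET.SubElement(item, cond).text = value
        ET.SubElement(item, action).text = result
    return root


XML_TEXT = ET.tostring(parse_rules(SAMPLE), encoding="unicode")
```

With this sample, XML_TEXT holds the Endpoints document from the "Magically
end up with" step, serialized on a single line. Records that do not match
the assumed layout are skipped silently, which a real tool would more
likely want to report.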
>
> The file header is made up as: package $tableName;\n#generated from
> Decision Table\nimport bre.$className;
> Each record is made up as: rule "$tableName #1:
> ($className.$conditionName =='\"$conditionValue\"')"\n\t\n\tsalience
> 0\n\twhen\n\t\tendpoint:Endpoint(\n?urlEndpoint==
> "$conditionValue")\n\tthen\n\t\t$objectName.set$actionName("$actionValue
> ");\nend\n\n
>
> So far, I can come up with the theory up to splitting the file into
> records that are delimited by the word 'rule' at the start of a line and
> the word 'end' on its own line, and I can come up with a definition for
> how a record is made up of fields. Actually splitting the records into
> fields within XSL however is too much black magic for me right now. If
> anybody could share his/her thoughts that'd be most appreciated...
>
> Best regards,
>
> Christian C. Schouten
