[xsl] RE: Applying XSL transformation to non-xml (but fixed structure) file

Subject: [xsl] RE: Applying XSL transformation to non-xml (but fixed structure) file
From: "Christian Schouten" <C.Schouten@xxxxxxxxxx>
Date: Fri, 4 Jun 2010 16:36:43 +0200
I finally got it working, and it all turned out to be a lot easier than
I'd expected :) I even got away without using analyze-string and
lengthy regexes :)

The key was to read in the file with unparsed-text(), tokenize it on the
'rule' keyword that starts each record so that the header can be skipped
by position, and then re-tokenize each resulting string on the
double-quote character (&quot;). Given the fixed layout of the file, I
could simply state that fields 6 and 8 were the ones to be reused.
Reusable? Maybe not, but functional, definitely :)

Should anyone be interested:
---
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:fn="fn"
exclude-result-prefixes="xs fn">
 <xsl:output method="xml" indent="yes" />
 <xsl:param name="tableName">Endpoints</xsl:param>
 <xsl:param name="className">Endpoint</xsl:param>
 <xsl:param name="objectName">endpoint</xsl:param>
 <xsl:param name="conditionName">urlEndpoint</xsl:param>
 <xsl:param name="actionName">Status</xsl:param>
 <xsl:param name="filename" select="'file:/D:/DTAAR/Endpoints.drl'" />
 <xsl:param name="encoding" select="'iso-8859-1'" />
 <xsl:template match="/">
  <xsl:variable name="lines" select="tokenize(unparsed-text($filename,
$encoding), '\r?\nrule')" as="xs:string+" />
  <xsl:element name="rules">
   <xsl:for-each select="$lines[position() gt 1]">
    <xsl:element name="rule">
     <xsl:variable name="lineItems" select="tokenize(., '&quot;')"
as="xs:string+" />
     <xsl:choose>
      <!-- Standard layout -->
      <xsl:when test="count($lineItems) eq 9">
       <xsl:element name="conditionValue">
        <xsl:value-of select="$lineItems[6]" />
       </xsl:element>
       <xsl:element name="actionValue">
        <xsl:value-of select="$lineItems[8]" />
       </xsl:element>
      </xsl:when>
      <!-- Wildcard (or other single character) -->
      <xsl:otherwise>
       <xsl:element name="conditionValue">
        <xsl:value-of
select="substring-before(substring-after($lineItems[2], ''''), '''')" />
       </xsl:element>
       <xsl:element name="actionValue">
        <xsl:value-of select="$lineItems[4]" />
       </xsl:element>
      </xsl:otherwise>
     </xsl:choose>
    </xsl:element>
   </xsl:for-each>
  </xsl:element>
 </xsl:template>
</xsl:stylesheet>
---
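
To see why fields 6 and 8 are the ones that carry the values, a small
debugging sketch along the following lines can list every token with its
position. It is only an illustration under assumptions: the record is
pasted inline into a hypothetical $record variable instead of being read
with unparsed-text(), and the <tokens>/<token> element names are made up
for this example. Like the stylesheet above, it ignores its input
document, so it can be run against any XML file.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">
 <xsl:output method="xml" indent="yes" />
 <!-- Hypothetical inline sample: one record as it looks after the
      first split on '\r?\nrule' (note the leading space) -->
 <xsl:variable name="record" as="xs:string"> "Endpoints #1: (Endpoint.urlEndpoint =='\"https://a.b.c/d\"')"
            salience 0
            when
                        endpoint:Endpoint(urlEndpoint==
"https://a.b.c/d")
            then
                        endpoint.setStatus("OK");
end
</xsl:variable>
 <xsl:template match="/">
  <tokens>
   <!-- Split on the double-quote character, as in the stylesheet above -->
   <xsl:for-each select="tokenize($record, '&quot;')">
    <!-- pos is the index used in $lineItems[...]; whitespace is
         collapsed only to keep the listing readable -->
    <token pos="{position()}">
     <xsl:value-of select="normalize-space(.)" />
    </token>
   </xsl:for-each>
  </tokens>
 </xsl:template>
</xsl:stylesheet>
```

For this sample the split yields nine tokens: token 6 holds
https://a.b.c/d and token 8 holds OK, which is what the "standard layout"
branch relies on.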

Kind regards,

Christian C. Schouten
-----Original Message-----
From: Christian Schouten
Sent: Wednesday 2 June 2010 12:32
To: 'xsl-list@xxxxxxxxxxxxxxxxxxxxxx'
Subject: Applying XSL transformation to non-xml (but fixed structure)
file

Hi all,

I need to apply an XSL transformation to a non-xml file that has a fixed
structure.
The goal is to read in the file, add/edit/delete a record and write it
back.

A sample file (start to finish) is as below:
===
package Endpoints;
#generated from Decision Table
import bre.Endpoint;
rule "Endpoints #1: (Endpoint.urlEndpoint =='\"https://a.b.c/d\"')"


            salience 0
            when
                        endpoint:Endpoint(urlEndpoint==
"https://a.b.c/d")
            then
                        endpoint.setStatus("OK");
end

rule "Endpoints #2: (Endpoint.urlEndpoint =='\"https://w.x.y/z\"')"


            salience 0
            when
                        endpoint:Endpoint(urlEndpoint==
"https://w.x.y/z")
            then
                        endpoint.setStatus("OK");
end
===

How would I best approach this? My thoughts were:
* Open file (inside a jar)
* Skip three-line header
* Use analyze-string/matching-substring to split into records defined as
something like "^rule \"Endpoints #[A-Za-z0-9:;/]*end$"
* Use string analysis functions to split into fields urlEndpoint and
Status
* Magically end up with
  <Endpoints>
    <Endpoint><urlEndpoint>https://a.b.c/d</urlEndpoint><Status>OK</Status></Endpoint>
    <Endpoint><urlEndpoint>https://w.x.y/z</urlEndpoint><Status>OK</Status></Endpoint>
  </Endpoints>
* Perform requested operation (remove item from tree, add item to tree
etc.)
* Write back changed file (inside the jar)

The file header is made up as:
package $tableName;\n#generated from Decision Table\nimport bre.$className;

Each record is made up as:
rule "$tableName #1: ($className.$conditionName =='\"$conditionValue\"')"\n\t\n\tsalience 0\n\twhen\n\t\tendpoint:Endpoint(\n?urlEndpoint== "$conditionValue")\n\tthen\n\t\t$objectName.set$actionName("$actionValue");\nend\n\n
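
For the record layout above, the analyze-string step might be sketched
roughly as follows. This is a hedged illustration, not a tested solution:
the input-uri parameter and its default value are hypothetical, the output
element names are hard-coded from the desired result listed earlier, and
the regular expression assumes every record has a quoted condition value
followed by a single quoted argument to the setter. Reading from inside a
jar and writing the file back are not covered.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">
 <xsl:output method="xml" indent="yes" />
 <!-- Hypothetical parameter: location of the rules file -->
 <xsl:param name="input-uri" as="xs:string" select="'Endpoints.drl'" />
 <xsl:template match="/">
  <Endpoints>
   <!-- Group 1: the quoted value after '==' in the 'when' part.
        Group 2: the quoted argument of the set...() call in 'then'.
        The header and the rule title never match, because there
        '==' is followed by a single quote, not a double quote. -->
   <xsl:analyze-string select="unparsed-text($input-uri)"
       regex='==\s*"([^"]*)"\)\s*then\s*\w+\.set\w+\("([^"]*)"\)'>
    <xsl:matching-substring>
     <Endpoint>
      <urlEndpoint><xsl:value-of select="regex-group(1)" /></urlEndpoint>
      <Status><xsl:value-of select="regex-group(2)" /></Status>
     </Endpoint>
    </xsl:matching-substring>
    <!-- No xsl:non-matching-substring: all other text is dropped -->
   </xsl:analyze-string>
  </Endpoints>
 </xsl:template>
</xsl:stylesheet>
```

The regex attribute is delimited with single quotes so that the
double-quote characters can appear in it unescaped; it contains no curly
braces, which would otherwise have to be doubled because the attribute is
an attribute value template.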

So far, I can work out the theory up to splitting the file into records
that are delimited by the word 'rule' at the start of a line and the
word 'end' on its own line, and I can come up with a definition for how
a record is made up from fields. Actually splitting the records into
fields within XSL, however, is too much black magic for me right now. If
anybody could share his/her thoughts, that'd be most appreciated...

Best regards,

Christian C. Schouten
