Subject: [xsl] converting Word dictionary to FLEx
From: "Jim Albright jim_albright@xxxxxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Tue, 18 Sep 2018 21:47:37 -0000

Using Saxon 9.8.0.12 in Oxygen
Style sheet version="2.0"

Problem domain is getting a dictionary created in Word with only <p>s, <span>s and <b>, and <i> along with some color added to some spans.

In plain text it looks like:

#-a (dem. adj. of proximity) variant of -ad #a-1(+a.f./i.a. verb) 1. so, in order that perhaps <D2> 2. (particle introducing a.f., indicating `near future' or `future possibility') <Asp1.19> <D2> Variant Forms: ad-(+a.f./i.a. verb) (in 1st person singular and third person plural) 1. so, in order that perhaps 2. (particle introducing a.f., indicating `near future' or `future possibility') riI# ad-ftuI# I want to go. ira a-t-ia:r He wants to see it. a-ka-(+a.f.) if only a(d)-ur-(+a.f./i.a.) so, lest, in order that perhaps not (also introduces neg. imp.: "Do not...") a-ur-imil-(+a.f./i.a.) perhaps, in order that, in the hope that; lest, maybe it would happen that ad-ukJ7an- (+a.f./i.a.) 1. when, as soon as <Asp1.24> <Na3.10.6> 2. just, repeatedly <Na3.16.2> ad-ur- (+a.f./i.a.) so, lest, in order that perhaps not (also introduces neg. imp.: "Do not...") Variant Forms: ad-

Turn this into a flat file suitable to import into a dictionary processing program called FLEx. Something like:

\lx -a
\gi (dem. adj. of proximity)
\vao -ad

\lx a-
\hm 1
\co (+a.f./i.a. verb)
\sn 1
\de so, in order that perhaps
\so <D2>
\sn 2
\gi (particle introducing a.f., indicating `near future' or `future possibility')
\so <Asp1.19>
\so <D2>
\sh Variant Forms:
\va ad-
\co (+a.f./i.a. verb)
\gi (in 1st person singular and third person plural)
\sn 1
\de so, in order that perhaps
\sn 2
\gi (particle introducing a.f., indicating `near future' or `future possibility')
\xv riI# ad-ftuI#
\xe I want to go.
\xv ira a-t-ia:r
\xe He wants to see it.
\va a-ka-
\co (+a.f.) if only
\va a(d)-ur-
\co (+a.f./i.a.)
\de so, lest, in order that perhaps not
\gid (also introduces neg. imp.: "Do not...")
\va a-ur-imil-
\co (+a.f./i.a.)
\de perhaps, in order that, in the hope that; lest, maybe it would happen that
\va ad-ukJ7an-
\co (+a.f./i.a.)
\sn 1
\de when, as soon as
\so <Asp1.24>
\so <Na3.10.6>
\sn 2
\de just, repeatedly
\so <Na3.16.2>

I have processed the HTML output from Word into the following snippet:

\entry_number 00001
\lx -a
\vernacular FALSE
\grammatical_info dem. adj. of proximity)
\variant_of -ad

\entry_number 00002
\lx a-
\hm 1
\vernacular FALSE
\co (+|ga a.f.|r |ga i.a.|r verb)
\senseStart 1
\definition so, in order that perhaps
\source D2
\senseStart 2
\grammatical_info particle introducing |ga a.f.|r , indicating `near future' or `future possibility')
\source Asp1.19
\source D2
\sectionHead Variant Forms:
\variant ad-
\co (+|ga a.f.|r |ga i.a.|r verb
\grammatical_info in 1|sup st|r person singular and third person plural)
\senseStart 1
\definition so, in order that perhaps
\senseStart 2
\grammatical_info particle introducing |ga a.f.|r , indicating `near future' or `future possibility')
<<<<<< above is correct
\example riI#I want to go. <<<<<< what I get
\example iraHe wants to see it.
\example riI# ad-ftuI# <<<<< what I am looking for. I need two more words here. ad-ftuI#
\translation I want to go.
\example ira a-t-ia:r
\translation He wants to see it.

The exact slash codes are not important. Getting ALL the data across is.

So far I have only added the Arial class to this: instead of <span style="font-family:"Arial",sans-serif" lang="EN-GB"> it is now <span class="Arial">.

I am starting with this snippet of code in HTML.

<p> ...
  <span class="Arial">verb) (in 1<sup>st</sup>person singular and third person plural) <br />1. so, in order that perhaps <br />2. (<i>particle introducing a.f., indicating `near future' or `future possibility'</i>) <br /> </span>
  <span class="MsoHyperlink">
    <b>
      <span lang="EN-GB">riI#</span>
    </b>
  </span>
  <b>
    <span lang="EN-GB">ad-</span>
    <span class="MsoHyperlink">
      <span lang="EN-GB">ftuI#</span>
    </span>
  </b>
  <span class="Arial">I want to go.<br />
  .....
</p>

My guess so far is to match the <br/> and then look for the <b> words that follow, but not to include any <b> after the <span class="Arial"> that turns into \translation.

<xsl:template match="html:br">
  <xsl:element name="span">
    <xsl:attribute name="class">example</xsl:attribute>
    <xsl:value-of select="following::html:b"/>   <<<<<<<<<<<< this gives too many
  </xsl:element>
</xsl:template>

I hold the slash code in the class attribute until the last step. That way I can continue working on the file in XML.

How do I restrict the <xsl:value-of select="following::html:b"/> to just the ones before the next <span class="Arial">I want to go.<br /> ?

Thank you
Jim Albright
704-562-1529 unlimited cell
Wycliffe Bible Translators
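
One possible way to narrow the selection, offered only as a sketch and not tested against the full dictionary: keep the <b> elements whose nearest preceding <br/> is the one being matched, and which also come before the next <span class="Arial"> in document order. The sketch assumes the input is XHTML (so the html prefix is bound to the XHTML namespace), that every translation span ends in a <br/> as in the snippet, and that whitespace between the inline elements is only indentation. The variable names this-br, next-arial and bold are made up for the illustration.

```xslt
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:html="http://www.w3.org/1999/xhtml"
    exclude-result-prefixes="html">

  <!-- Assumption: html is bound to the XHTML namespace, as the html: prefix in the post suggests. -->

  <xsl:template match="html:br">
    <!-- Keep a handle on this br, because "." means the b element inside the predicates below. -->
    <xsl:variable name="this-br" select="."/>

    <!-- First Arial span after this br. The following axis skips ancestors,
         so the Arial span that contains the br is not picked. -->
    <xsl:variable name="next-arial"
                  select="following::html:span[@class = 'Arial'][1]"/>

    <!-- Bold elements whose nearest preceding br is this one and that
         come before the next Arial span. If there is no later Arial span,
         the second predicate is false and nothing is selected. -->
    <xsl:variable name="bold"
                  select="following::html:b[preceding::html:br[1] is $this-br]
                                           [. &lt;&lt; $next-arial]"/>

    <!-- Emit an example span only when this br is actually followed by bold words. -->
    <xsl:if test="exists($bold)">
      <xsl:element name="span">
        <xsl:attribute name="class">example</xsl:attribute>
        <!-- Glue the pieces inside each b together, then separate the b elements by one space. -->
        <xsl:value-of separator=" "
            select="for $b in $bold
                    return string-join($b//text()/normalize-space(.), '')"/>
      </xsl:element>
    </xsl:if>
  </xsl:template>

</xsl:stylesheet>
```

The first predicate is what stops the earlier <br/> elements ("1." and "2.") from each picking up the same bold run, since those bold words have a later <br/> as their nearest preceding one. For the snippet in the post the intent is a single span holding riI# ad-ftuI#. If a word inside one <b> can contain a real space, the normalize-space step would need rethinking.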