Subject: [xsl] RE: for roger Glover..., Knowledge management XML
From: Jinesh Varia <jineshresearch@xxxxxxxxx>
Date: Mon, 10 Feb 2003 17:04:52 -0800 (PST)
Hello,

I have included the final code below for others to experiment with.

This raises an interesting problem: matching two XML data sheets to
produce one correct one. It is the knowledge management aspect of the
XSL sheet, which handles the person names that are authors of
publications.

Suppose I have "knowledge" XML saying that <author>Micheal Kay</author>
is the same person as <author>M. Kay</author>, kept in a separate XML
data sheet in the form:

  <samePersons>
    <author>Micheal Kay</author> <!-- the actual correct one that I want in the database -->
    <author>Micheal</author>
    <author>Micheal K.</author>
  </samePersons>

That separate data sheet simply holds all the "knowledge" mentioned.
How can I sort out or delete the erroneous names in my current XML,
which looks like this?

  <person id="0003"> Micheal Kay </person>

I hope I am explaining this properly. I have one XML data sheet that
knows which names are right and which are wrong. I want to delete the
erroneous elements from my main XML sheet so that only the correct
names are shown. Also, if I delete the erroneous elements, I have to
put the correct id in the pubper elements as well.

Please suggest whether I should do this while I am generating the ids
(XSL sheet shown below) or after I generate the ids, in a separate XSL.
Jinesh

-----------------------------------------------
final code:

<xsl:transform version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes"
      xmlns:xalan="http://xml.apache.org/xalan"
      xalan:indent-amount="4"/>

  <xsl:variable name="persons">
    <xsl:apply-templates
        select="//publication/author[not(.=preceding::author or .=preceding::editor)]
              | //publication/editor[not(.=preceding::author or .=preceding::editor)]"
        mode="generate-person"/>
  </xsl:variable>

  <!-- Similar to original "generate-author-id" template, but
       generates entire person element -->
  <xsl:template match="author|editor" mode="generate-person">
    <!-- this is to prevent any empty author/editor elements
         from getting ids -->
    <xsl:if test="normalize-space(.)">
      <xsl:variable name="temp" select="concat('000000000',position())"/>
      <!-- keep the last ten characters, i.e. a zero-padded id -->
      <xsl:variable name="perid"
          select="substring($temp,string-length($temp)-9)"/>
      <person perid="{$perid}">
        <personname>
          <xsl:value-of select="."/>
        </personname>
      </person>
    </xsl:if>
  </xsl:template>

  <xsl:template match="dblp">
    <dblp>
      <!-- copies the "person" elements result tree fragment
           into the result tree -->
      <xsl:copy-of select="$persons"/>
      <xsl:apply-templates select="publication"/>
    </dblp>
  </xsl:template>

  <xsl:template match="publication">
    <!-- Same as in the original code -->
    <publication>
      <xsl:copy-of select="@*|*[not(self::author or self::editor)]"/>
    </publication>
    <!-- calls template to create "pubper" elements,
         one per publication per pub author -->
    <xsl:apply-templates select="author|editor"/>
  </xsl:template>

  <!-- creates "pubper" elements -->
  <xsl:template match="author|editor">
    <xsl:if test="normalize-space(.)">
      <pubper>
        <!-- gets "pubid" from parent -->
        <pubid>
          <xsl:value-of select="../@pubid"/>
        </pubid>
        <!-- gets "perid" from "$persons" variable -->
        <perid>
          <!-- Note that in XSLT 1.0 a result tree fragment like
               "$persons" does not automatically convert to a
               node set. Therefore most processors provide an
               extension function for that purpose (like
               "xalan:nodeset()" below) -->
          <xsl:value-of xmlns:xalan="http://xml.apache.org/xalan"
              select="xalan:nodeset($persons)/person[current()=personname]/@perid"/>
        </perid>
        <!-- persontype: 2 for an editor, 1 for an author -->
        <persontype>
          <xsl:choose>
            <xsl:when test="self::editor">
              <xsl:text>2</xsl:text>
            </xsl:when>
            <xsl:otherwise>
              <xsl:text>1</xsl:text>
            </xsl:otherwise>
          </xsl:choose>
        </persontype>
      </pubper>
    </xsl:if>
  </xsl:template>

</xsl:transform>

--- Roger Glover <glover_roger@xxxxxxxxx> wrote:
> Jinesh Varia wrote:
>
> > Are you some kind of XML jini!
>
> Far from it. Just ask the *real* regulars. :-)
>
> > thank you very much. I have been entangled in this XSL
> > programming for two weeks and you solved it in a blink.
>
> You were most of the way there, you just needed one key insight.
> It would have taken me somewhat longer to write this starting
> with just an idea.
>
> > But there are some serious issues here:
> >
> > With your approach of generating perids before the actual
> > separation of publication, person, pubper elements, I feel it
> > would not work when I have 500,000 author elements. I have a
> > 130MB XML sheet which contains almost 350,000 publication
> > elements.
> > I know you did not know about this. Can you please comment on
> > this.
> >
> > Do you think I am right on this? Please correct me.
>
> I chose this solution not because it was the most efficient, but
> because it was the most direct route I could find from where you
> were to where you wanted to be.
>
> Right now it would probably behoove you to spend some time with
> the FAQ, the spec and other reference resources (I like Michael
> Kay's "XSLT Programmer's Reference"), studying the syntax and
> usage of the "<xsl:key>" element and the "key()" function. You
> should then also look up and study any FAQ reference to Muenchian
> grouping.
>
> > Now there are also editors along with authors. Authors can be
> > editors also for some publications. That means
> > <author>Steve Lawyer</author> for pub1 can be
> > <editor>Steve Lawyer</editor> for pub2, but we want to have a
> > single person element generated. In <pubper> we have
> > <persontype> (1 for author, 2 for editors), hence in our example
> > for pub1 it should be <persontype>1</persontype> and for pub2 it
> > should be <persontype>2</persontype>.
> > How can we store that information with your code then?
> > We have to get unique person names.
>
> Match "author | editor" instead of just "author", and use either
> "<xsl:if>" or "<xsl:choose>" + "<xsl:when>" to choose between
> persontype "1" (author) and persontype "2" (editor). Likewise,
> the "select" expression on "<xsl:apply-templates>" in the
> "persons" variable *would* have to become much more complicated.
> However, if you change to keys and Muenchian grouping, the
> expression will be much simpler.
>
> > You don't have a clue how much your code has helped me!!! I
> > have been working on this for two weeks...
> > thanks, roger. thank you
>
> You are very welcome. Glad to help. :^)
>
> Let us know if you get stuck, or when you have a final version.
>
> -- Roger Glover
>    glover_roger@xxxxxxxxx
>
> XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list

=====
-----------------------------------------------------------------
Jinesh Varia
Graduate Student, Information Systems
Pennsylvania State University
Email: jinesh@xxxxxxx
-----------------------------------------------------------------
'Self is the author of its actions.'
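Roger's pointer to "<xsl:key>" and Muenchian grouping can be made concrete with a minimal sketch. It is not the code anyone in the thread posted, and it rests on a few assumptions: the input has the same shape as above (a "dblp" root with "publication" children carrying a "pubid" attribute), names are compared after normalize-space() rather than as raw strings, and an opaque perid is acceptable. The key name "person-by-name" is made up for illustration. To avoid the result tree fragment and the "xalan:nodeset()" extension, the sketch uses generate-id() of the first occurrence of each name as the perid instead of the zero-padded counter; the exact form of that id depends on the processor.

```xml
<xsl:transform version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes"/>

  <!-- Index every non-empty author/editor by its whitespace-normalized
       name. The key name "person-by-name" is hypothetical. -->
  <xsl:key name="person-by-name"
           match="publication/author[normalize-space(.)] |
                  publication/editor[normalize-space(.)]"
           use="normalize-space(.)"/>

  <xsl:template match="dblp">
    <dblp>
      <!-- Muenchian grouping: select only the element that is the first
           one stored in the key for its name, so each distinct person
           is visited once. Empty elements are not in the key and are
           therefore skipped. -->
      <xsl:apply-templates mode="generate-person"
          select="publication/*[self::author or self::editor]
                  [generate-id() =
                   generate-id(key('person-by-name', normalize-space(.))[1])]"/>
      <xsl:apply-templates select="publication"/>
    </dblp>
  </xsl:template>

  <!-- One person element per distinct name. Assumption: an opaque,
       processor-generated id is acceptable as perid. -->
  <xsl:template match="author | editor" mode="generate-person">
    <person perid="{generate-id()}">
      <personname>
        <xsl:value-of select="normalize-space(.)"/>
      </personname>
    </person>
  </xsl:template>

  <xsl:template match="publication">
    <publication>
      <xsl:copy-of select="@* | *[not(self::author or self::editor)]"/>
    </publication>
    <xsl:apply-templates select="author | editor"/>
  </xsl:template>

  <!-- One pubper element per non-empty author/editor occurrence -->
  <xsl:template match="author | editor">
    <xsl:if test="normalize-space(.)">
      <pubper>
        <pubid>
          <xsl:value-of select="../@pubid"/>
        </pubid>
        <!-- Look up the same first-occurrence node through the key,
             which yields the same id the person element received. -->
        <perid>
          <xsl:value-of
              select="generate-id(key('person-by-name', normalize-space(.))[1])"/>
        </perid>
        <!-- persontype: 2 for an editor, 1 for an author -->
        <persontype>
          <xsl:choose>
            <xsl:when test="self::editor">2</xsl:when>
            <xsl:otherwise>1</xsl:otherwise>
          </xsl:choose>
        </persontype>
      </pubper>
    </xsl:if>
  </xsl:template>

</xsl:transform>
```

The point of the key is that each name lookup no longer walks the "preceding::" axis or scans the whole "$persons" list, which is where the posted stylesheet is likely to struggle on a 130MB input. If the numeric, zero-padded perid format is required, a second pass over this output (or a node-set extension, as in the posted code) would still be needed to number the person elements.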